Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
How much of a file is covered by each "partial hash"? It seems to me that even if only 40% of the file is changed, you might still get at least one changed byte in each part.
The algorithms are tuned to handle typical source files and the corresponding changesets. To give an accurate answer I'd have to check the git source; I have messed around with it a bit, but I haven't reviewed this part yet.
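To make the question concrete, here is a rough sketch of the concern. It is NOT git's actual algorithm (which I haven't checked); it assumes fixed-size chunks hashed independently, and the names chunk_hashes and matching_fraction, the 100-byte chunk size and the test data are all made up for illustration. It compares a 40% change that is clustered at the start of a file with a 40% change that is spread evenly over it.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int) -> list[str]:
    """Hash each fixed-size chunk independently (hypothetical scheme, not git's)."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def matching_fraction(old: bytes, new: bytes, chunk_size: int) -> float:
    """Fraction of chunk positions whose hashes are identical in both versions."""
    a = chunk_hashes(old, chunk_size)
    b = chunk_hashes(new, chunk_size)
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / max(len(a), len(b), 1)


# Made-up test data: a 1000-byte "file" of zero bytes.
original = bytes(1000)

# 40% changed, clustered: the first 400 bytes are rewritten.
clustered = b"\x01" * 400 + original[400:]

# 40% changed, scattered: 4 bytes out of every 10 are rewritten,
# so every 100-byte chunk contains at least one changed byte.
buf = bytearray(original)
for i in range(0, len(buf), 10):
    buf[i:i + 4] = b"\x01" * 4
scattered = bytes(buf)

print(matching_fraction(original, clustered, 100))  # 0.6
print(matching_fraction(original, scattered, 100))  # 0.0
```

Under these assumptions the same amount of change scores 60% matching chunks in one case and 0% in the other, which is exactly the worry in the question. Whether it matters in practice depends on how changes in real source files are distributed, and on what git really does.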
But it will cause mismatch (wrt previous logic) in some cases, otherwise it wouldn't fix the encountered problem. The fact that it doesn't change the behaviour for previously fixed cases doesn't mean that it doesn't change the behaviour for some cases which _didn't_ need fixing before.
In theory, yes. However, the algorithms involved are quite straightforward, and they behave predictably when handling typical text source files (which is a lot easier than being predictable on *any* type of file). So changing them is not as chaotic as it seems, given enough sane reviewers checking the algorithm enhancements.