Re: Delitifier broken (Re: diff-core segfault)

From: Linus Torvalds <torvalds@osdl.org>
Date: 2005-12-13 13:05:59

On Mon, 12 Dec 2005, Linus Torvalds wrote:
> 
>    As mentioned, pack-objects.c needs to check the size heuristics before 
>    doing diff_delta() _anyway_, for performance reasons as well as simply 
>    because the secondary use of diff_delta() is to estimate how big the 
>    delta is, and it's always pointless to generate a delta that is 
>    guaranteed to be bigger than the file (which is always the case with 
>    either side being an empty file - the size difference will inevitably 
>    be bigger than the size of the resulting file).

Side note: this isn't technically entirely true. A binary diff that has a 
source file that is empty could in theory be smaller than the destination 
file simply because it may involve a certain amount of automatic 
compression in the form of "insert 100 spaces" kind of diff encoding. I'm 
not sure whether xdelta actually does something like that, but it's 
certainly possible at least in theory.
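
To make the "insert 100 spaces" idea concrete, here is a minimal sketch of 
a toy delta format with a run-length "fill" op. The op names, the 
apply_delta() and encoded_size() helpers and the byte widths are all made 
up for illustration; this is not the actual xdelta or git delta encoding.

```python
# Toy delta format (hypothetical; NOT xdelta's or git's real encoding).
# Ops:
#   ("copy", offset, length)  copy bytes from the source
#   ("insert", data)          insert literal bytes
#   ("fill", byte, count)     insert `count` copies of one byte

def apply_delta(source: bytes, ops) -> bytes:
    """Rebuild the target from the source and a list of delta ops."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        elif op[0] == "insert":
            out += op[1]
        elif op[0] == "fill":
            _, byte, count = op
            out += bytes([byte]) * count
        else:
            raise ValueError("unknown op: %r" % (op[0],))
    return bytes(out)

def encoded_size(ops) -> int:
    """Size of the delta, assuming a 1-byte opcode and 4-byte integers."""
    size = 0
    for op in ops:
        if op[0] == "copy":
            size += 1 + 4 + 4            # opcode, offset, length
        elif op[0] == "insert":
            size += 1 + 4 + len(op[1])   # opcode, length, literal data
        elif op[0] == "fill":
            size += 1 + 1 + 4            # opcode, fill byte, count
    return size

source = b""                  # empty source file
target = b" " * 100           # destination: 100 spaces
ops = [("fill", 0x20, 100)]   # "insert 100 spaces"

assert apply_delta(source, ops) == target
delta_size = encoded_size(ops)
```

With these assumed widths the delta is a handful of bytes while the target 
is 100 bytes, so the delta against an empty source is smaller than the 
result. Without the "fill" op, the only choice would be an "insert" that 
carries all 100 bytes and is larger than the target.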

Of course, even if the delta in such a case may be smaller than the 
resulting file, such a delta is still not interesting: even from a packing 
angle, if the resulting file has patterns that make it easy to generate a 
small delta against an empty file, the fact is, such a regular end result 
will _compress_ better than the delta will, assuming any decent 
compression mechanism.

So from a packing standpoint, generating the delta is still the wrong 
thing to do - you're better off with just compressing the undeltified 
result.
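
A quick way to see how well such a regular result compresses on its own is 
to run it through zlib. This is only a sketch using Python's standard zlib 
module on a made-up, highly regular input; it does not measure what 
pack-objects.c actually produces.

```python
import zlib

# Made-up example of a "regular" end result: a long run of spaces.
target = b" " * 100000

# Compress the undeltified data directly, with no delta involved.
compressed = zlib.compress(target)

# Round-trip check: decompression gives back the original data.
restored = zlib.decompress(compressed)

# Fraction of the original size left after compression.
ratio = len(compressed) / len(target)
```

For input like this the compressed size is a tiny fraction of the 
original, which is why there is nothing left for a delta to win.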

And from a similarity-estimation standpoint, going from an empty file to 
anything else is obviously not interesting either. An empty file 
cannot be "similar" to anything else (except perhaps another empty file, 
and even that is a matter of taste).

I just wanted to correct the technicality that deltas can certainly be 
smaller than the result at least if the delta format allows for that kind 
of encoding.

		Linus
Received on Tue Dec 13 13:06:48 2005
