On Mon, 12 Dec 2005, Linus Torvalds wrote:
>
> As mentioned, pack-objects.c needs to check the size heuristics before
> doing diff_delta() _anyway_, for performance reasons as well as simply
> because the secondary use of diff_delta() is to estimate how big the
> delta is, and it's always pointless to generate a delta that is
> guaranteed to be bigger than the file (which is always the case with
> either side being an empty file - the size difference will inevitably
> be bigger than the size of the resulting file).

Side note: this isn't technically entirely true.

A binary diff that has a source file that is empty could in theory be
smaller than the destination file simply because it may involve a certain
amount of automatic compression in the form of an "insert 100 spaces"
kind of diff encoding.

I'm not sure whether xdelta actually does something like that, but it's
certainly possible at least in theory.

Of course, even if the delta in such a case may be smaller than the
resulting file, such a delta is still not interesting: even from a packing
angle, if the resulting file has patterns that make it easy to generate a
small delta against an empty file, the fact is, such a regular end result
will _compress_ better than the delta will, assuming any decent
compression mechanism.

So from a packing standpoint, generating the delta is still the wrong
thing to do - you're better off with just compressing the undeltified
result.

And from a similarity-estimation standpoint, going from an empty file to
anything else is obviously not interesting either. An empty file cannot be
"similar" to anything else (except perhaps another empty file, and even
that is a matter of taste).

I just wanted to correct the technicality: deltas can certainly be smaller
than the result, at least if the delta format allows for that kind of
encoding.
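As a rough illustration of the two points above, here is a minimal Python
sketch. It is not git's code: worth_trying_delta() and packed_size() are
hypothetical names made up for this example, and zlib merely stands in for
"any decent compression mechanism". It shows an empty-side pre-check and
how small a very regular buffer becomes when it is simply compressed
undeltified.

```python
import zlib


def worth_trying_delta(src_size: int, dst_size: int) -> bool:
    """Hypothetical pre-check: never bother deltifying against an empty side."""
    return src_size > 0 and dst_size > 0


def packed_size(data: bytes) -> int:
    """Size of the object when stored undeltified, just zlib-compressed."""
    return len(zlib.compress(data))


if __name__ == "__main__":
    # A very regular result: the kind of file that an "insert N spaces"
    # style delta against an empty source could describe compactly.
    regular = b" " * 100_000

    # Either side empty: skip the delta attempt entirely.
    print("try delta from empty source:", worth_trying_delta(0, len(regular)))

    # Plain compression already shrinks the regular result dramatically.
    print("raw size:       ", len(regular))
    print("compressed size:", packed_size(regular))
```

The pre-check only encodes the "either side is empty" case; any real size
heuristic would need more than this.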
		Linus

Received on Tue Dec 13 13:06:48 2005