Re: [PATCH 2/3] sha1_file: add the ability to parse objects in "pack file format"

From: Linus Torvalds <torvalds@osdl.org>
Date: 2006-07-12 08:09:07

On Tue, 11 Jul 2006, sf wrote:
> 
> And in the traditional format type and length are compressed whereas in
> the pack-file format they are not.

Ahh. Yes.

> > This should probably be applied to the main tree asap if we think
> > this is at all a worthwhile exercise. But somebody should verify that I 
> > got the format right first!
> 
> Sorry but see above.

Good catch, thanks indeed.

Doing that for unpacked objects would in fact make a lot of things much 
simpler, so it would be good to do. The _bad_ part is that this also makes 
it a lot harder to see the difference between a "binary header" and a 
"compressed ASCII header". The two are not "obviously different" any more.

The common byte sequence for a compressed stream is

	78 9c ...

where the first byte is the CMF byte (compression info and method).

But it's not the only possible such sequence according to the zlib format.

(The 16-bit number formed by the first two bytes read MSB first, ie 0x789c 
above, is defined to have a built-in check: it must be a multiple of 31 
according to the standard, and 0x789c = 996 * 31).

So if we have an uncompressed header, we'd need to add a separate 2-byte 
fingerprint to it _before_ the real header, one that isn't divisible by 31, 
and use that as the thing to test.
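
As a rough sketch of the test described above (not code from the patch), 
something like the following could tell the two cases apart. The function 
name looks_like_zlib_header() is made up for illustration. It checks only 
the two conditions on the first two bytes that the zlib format (RFC 1950) 
requires: the low nibble of the CMF byte is 8 (deflate), and CMF and FLG 
read MSB first form a multiple of 31. Passing is necessary for a valid 
zlib stream, not sufficient.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of git: returns 1 if the buffer starts
 * with two bytes that could be a zlib stream header, 0 otherwise.
 */
int looks_like_zlib_header(const unsigned char *buf, size_t len)
{
	unsigned int word;

	if (len < 2)
		return 0;

	/* Low nibble of CMF is the compression method; 8 means deflate. */
	if ((buf[0] & 0x0f) != 8)
		return 0;

	/* CMF and FLG, read MSB first, must be a multiple of 31. */
	word = ((unsigned int)buf[0] << 8) | buf[1];
	return word % 31 == 0;
}
```

With this, 78 9c passes (0x789c = 996 * 31), while any 2-byte fingerprint 
whose value is not a multiple of 31 fails, which is what would let it mark 
an uncompressed header.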

Ho humm. I'll see what I can come up with.

		Linus
Received on Wed Jul 12 08:09:52 2006