zlib experts, please..

From: Linus Torvalds <torvalds@osdl.org>
Date: 2005-07-06 09:57:55

I just hit an interesting pack failure because of how git (mis-)uses zlib, 
and I'm wondering what to do about it.

In particular, the "git-unpack-objects" code gets a data stream, and only
knows the _unpacked_ size of each object, because writing packed size is
extremely inconvenient in many ways (let me count the ways.. At pack time,
we want to fill in the size field before we've even packed things, and at
unpack time, we really don't care about the packed size, but we _do_ care
about the unpacked size in order to be able to allocate a buffer of the
right size for the result).

However, it turns out that there's a silly special case: a zero-sized 
"blob" object will be encoded as a single byte "0x30" followed by the 
"packed representation of empty".

Now, you'd expect the packed representation of empty to be empty, but 
that's apparently not what zlib does. It actually seems to pack zero bytes 
as 8 bytes of "78 9c 03 00 00 00 00 01". Which is fine, I don't care, with 
git this will literally happen for only one single object, so it's not 
like I care about the expansion.
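
The eight-byte claim above is easy to check. This is a minimal sketch using Python's standard zlib binding rather than git's C code, and it assumes the default compression level. On a stock zlib build it should report a length of 8 and the same byte sequence quoted above: a two-byte header, a two-byte empty deflate block, and a four-byte Adler-32 checksum.

```python
import zlib

# Compress an empty input at zlib's default compression level.
packed_empty = zlib.compress(b"")

# Length of the "packed representation of empty".
print(len(packed_empty))

# The raw bytes, space-separated, for comparison with the sequence in the text.
print(packed_empty.hex(" "))

# Sanity check: the stream inflates back to zero bytes.
assert zlib.decompress(packed_empty) == b""
```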

But what I care about is that when git-unpack-objects sees that it wants a 
zero-byte object, and asks zlib to unpack it, zlib will not actually use 
the bytes it wrote - it will just say "oh, you wanted zero bytes, here's 
zero bytes". Which means that the stream handling gets upset.

Now, I can easily fix this by just teaching the packing code that it 
should pack the zero-byte object as zero bytes, and not let zlib mess it 
up. In fact, I've done exactly that. However, now I worry that there's 
some other case where zlib uncompression doesn't eat everything that zlib 
compression generated. I've not found it, and I think a zero sized case is 
special (it's kind of like a "break" event), but this is yet another cry 
for zlib expertise in case somebody knows...
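
One way to probe that worry is to stop the unpacker when zlib reports end-of-stream instead of when the expected number of output bytes has arrived, and then count how much input was actually eaten. The sketch below is in Python, not git's C code. The names `inflate_one` and `check_sizes` are hypothetical helpers made up for this illustration, and the `b"NEXT"` suffix just stands in for whatever follows the object in the pack stream.

```python
import zlib


def inflate_one(stream: bytes, expected_size: int):
    """Inflate one zlib stream from the front of `stream`.

    Returns (data, consumed), where `consumed` is the number of input
    bytes that belonged to this zlib stream. Hypothetical helper for
    illustration only.
    """
    d = zlib.decompressobj()
    # With no output limit, this runs until end-of-stream or end of input.
    data = d.decompress(stream)
    if not d.eof:
        raise ValueError("zlib stream is truncated")
    if len(data) != expected_size:
        raise ValueError("unpacked size does not match the expected size")
    # Anything after the end of the zlib stream is left in unused_data.
    consumed = len(stream) - len(d.unused_data)
    return data, consumed


def check_sizes(sizes=(0, 1, 2, 100, 65536)):
    """Check that inflating to end-of-stream eats exactly what deflate wrote."""
    for n in sizes:
        payload = bytes(i % 251 for i in range(n))
        packed = zlib.compress(payload)
        # Append trailing bytes to imitate the next object in a stream.
        data, consumed = inflate_one(packed + b"NEXT", n)
        assert data == payload
        assert consumed == len(packed), (n, consumed, len(packed))
    return True
```

Calling `check_sizes()` returns True when every tested size, including zero, consumes exactly the bytes that compression produced. It only covers the sizes listed, so it is a spot check and not a proof that no other special case exists.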

Received on Sat Jul 09 18:21:25 2005
