Re: CAREFUL! No more delta object support!

From: Christopher Li <git@chrisli.org>
Date: 2005-06-28 20:38:52

Those are all nice improvements that address the space usage issue.

Should people just run a repack once in a while, or are new objects
automatically added to the pack file?

Chris


On Mon, Jun 27, 2005 at 08:30:22PM -0700, Linus Torvalds wrote:
> 
> Deltas do exist inside pack-files, yes. They just don't exist as 
> independent objects any more, so you can never get into the situation that 
> you find a delta but you don't find the delta it points to.
> 
> Because in the pack-files, there are only deltas _within_ a pack-file. You 
> can't have a delta that points to outside the pack.
> 
> This means that pack-files with few objects will inevitably be larger than
> they could otherwise be (ie you can never have a pack file that _only_
> contains deltas to the outside world), but it's just incredibly reassuring 
> to me that a pack-file is always self-sufficient. 
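
To make the self-sufficiency rule above concrete, here is a minimal
Python sketch over a toy model of a pack. It is not git's on-disk
format or code: the Entry class, the string ids, and the
outside_references() helper are all hypothetical names invented for
illustration. The only thing it checks is that every delta chain ends
at a full object stored in the same pack.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    """One object in a toy pack (hypothetical model, not git's format)."""
    base: Optional[str] = None  # id of the delta base; None = full object


def outside_references(pack: Dict[str, Entry]) -> List[str]:
    """Return ids of deltas whose base chain leaves the pack or loops."""
    bad = []
    for oid in pack:
        seen = set()
        cur = oid
        # Walk the delta chain until we reach a full object.
        while pack[cur].base is not None:
            seen.add(cur)
            cur = pack[cur].base
            if cur not in pack or cur in seen:
                bad.append(oid)  # base is missing, or the chain is cyclic
                break
    return bad


# A self-sufficient pack: one full object and two deltas chained on it.
good = {"a": Entry(), "b": Entry(base="a"), "c": Entry(base="b")}
# A pack that only holds a delta against an object stored elsewhere.
broken = {"b": Entry(base="a")}

print(outside_references(good))    # []
print(outside_references(broken))  # ['b']
```

A pack in the sense described above is one for which this check
returns an empty list.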
> 
> So when/if we start using pack-files for doing "git pull" etc, the 
> pack-file won't actually help pack things for small updates: small updates 
> will probably contain the whole changed file, unless the update has 
> several changes to the same file (which is not unusual, of course), in 
> which case it will only contain one version and then deltas from that.
> 
> But the savings get increasingly bigger the more history we have. That's
> also why the packed git archive is about 1/14th of the size of the fully
> unpacked disk usage of the git project, but a packed kernel archive "only"  
> achieves a packing rate of 1/5th of the fully unpacked kernel archive. The
> git archive is all history, while the kernel archive just "appears", and
> 2/3 of the files have only one single version and thus don't delta-
> compress at all.
> 
> (Another reason is probably that the kernel has bigger files, which means
> that it thus has relatively less loss in filesystem block padding).
> 
> But not having any outside deltas not only makes me feel safer, it also
> means that you can fully validate a pack archive consistency without even
> knowing what project it is from - you can check the SHA1 results of every
> file in the pack against the index of the pack, and check that the SHA1's
> of the pack files themselves are valid. Again, this is just a data
> _consistency_ check, of course - it means that you can validate that it
> downloaded fine, and that you don't have disk corruption, but it doesn't
> mean that the data isn't evil and nasty and buggy ;)
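
The two checks described above can be sketched in Python. This is a
rough illustration, not git's own verification code. It assumes the
conventions documented for git today: an object name is the SHA-1 of
"<type> <size>\0" followed by the contents, and a pack file starts
with "PACK" and ends with a 20-byte SHA-1 of everything before it.
The format in use when this message was written may have differed.
The function names are made up for this example, and the sketch does
not parse the pack index or unpack individual objects.

```python
import hashlib
import struct


def git_object_id(obj_type: str, payload: bytes) -> str:
    """SHA-1 of the '<type> <size>' header, a NUL byte, then the payload."""
    header = f"{obj_type} {len(payload)}".encode() + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def pack_trailer_ok(pack: bytes) -> bool:
    """Check the 'PACK' signature and the trailing SHA-1 of the pack."""
    # 12-byte header plus 20-byte trailer is the smallest possible pack.
    if len(pack) < 32 or pack[:4] != b"PACK":
        return False
    body, trailer = pack[:-20], pack[-20:]
    return hashlib.sha1(body).digest() == trailer


# Build an empty version-2 pack by hand: signature, version, object count.
header = b"PACK" + struct.pack(">II", 2, 0)
empty_pack = header + hashlib.sha1(header).digest()

print(pack_trailer_ok(empty_pack))               # True
print(pack_trailer_ok(empty_pack[:-1] + b"!"))   # False: trailer damaged
print(git_object_id("blob", b""))
```

As the message says, a check like this only detects a bad download or
disk corruption. It says nothing about whether the contents are sane.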
> 
> 			Linus
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Received on Wed Jun 29 00:04:22 2005

This archive was generated by hypermail 2.1.8 : 2005-06-29 00:04:25 EST