Re: CAREFUL! No more delta object support!

From: Junio C Hamano <junkio@cox.net>
Date: 2005-06-28 19:40:56
>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> But the savings get increasingly bigger the more history we have. That's
LT> also why the packed git archive is about 1/14th of the size of the fully
LT> unpacked disk usage of the git project,...

The GIT archive may be an oddball because the project itself is so
small, but a fair comparison should include the disk usage of
the 256 fan-out directories.  Counting them, an otherwise empty
.git/objects/ with the 1.4MB packed archive and 90KB index file
ends up being somewhere around 2.4MB on my machine, compared with
17MB for the traditional one.
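
For anyone who wants to redo the arithmetic, here is a rough sketch.
The 4KB-per-empty-directory figure is an assumption (it depends on the
filesystem and was not measured); the other numbers are the ones
quoted above.

```python
# Rough disk-usage estimate for a packed repository.
# ASSUMPTION: each empty fan-out directory occupies one 4KB block.
# The real cost depends on the filesystem.
FANOUT_DIRS = 256
DIR_BLOCK_KB = 4          # assumed, not measured
PACK_KB = 1.4 * 1024      # 1.4MB packed archive
INDEX_KB = 90             # 90KB index file
UNPACKED_MB = 17          # traditional one-file-per-object layout


def packed_total_mb(dir_block_kb=DIR_BLOCK_KB):
    """Pack + index + empty fan-out directories, in MB."""
    dirs_kb = FANOUT_DIRS * dir_block_kb
    return (PACK_KB + INDEX_KB + dirs_kb) / 1024


def reduction_factor(dir_block_kb=DIR_BLOCK_KB):
    """How many times smaller the packed layout is than the unpacked one."""
    return UNPACKED_MB / packed_total_mb(dir_block_kb)


if __name__ == "__main__":
    print(f"packed total: {packed_total_mb():.2f} MB")
    print(f"reduction:    {reduction_factor():.1f}x")
```

Passing dir_block_kb=0 gives the figure without the directories, which
shows how much of the packed footprint is the fan-out overhead alone.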

Still a good space reduction.  Good job!

I am now dreaming that we might someday enhance the mechanism with
append-only updates to the *.pack files and a complete rewrite of
the *.idx files, and get rid of the files under .git/objects entirely.

This would make things reasonably friendly to rsync.  The kernel
has a pack of around 60M with a 1.1M index, so everyday use would
involve incremental updates to the pack [*1*] and a full download
of the index file.
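
A toy sketch of the append-only idea, to make it concrete.  The file
layout here is made up for illustration and is NOT the real pack or
idx format: the "pack" is a plain concatenation of object bytes and
the "index" is a sorted list of (sha1, offset, length) records.  The
function names are hypothetical as well.

```python
import hashlib
import os
import struct

RECORD = 20 + 8 + 8  # sha1 digest + offset + length (toy layout)


def read_index(idx_path):
    """Load the toy index into a dict: sha1 digest -> (offset, length)."""
    entries = {}
    if os.path.exists(idx_path):
        with open(idx_path, "rb") as idx:
            while True:
                rec = idx.read(RECORD)
                if len(rec) < RECORD:
                    break
                entries[rec[:20]] = struct.unpack(">QQ", rec[20:])
    return entries


def append_objects(pack_path, idx_path, objects):
    """Append new objects to the pack, then rewrite the whole index."""
    entries = read_index(idx_path)
    with open(pack_path, "ab") as pack:
        pack.seek(0, os.SEEK_END)
        for data in objects:
            name = hashlib.sha1(data).digest()
            if name in entries:
                continue  # already stored; existing bytes are never touched
            entries[name] = (pack.tell(), len(data))
            pack.write(data)
    # The index is small, so it is rewritten in full and swapped in.
    tmp = idx_path + ".tmp"
    with open(tmp, "wb") as idx:
        for name in sorted(entries):
            idx.write(name + struct.pack(">QQ", *entries[name]))
    os.replace(tmp, idx_path)


def read_object(pack_path, idx_path, name):
    """Fetch one object's bytes by its sha1 digest."""
    offset, length = read_index(idx_path)[name]
    with open(pack_path, "rb") as pack:
        pack.seek(offset)
        return pack.read(length)
```

Because the old pack contents are always a prefix of the new pack, a
transfer tool only needs to fetch the tail, while the index is small
enough to fetch whole each time.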

[Footnote]

*1* Presumably many objects are deltified against older objects,
which is suboptimal.  Most likely the newer objects are accessed
far more often, and they are the ones we would want to keep in
full, not as deltas.  So even with this scheme we would want
weekly repacking.  Interestingly enough, pack-objects gets the
objects via the usual read_sha1_file() interface, so it can
produce a new pack from an existing pack.


Received on Tue Jun 28 19:41:26 2005
