Recently I wanted to know how well Git's pack files were doing at storing rather large JAR files. So I wrote the attached script to parse the output of `git verify-pack -v` and use that to determine how many bytes are needed for each revision of any given file.

For example, running it on builtin-blame.c:

  $ perl ../delta-sizes.pl builtin-blame.c
  Caching cache-cdc41646a9de201b06a936fc3bddcbd51aeb532c.v...
  Pack index cache created.
  builtin-blame.c
    16660221... s 2     44
    066dee74... s 1     62
    176f51a4...   0  12797
  ----------------------------------------
    3 revs  12 KiB
    3 revs  12 KiB

There are 3 revisions of this file, totalling 12 KiB of disk space within the pack files. One of those revisions uses 44 bytes and another uses 62 bytes; the remaining 12797 bytes belong to 176f51a4. Given that these sizes include the complete overhead (including the 20-byte base object name in the OBJ_REF_DELTA header), we're talking about ~20 bytes of delta data in revision 16660221. Pretty good. :)

Of course this only looks at a single blob object and does not take into account the tree and commit overheads for a given revision, but it does give a really good idea of what is going on.

-- 
Shawn.
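For anyone who wants to try the same measurement without the attachment, here is a minimal Python sketch of the parsing step. It is not the Perl script from this message. It assumes the documented `git verify-pack -v` line layout (object name, type, size, size in pack, offset, then depth and base object name for deltas). The function names, the placeholder object names, and the uncompressed sizes in the sample are made up for illustration; only the in-pack sizes 44, 62 and 12797 come from the output above. Mapping a path to the blob names of its revisions is left out.

```python
import re

SHA1_LEN = 20  # bytes used by the base object name in an OBJ_REF_DELTA entry


def parse_verify_pack(text):
    """Parse `git verify-pack -v` output into a dict keyed by object name."""
    objects = {}
    for line in text.splitlines():
        fields = line.split()
        # Object lines start with a 40-hex-digit name; skip the summary lines.
        if len(fields) < 5 or not re.fullmatch(r"[0-9a-f]{40}", fields[0]):
            continue
        entry = {
            "type": fields[1],
            "size": int(fields[2]),          # uncompressed object size
            "size_in_pack": int(fields[3]),  # bytes actually used in the pack
            "offset": int(fields[4]),
            "depth": 0,
            "base": None,
        }
        if len(fields) >= 7:  # deltified objects carry depth and base name
            entry["depth"] = int(fields[5])
            entry["base"] = fields[6]
        objects[fields[0]] = entry
    return objects


def total_packed_size(objects, names):
    """Sum the in-pack size of the given object names."""
    return sum(objects[n]["size_in_pack"] for n in names)


# Hypothetical sample: placeholder names, invented uncompressed sizes.
A, B, C = "1" * 40, "2" * 40, "3" * 40
SAMPLE = f"""\
{A} blob   300 44 5000 2 {B}
{B} blob   400 62 4938 1 {C}
{C} blob   60000 12797 12
non delta: 1 object
chain length = 1: 1 object
chain length = 2: 1 object
"""

objs = parse_verify_pack(SAMPLE)
total = total_packed_size(objs, [A, B, C])
print(f"{len(objs)} revs  {total // 1024} KiB")
# Delta payload left after subtracting the 20-byte base object name.
print(objs[A]["size_in_pack"] - SHA1_LEN, "bytes besides the base name")
```

The last line repeats the arithmetic from the message: 44 bytes in the pack minus the 20-byte base name leaves about 20 bytes for the entry's remaining header and the compressed delta.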
This archive was generated by hypermail 2.1.8 : 2006-11-21 16:41:40 EST