Re: [PATCH] pack-objects: reuse data from existing pack.

From: Nicolas Pitre <nico@cam.org>
Date: 2006-02-16 14:41:24

On Wed, 15 Feb 2006, Junio C Hamano wrote:

> When generating a new pack, notice if we already have the wanted
> object in existing packs.  If the object has a deltified
> representation, and its base object is also what we are going to
> pack, then reuse the existing deltified representation
> unconditionally, bypassing all the expensive find_deltas() and
> try_deltas() routines.
> 
> Also, when writing out such deltified representation and
> undeltified representation, if matching data already exists in
> an existing pack, just write it out without uncompressing &
> recompressing.
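
A rough sketch of the reuse rule described above, in Python rather than
the C of the actual patch.  The names (reusable_delta_base, split_work)
and the dict standing in for the existing packs are made up for
illustration and are not taken from pack-objects.c; treat it as a toy
model of the decision, not the real implementation.

```python
from typing import Dict, List, Optional, Set, Tuple


def reusable_delta_base(obj_id: str,
                        to_pack: Set[str],
                        existing: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the base id if the stored delta can be reused, else None.

    `existing` is a hypothetical stand-in for the existing packs: it maps
    an object id to the id of its delta base, or to None when the object
    is stored undeltified.
    """
    if obj_id not in existing:
        return None          # not in any existing pack
    base = existing[obj_id]
    if base is None:
        return None          # stored whole, there is no delta to reuse
    if base not in to_pack:
        return None          # base is not going into the new pack
    return base


def split_work(to_pack: Set[str],
               existing: Dict[str, Optional[str]]
               ) -> Tuple[Dict[str, str], List[str]]:
    """Split objects into reused deltas and ones needing a delta search."""
    reused: Dict[str, str] = {}
    need_search: List[str] = []
    for obj_id in sorted(to_pack):
        base = reusable_delta_base(obj_id, to_pack, existing)
        if base is not None:
            reused[obj_id] = base       # copy the existing delta as-is
        else:
            need_search.append(obj_id)  # goes through the expensive path
    return reused, need_search
```

With to_pack = {"a", "b", "c", "d"} and existing = {"b": "a", "c": "x",
"d": None}, only "b" is reused (its base "a" is also being packed); "c"
is not, because its base "x" is outside the set.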

Great!

> Without this patch:
> 
>     $ git-rev-list --objects v1.0.0 >RL
>     $ time git-pack-objects p <RL
> 
>     Generating pack...
>     Done counting 12233 objects.
>     Packing 12233 objects....................
>     60a88b3979df41e22d1edc3967095e897f720192
> 
>     real    0m32.751s
>     user    0m27.090s
>     sys     0m2.750s
> 
> With this patch:
> 
>     $ git-rev-list --objects v1.0.0 >RL
>     $ time ../git.junio/git-pack-objects q <RL
> 
>     Generating pack...
>     Done counting 12233 objects.
>     Packing 12233 objects.....................
>     60a88b3979df41e22d1edc3967095e897f720192
>     Total 12233, written 12233, reused 12177
> 
>     real    0m4.007s
>     user    0m3.360s
>     sys     0m0.090s
> 
> Signed-off-by: Junio C Hamano <junkio@cox.net>
> 
> ---
> 
>  * This may depend on one cleanup patch I have not sent out, but
>    I am so excited that I could not help sending this out first.
> 
>    Admittedly this is hot off the press, I have not had enough
>    time to beat this too hard, but the resulting pack from the
>    above passed unpack-objects, index-pack and verify-pack.

In fact, the resulting pack should be identical with or without this 
patch, shouldn't it?

FYI: I have a list of patches to produce even smaller (yet still
compatible) packs, or less dense ones with much reduced CPU usage,
all depending on a new --speed argument to git-pack-objects.  I've been
able to produce 15-20% smaller packs with the same depth and window
size, at the cost of twice as much CPU time.  Combined with your
patch, one could repack the object store with maximum compression
even if it is expensive CPU-wise, and any pull would benefit from it
afterwards at no additional cost.

I only need to find some time to finally clean and re-test those 
patches...


Nicolas
Received on Thu Feb 16 14:42:09 2006