Re: fast-import and unique objects.

From: Jon Smirl <>
Date: 2006-08-07 01:53:43
On 8/6/06, Jon Smirl <> wrote:
> This model has a lot of object duplication. I generated 949,305
> revisions, but only 754,165 are unique. I'll modify my code to build a
> hash of the objects it has seen and then not send the duplicates to
> fast-import. Those 195,140 duplicated objects may be what is tripping
> index-pack up.

New run is finished with duplicate removal.

Run time is unchanged, still about 2 hours. The run is I/O bound, not CPU bound.
Pack file is 845MB instead of 934MB.
git-index-pack works now; it takes 4 CPU minutes to run.
Index file is 18MB.
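
For anyone following along, the duplicate removal described in the quoted text can be sketched roughly as below. This is only an illustration, not the actual import code: the names git_blob_sha1 and unique_revisions are made up here, and it assumes the key is the sha-1 that git assigns to a blob (the sha-1 of "blob <size>\0" followed by the content).

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Return the sha-1 git assigns to a blob with this content."""
    # git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def unique_revisions(revisions):
    """Yield (sha1, data) only the first time a given content is seen.

    Hypothetical helper: duplicates are skipped so they would never be
    sent on to fast-import.
    """
    seen = set()
    for data in revisions:
        sha1 = git_blob_sha1(data)
        if sha1 in seen:
            continue  # identical content already emitted once
        seen.add(sha1)
        yield sha1, data


# Four revisions, two of which have identical content.
revs = [b"a\n", b"b\n", b"a\n", b""]
unique = list(unique_revisions(revs))
```

The set only holds one hex digest per unique object, so memory stays modest even at several hundred thousand objects.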

So it looks like the first stage code is working. Next I need to
modify cvs2svn to keep track of the sha-1 through its sorting process
instead of file:revision.
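
A rough sketch of what carrying the sha-1 through a sort could look like, purely as an assumption about the shape of the change. RevisionRecord and sort_for_commit_order are hypothetical names, not cvs2svn internals, and the sha-1 strings are placeholders.

```python
from typing import NamedTuple


class RevisionRecord(NamedTuple):
    """Hypothetical record; not a real cvs2svn structure."""
    timestamp: int
    path: str
    sha1: str  # carried in place of the CVS revision number


def sort_for_commit_order(records):
    """Sort by time, then path; the sha-1 simply rides along."""
    return sorted(records, key=lambda r: (r.timestamp, r.path))


records = [
    RevisionRecord(200, "b.c", "placeholder-sha-3"),
    RevisionRecord(100, "b.c", "placeholder-sha-2"),
    RevisionRecord(100, "a.c", "placeholder-sha-1"),
]
ordered = sort_for_commit_order(records)
```

After sorting, each record still points straight at the object already stored in the pack, so no file:revision lookup is needed at that stage.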

Jon Smirl
Received on Mon Aug 07 01:54:21 2006
