Repacking many disconnected blobs

From: Keith Packard <keithp@keithp.com>
Date: 2006-06-14 17:17:58

parsecvs scans every ,v file and creates a blob for every revision of
every file right up front. Once these are created, it discards the
actual file contents and deals solely with the hash values.
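
For readers unfamiliar with what "the hash values" are: assuming parsecvs keeps the ordinary git blob name for each revision, that name is the SHA-1 of a short header followed by the file contents. A minimal sketch (git_blob_id is a hypothetical helper for illustration, not something taken from parsecvs):

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object name git assigns to a blob with these contents.

    Hypothetical helper: git hashes the header "blob <size>\\0" followed
    by the raw bytes, using SHA-1.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()
```

This should agree with what `git hash-object` reports for the same bytes, which is why the contents can be dropped once the blob has been written.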

The problem is that while this is going on, the repository consists
solely of disconnected objects, and I can't make git-repack put those
into pack objects. This leaves the directories bloated, and operations
within the tree quite sluggish. I'm importing a project with 30000 files
and 30000 revisions (the CVS repository is about 700MB). After
scanning the files and constructing (in memory) a complete revision
history, the actual construction of the commits is happening at about 2
per second, and about 70% of that time is in the kernel, presumably
playing around in the repository.

I'm assuming that if I could get these disconnected blobs all neatly
tucked into a pack object, things might go a bit faster.
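
One possible way to do that, sketched under assumptions: git-pack-objects reads object names on standard input and packs whatever it is given, so the loose objects could be listed straight from the objects directory and fed to it, with git-prune-packed removing the loose copies afterwards. The function names and the idea of walking the fan-out directories are illustrative, not something parsecvs does:

```python
import os
import re
import subprocess


def loose_object_ids(objects_dir):
    """List names of loose objects stored as <2 hex>/<38 hex> files."""
    ids = []
    for fanout in sorted(os.listdir(objects_dir)):
        # Skip "pack", "info" and anything else that is not a fan-out dir.
        if not re.fullmatch(r"[0-9a-f]{2}", fanout):
            continue
        subdir = os.path.join(objects_dir, fanout)
        if not os.path.isdir(subdir):
            continue
        for name in sorted(os.listdir(subdir)):
            if re.fullmatch(r"[0-9a-f]{38}", name):
                ids.append(fanout + name)
    return ids


def pack_loose_objects(git_dir):
    """Pack every loose object under git_dir (hypothetical helper).

    Assumes a `git` binary on PATH. Returns the pack name printed by
    git pack-objects, or None if there was nothing to pack.
    """
    git_dir = os.path.abspath(git_dir)
    objects_dir = os.path.join(git_dir, "objects")
    ids = loose_object_ids(objects_dir)
    if not ids:
        return None
    pack_dir = os.path.join(objects_dir, "pack")
    os.makedirs(pack_dir, exist_ok=True)
    env = dict(os.environ, GIT_DIR=git_dir)
    # pack-objects writes <base>-<hash>.pack and .idx for the listed objects.
    result = subprocess.run(
        ["git", "pack-objects", os.path.join(pack_dir, "pack")],
        input="\n".join(ids) + "\n",
        text=True,
        capture_output=True,
        check=True,
        env=env,
    )
    # Drop loose copies of objects that now live in a pack.
    subprocess.run(["git", "prune-packed"], check=True, env=env)
    return result.stdout.strip()
```

Calling pack_loose_objects(".git") once after the blob-creation pass would be the obvious place to try it; whether it actually speeds up commit construction is untested here.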
-- 
keith.packard@intel.com


Received on Wed Jun 14 17:19:11 2006
