Re: Is cogito really this inefficient

From: Linus Torvalds <torvalds@osdl.org>
Date: 2005-07-15 01:51:44

On Thu, 14 Jul 2005, Russell King wrote:
> 
> Actually, I should've left the sh -x /usr/bin/cg-diff drivers/serial/8250.c
> running a little longer.  It's not the git-update-cache command which
> is taking the time, it's git-diff-cache.

Ok. git-diff-cache actually ends up reading your HEAD tree, and that, in
turn, is 1000+ tree objects. So it can take a while for the whole tree,
especially in the nonpacked and uncached case.

git-diff-tree (comparing two trees) is smart enough to limit itself to 
just the sub-trees that have been named, and would have compared the two 
trees by looking up just eight objects (three subdirectories from each 
tree, and then the file itself from both trees). 

But git-diff-cache isn't - because it's comparing the tree against the
index file, and the index is inevitably the whole tree.
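
To make the object counts concrete, here is a minimal Python sketch of the
two access patterns. The object store, the ids and the function names are
all made up for illustration; this is not how git-diff-tree or
git-diff-cache are actually written, just a toy model of "walk only the
named path" versus "read every tree object".

```python
# Toy object store: id -> tree (dict of name -> id) or blob (bytes).
# Every id and every file listed here is hypothetical.
STORE = {
    "root": {"drivers": "t-drivers", "fs": "t-fs", "Makefile": "b-make"},
    "t-drivers": {"serial": "t-serial", "net": "t-net"},
    "t-serial": {"8250.c": "b-8250"},
    "t-net": {"tun.c": "b-tun"},
    "t-fs": {"inode.c": "b-inode"},
    "b-make": b"all:\n",
    "b-8250": b"/* serial driver */\n",
    "b-tun": b"/* tun driver */\n",
    "b-inode": b"/* inode code */\n",
}


def read_object(oid, counter):
    """Fetch one object and count the lookup (stands in for opening a file)."""
    counter[0] += 1
    return STORE[oid]


def lookup_path(root_oid, path, counter):
    """Path-limited walk: read only the trees on the way down to `path`."""
    oid = root_oid
    for name in path.split("/"):
        tree = read_object(oid, counter)
        oid = tree[name]
    read_object(oid, counter)  # finally, the file itself
    return oid


def walk_all_trees(root_oid, counter):
    """Whole-tree walk: read every tree object, return {path: blob id}."""
    entries = {}
    stack = [("", root_oid)]
    while stack:
        prefix, oid = stack.pop()
        tree = read_object(oid, counter)
        for name, child in tree.items():
            # Peeking at STORE to tell trees from blobs is a shortcut of
            # this sketch; a real tree entry carries a mode for that.
            if isinstance(STORE[child], dict):
                stack.append((prefix + name + "/", child))
            else:
                entries[prefix + name] = child
    return entries
```

For drivers/serial/8250.c the path-limited walk reads three tree objects
plus the file, so four lookups per tree and eight for a two-tree
comparison. The whole-tree walk reads every tree object no matter which
file you asked about.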

And I now think I know what makes it slow. Not only are you basically
opening 1100 files (the tree objects - there's really that many
subdirectories in the kernel. Scary), but because you have alternate
object directories, and almost all of the objects are in the alternate
(not your primary), you'll basically always end up _first_ looking in the
primary, failing, and then looking in the alternate.

Together with the hashing, you'll be looking all over the place, in other
words ;)
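
A rough Python sketch of that lookup order, assuming the usual loose-object
layout (a directory named after the first two hex digits, with the rest of
the name as the file). The helper names, the directory names and the demo
are hypothetical; the point is only that every object living in the
alternate costs one failed probe in the primary first.

```python
import hashlib
import os
import tempfile


def object_path(objdir, sha1_hex):
    """Loose objects are fanned out by the first two hex digits of the name."""
    return os.path.join(objdir, sha1_hex[:2], sha1_hex[2:])


def find_object(sha1_hex, objdirs, stats):
    """Probe each object directory in order, counting the failed probes."""
    for objdir in objdirs:
        path = object_path(objdir, sha1_hex)
        if os.path.exists(path):
            return path
        stats["misses"] += 1
    return None


def demo(n=1100):
    """Put n made-up objects in the alternate only, then look them all up."""
    with tempfile.TemporaryDirectory() as top:
        primary = os.path.join(top, "primary")
        alternate = os.path.join(top, "alternate")
        os.makedirs(primary)
        os.makedirs(alternate)
        names = [hashlib.sha1(str(i).encode()).hexdigest() for i in range(n)]
        for name in names:
            path = object_path(alternate, name)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "wb").close()
        stats = {"misses": 0}
        found = [find_object(name, [primary, alternate], stats) for name in names]
        return stats["misses"], sum(1 for p in found if p is not None)
```

With everything in the alternate, demo() reports exactly one miss per
object: each lookup fails in the primary before it succeeds in the
alternate, and the hashed names spread those probes over many
subdirectories.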

Which means that you'll be needing a fair amount of memory to keep all of
those negative dentries etc cached (and the directory tree too).

This is something the pack-files will just help enormously with, but it
was only recently that we turned git around to check the pack-files
_first_, and the object directories second, so you probably won't see it
(not to mention that you probably don't have big pack-files at all ;)
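
As a sketch of why the lookup order matters, assume the pack index is
already in memory and can be consulted without touching the filesystem. The
dict and the set below are simplified stand-ins (a real pack index is not a
Python dict, and the names are made up); only the ordering is the point.

```python
def find_object(name, pack_index, objdirs, loose, stats):
    """Check the in-memory pack index first, then probe object directories.

    pack_index: hypothetical dict of object name -> offset in a pack file.
    loose:      set of (objdir, name) pairs standing in for the filesystem.
    stats:      dict whose "probes" entry counts simulated filesystem lookups.
    """
    if name in pack_index:
        return ("pack", pack_index[name])
    for objdir in objdirs:
        stats["probes"] += 1
        if (objdir, name) in loose:
            return ("loose", objdir)
    return None
```

An object found in a pack costs no directory probes at all in this model,
while a loose object in the alternate still costs a failed probe in the
primary first.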

I'll look into making diff-cache more efficient. I normally don't use it
myself, so I never bothered to optimize it. I use git-diff-files, which is
way more efficient, but it doesn't show the difference against the _tree_,
it shows the difference between your working files and the index. Since
cogito tries to hide the index from you, cogito can't very well use that.
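
A toy Python model of the two comparisons, with the three states reduced to
dicts of path -> content id. The data and the function names are invented
for illustration, and the real commands do a lot more than this.

```python
def diff(old, new):
    """Return the sorted paths whose content id differs between two states."""
    return sorted(p for p in set(old) | set(new) if old.get(p) != new.get(p))


# Hypothetical state: 8250.c was edited and the index updated;
# Makefile was edited but the index was not updated.
tree = {"drivers/serial/8250.c": "v1", "Makefile": "m1"}
index = {"drivers/serial/8250.c": "v2", "Makefile": "m1"}
work = {"drivers/serial/8250.c": "v2", "Makefile": "m2"}


def diff_files_like(index, work):
    """Index vs working files: never needs to read a tree object."""
    return diff(index, work)


def diff_cache_like(tree, index):
    """Tree vs index: needs the whole tree, since the index covers it all."""
    return diff(tree, index)
```

Here diff_files_like(index, work) reports only Makefile, and
diff_cache_like(tree, index) reports only drivers/serial/8250.c, which is
why a tool that hides the index cannot simply use the first one.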

			Linus
Received on Fri Jul 15 01:52:10 2005