Re: [PATCH] Use a hashtable for objects instead of a sorted list

From: Junio C Hamano <junkio@cox.net>
Date: 2006-02-12 13:46:09

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> In a simple test, this brings down the CPU time from 47 sec to 22 sec.

I was planning to take Alexandre's patch, but the approach your
patch takes feels more correct -- it scales with the number of
objects you need to handle, instead of having fixed 256
hashbuckets.

BTW, your version dumped core in hashtable_index immediately
after I started "git-rev-list --objects HEAD".  How did you get
_any_ CPU time?

I am not sure expecting that object name pointers are always
(unsigned int *) aligned as your patch does is OK.  We may want
to have something like the attached patch on top of yours.
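
To illustrate the alignment concern outside of the patch, here is a
minimal standalone sketch. The helper name hash_prefix is made up for
this example and is not in object.c. It folds the leading bytes of an
object name into an unsigned int one byte at a time, so it never
dereferences the name through an (unsigned int *) and works wherever
the name happens to sit in memory. A side effect is that the value no
longer depends on host byte order.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of object.c): build an unsigned int
 * from the first sizeof(unsigned int) bytes of an object name, most
 * significant byte first.  Only byte loads are used, so the pointer
 * does not need to be aligned for unsigned int.
 */
static unsigned int hash_prefix(const unsigned char *sha1)
{
	unsigned int ix = 0;
	size_t cnt;

	for (cnt = 0; cnt < sizeof(unsigned int); cnt++)
		ix = (ix << 8) | sha1[cnt];
	return ix;
}
```

Calling hash_prefix(buf + 1) on a buffer whose name starts at an odd
offset is fine with this version, whereas the cast-and-dereference
form may fault on architectures that enforce alignment.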

I am also interested to find out how much the rehashing you do
when you update obj_allocs to a larger value is costing.
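
One rough way to put a number on that is to count how many entries
get re-inserted each time the table grows. The sketch below is only an
illustration under assumptions of mine, not the code from your patch:
the struct and function names, the starting size of 32, the doubling
policy and the "grow when more than half full" rule are all made up
here. It uses linear probing and does not check for duplicates.

```c
#include <stdlib.h>

struct obj { unsigned char sha1[20]; };

/* Hypothetical table; "moved" counts entries re-inserted by grow(). */
struct table {
	struct obj **slots;
	unsigned int allocs;	/* number of slots */
	unsigned int count;	/* number of occupied slots */
	unsigned long moved;
};

static unsigned int slot_for(const struct table *t, const unsigned char *sha1)
{
	unsigned int ix = 0;
	size_t i;

	/* byte-wise, so no alignment assumption on sha1 */
	for (i = 0; i < sizeof(unsigned int); i++)
		ix = (ix << 8) | sha1[i];
	return ix % t->allocs;
}

/* Linear probing; the caller guarantees there is a free slot. */
static void place(struct table *t, struct obj *o)
{
	unsigned int i = slot_for(t, o->sha1);

	while (t->slots[i])
		i = (i + 1) % t->allocs;
	t->slots[i] = o;
}

/* Double the table (assumed policy) and re-insert every old entry. */
static int grow(struct table *t)
{
	struct obj **old = t->slots;
	unsigned int old_allocs = t->allocs, i;
	unsigned int new_allocs = old_allocs ? old_allocs * 2 : 32;
	struct obj **fresh = calloc(new_allocs, sizeof(*fresh));

	if (!fresh)
		return -1;
	t->slots = fresh;
	t->allocs = new_allocs;
	for (i = 0; i < old_allocs; i++) {
		if (old[i]) {
			place(t, old[i]);
			t->moved++;
		}
	}
	free(old);
	return 0;
}

/* Grow when the table would become more than half full (assumed). */
static int insert(struct table *t, struct obj *o)
{
	if (2 * (t->count + 1) > t->allocs && grow(t) < 0)
		return -1;
	place(t, o);
	t->count++;
	return 0;
}
```

With doubling, the total number of re-insertions stays below twice
the number of objects inserted, so the amortized cost per insert is
constant. A table that grows by a fixed increment instead would
re-insert far more entries overall, which is the case worth measuring.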

Alexandre, if you have a chance, could you try Johannes' patch
on your workload to see if it works OK for you?

-- >8 --
[PATCH] do not assume object name pointers are uint aligned.

Also fix an obvious bug that caused it to dump core at my first
attempt.  There might be others but I did not actively look for
them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/object.c b/object.c
index 3259862..59e5e36 100644
--- a/object.c
+++ b/object.c
@@ -13,17 +13,24 @@ int track_object_refs = 1;
 
 static int hashtable_index(const unsigned char *sha1)
 {
-	unsigned int i = *(unsigned int *)sha1;
-	return (int)(i % obj_allocs);
+	int cnt;
+	unsigned int ix = *sha1++;
+
+	for (cnt = 1; cnt < sizeof(unsigned int); cnt++) {
+		ix <<= 8;
+		ix |= *sha1++;
+	}
+	return (int)(ix % obj_allocs);
 }
 
 static int find_object(const unsigned char *sha1)
 {
-	int i = hashtable_index(sha1);
+	int i;
 
 	if (!objs)
 		return -1;
 
+	i = hashtable_index(sha1);
 	while (objs[i]) {
 		if (memcmp(sha1, objs[i]->sha1, 20) == 0)
 			return i;


Received on Sun Feb 12 13:46:50 2006
