Daniel Barkalow wrote: > On Mon, 12 Sep 2005, Chuck Lever wrote: > > >>For now, we simply replace indpos with a cache cursor. Likely more >>changes will be needed after we successfully replace the cache array >>with an abstract data type. > > > The right order is probably to add the concept of a cache that isn't the > one that normal functions deal with, have read_cache_unmerged return such > a thing, call cc_init with that, and rip out all of the removal and > position adjustment code. Then read_tree won't care at all about the > internal structure of the cache type, and it can be replaced without any > problem. ok, i've done this. read_cache_unmerged now reads into a separate cache, and read-tree.c does the merge by moving the appropriate cache entries into the active cache. the linked list prototype is done, and works correctly. this validates the new cache cursor API. unfortunately because finding a name is now O(n), many things are slower than before (but i expected this would be the case for lists). the next step is to try out more sophisticated data types. we have three on the table so far: 1. linus' sparse hyperlist implementation. i suspect this will have the same bad performance characteristics as a standard linked list. 2. self-balancing trees. a splay tree is a good example. we can reduce the size of the tree by storing all stages of a name in each node. kernel source is about 18K files, which means we can find names in about 15 steps, on average. 3. hash table, hashing on ce->name. similar to a Python dictionary. with an 8 kilobucket hash table and a good hash function, we can store the kernel source, finding names in two or three steps on average. are there others? - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
This archive was generated by hypermail 2.1.8 : 2005-09-15 01:36:50 EST