Re: Hash collision count

From: Petr Baudis <pasky@ucw.cz>
Date: 2005-04-24 09:46:37
Dear diary, on Sun, Apr 24, 2005 at 01:20:21AM CEST, I got a letter
where Jeff Garzik <jgarzik@pobox.com> told me that...
> Second, in your scenario, it's highly unlikely you would get 4 billion 
> sha1 hash collisions, even if you had the disk space to store such a git 
> database.

It's highly unlikely you would get a _single_ collision.

> First, the hash is NOT unique.
> 
> Second, you lose data if you pretend it is unique.  I don't like losing 
> data.

*sigh*

We've been through this before, haven't we?

> Third, a data check only occurs in the highly unlikely case that a hash 
> already exists -- a collision.  Rather than "trillions of times", more 
> like "one in a trillion chance."

No, a collision is pretty common thing, actually. It's the main power of
git, actually - when you do read-tree, modify it and do write-tree
(typically when doing commit), everything you didn't modify (99% of
stuff, most likely) is basically a collision - but it's ok since it
just stays the same.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Received on Sun Apr 24 09:46:45 2005

This archive was generated by hypermail 2.1.8 : 2005-04-24 09:46:45 EST