Re: git-rev-list: "--bisect" flag

From: Jon Seymour <jon.seymour@gmail.com>
Date: 2005-06-20 05:55:19
On 6/20/05, Linus Torvalds <torvalds@osdl.org> wrote:
> 
> 
> On Mon, 20 Jun 2005, Jon Seymour wrote:
> >
> > Would I be correct in stating that an intuitive  reason why your
> > algorithm is better than selecting the linear middle is the following:
> >
> > If you concentrate on testing merges, rather than non-merges, the
> > chances are you are going to eliminate N-times as many possible good
> > commits as if you pick a random commit, where N is the average fan-out
> > of the commit graph.
> 
> No. You really shouldn't concentrate on merges either.
> 
> The thing is, you do _not_ want to test as many commits as possible, or as
> few commits as possible.
> 
> This is _not_ a "try to eliminate as many commits as possible" problem.
> It's really one of trying to eliminate _half_ the commits. Not more, not
> less.
> 

Yep, ok, so the node you are looking for is one that can reach as
close to half of the rest of the graph as possible - that's what you
mean by half-way reachability. If the graph was N nodes deep (no
fan-out) that would be the literal middle. If it was N nodes wide
(e.g. fanout N), there is no good node and you basically have to test
everything since one test doesn't imply anything about the other N-1
nodes.

A typical commit graph is worse than O(log2 N) by a factor that is
determined by some measure of the parallel branching in the graph.

> 
> The reason my algorithm is so horrid (O(3)) is that you can't even cache
> the dang thing sanely: you can't optimize if by remembering how many
> interesting commits are reachable from one commit, and then doing "this
> commit reaches commits X and Y, so I can reach X.count + Y.count commits
> from it". That's just not true, because often (basically always) there is
> lots of common commits in X.count and Y.count..

I presume  you mean O(N^3)?
 
Will you accept a patch that can reduce the worst case cost to some
lower order? I have a hunch (conceivably wrong, of course) that it can
be done in O(N).

> > This compares with my naive literal middle algorithm (measurements
> > only for the first 619 commits):
> >
> > average(21), median(16), stdev(10.6), max(49), min(8).
> 
> Yes. I really don't believe you can do better than 12, unless you start
> depending on knowledge of the distribution of bugs.
> 

Agreed. 

jon.
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Received on Mon Jun 20 05:55:43 2005

This archive was generated by hypermail 2.1.8 : 2005-06-20 05:55:44 EST