Re: Kernel headers git tree

From: David Woodhouse <dwmw2@infradead.org>
Date: 2006-07-14 20:23:46
On Thu, 2006-07-13 at 22:16 -0700, Linus Torvalds wrote:
> So to give an example, let's say we have
> 
>         HEAD  ->     A
>                     / \
>                    B   C
>                   / \   \
>                  D   E   F
>                   \ /   / \
>                    G   H  I
>                   .......
> 
> the difference between --topo-order and the default ordering for
> 
>         git rev-list HEAD
> 
> is most visible for commit 'G'.
> 
> For --topo-order, we guarantee that before we show 'G', we _will_ have 
> shown both 'D' and 'E'. In other words, --topo-ordering guarantees that it 
> shows _all_ children before it shows the parent.

Ah, OK. Then it should probably be fine. I'll talk myself through it...

We're building a parallel graph of commits, containing a _subset_ of the
commits in the master tree -- only those which touch certain files. 

For each 'interesting' commit X, we create a corresponding commit X' in
the slave tree -- we create the corresponding tree object, and we also
recursively create its parent commits -- replacing each parent in the
original commit X with the slave-tree equivalent of the closest
_interesting_ ancestor commit. It's that "closest interesting ancestor"
which we're finding with the 'rev-list --max-count-1 -- myfile'
invocation. 

The extract-khdrs-stage2.sh script is a simple example of this, and
differs from the other script mostly in the way that it creates the
_tree_ objects.

So working from your example above, and assuming that only commits I and
E actually change the files we care about. This means that merges A, B
and F are _also_ going to show up in the output of 'rev-list -- myfile'.

So the slave tree will look like this:

        A'
       / \
      B'  F'
      |   |
      E'  I'

The interesting case, if I'm trying to convince myself that my 'slave'
tree is always going to have the correct topology, is when a merge
commit is _missing_ from the rev-list output -- for example, if commits
D and E in your original tree both make the _same_ change, then I
believe that the merge commit B will no longer show up, because 'myfile'
is identical in B and in both of its parents.

In that case, we accept that the representation isn't going to be
perfect -- the left-hand parent of A' is going to appear to be _either_
D' or E', but not B'. In fact, since D' and E' are _identical_ as far as
we're concerned, it doesn't really matter which is chosen. The other one
of the two becomes an unused branch with no children -- we end up with a
graph looking like this. 

      A'
     / \
D'  E'  F'
  \/    |
        I'

... and the parent of D' and E' is the closest ancestor of G which
actually touches the files we care about, of course.

All we care about, in this case, is that the first commit listed by
rev-list is _either_ D or E, and not something further down the tree.
And that's obviously true from your description of the 'weak ordering',
so yes -- it does look like I can drop the '--topo-order'. Thanks.

(It would actually be quite nice if I _could_ find a cheap way to
include commit B' in that final example, but it's such a rare case and
it would be so expensive to do it that I don't think it's worth
pursuing.)

-- 
dwmw2

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Received on Fri Jul 14 20:24:46 2006

This archive was generated by hypermail 2.1.8 : 2006-07-14 20:25:21 EST