Re: Change set based shallow clone

From: Linus Torvalds <torvalds@osdl.org>
Date: 2006-09-10 03:33:03
On Sat, 9 Sep 2006, Marco Costalba wrote:
> 
> Perhaps this is total idiocy, but why not implement the fix-up logic
> directly in git-rev-list?

It's possible in theory, but in practice it's not worth it.

Why?

Because you really want the _asynchronous_ nature of having a separate 
user, that only shows _partial_ results.

In other words, we could reasonably easily make git-rev-list do something 
like

 - output all revisions in the normal non-topological ordering

 - when git-rev-list notices a topological sort error, it outputs a 
   "FIXME" line, and restarts the whole thing - printing out the commits 
   in the newly fixed ordering - and does it right the next time around

 - it then continues doing this until it's totally exhausted the whole 
   commit list and has done one final output in the proper topological 
   ordering.
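
As a rough illustration of the scheme in the list above, here is a toy 
Python sketch. It is not git code: the names restarting_rev_list and 
topo_sort, the "FIXME" marker handling, and the (commit, parents) input 
format are all assumptions made up for this example.

```python
import heapq

def topo_sort(commits, parents):
    """Order commits so every child precedes its parents.

    Ties are broken by the original arrival order, so the result stays
    as close as possible to what was already printed.
    """
    index = {c: i for i, c in enumerate(commits)}
    known = set(commits)
    # pending[p] = number of known children of p not yet placed
    pending = {c: 0 for c in commits}
    for c in commits:
        for p in parents[c]:
            if p in known:
                pending[p] += 1
    heap = [index[c] for c in commits if pending[c] == 0]
    heapq.heapify(heap)
    out = []
    while heap:
        c = commits[heapq.heappop(heap)]
        out.append(c)
        for p in parents[c]:
            if p in known:
                pending[p] -= 1
                if pending[p] == 0:
                    heapq.heappush(heap, index[p])
    return out

def restarting_rev_list(arrivals):
    """Yield output lines for (commit, parents) pairs in arrival order.

    When a commit arrives after one of its parents was already printed,
    print "FIXME" and re-print everything seen so far in fixed order.
    """
    parents = {}
    order = []        # everything printed so far, in best-known order
    emitted = set()
    for commit, ps in arrivals:
        parents[commit] = list(ps)
        if any(p in emitted for p in ps):
            order = topo_sort(order + [commit], parents)
            yield "FIXME"
            yield from order
        else:
            order.append(commit)
            yield commit
        emitted.add(commit)

# Hypothetical history: D is a child of B but arrives after B was printed.
arrivals = [("A", ["B"]), ("B", ["C"]), ("D", ["B"]), ("C", [])]
lines = list(restarting_rev_list(arrivals))
print(lines)
```

In this toy run the late arrival of D forces one restart, and everything 
printed after the last "FIXME" is a complete topological ordering.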

Possible? Yes.

BUT:

 - as long as git-rev-list is entirely single-threaded (and trust me, it's 
   going to be that, because otherwise we'd be better off doing it in a 
   separate process - like gitk), this means that it will be _entirely_ 
   unaware of what has actually been _shown_, so it will restart a LOT 
   more than the external interactive process would do. So it would be 
   much worse than doing it externally and knowing what you've actually 
   shown to the user (if you haven't shown the bad thing yet, there's no 
   reason to restart).

 - Again, as long as it's single-threaded, git-rev-list will block once it
   has filled up the pipeline between the processes, so instead of parsing 
   everything in parallel with the "show it all", it will synchronize with 
   the showing process all the time, and especially so when it needs to 
   re-show the stuff that it already sent once. So it's also fairly 
   inefficient.
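
The first point above (an external viewer knows what it has actually 
shown, so it can skip needless restarts) could be sketched like this. 
The Viewer class and its fields are hypothetical and only model the 
decision; this is not how gitk is written.

```python
class Viewer:
    """Toy model of an interactive consumer of rev-list output."""

    def __init__(self):
        self.rows = []     # commits received, in best-known order
        self.shown = 0     # how many leading rows are already on screen
        self.redraws = 0   # how often the visible part had to be redone

    def show_more(self, n):
        """Draw up to n more rows from what has been received."""
        self.shown = min(len(self.rows), self.shown + n)

    def receive(self, commit, position=None):
        """Append a commit, or insert it at `position` as an ordering fix."""
        if position is None:
            self.rows.append(commit)
            return
        self.rows.insert(position, commit)
        # Only a fix that lands inside the already-drawn rows needs a redraw.
        if position < self.shown:
            self.redraws += 1

v = Viewer()
for c in ["A", "B", "C", "D"]:
    v.receive(c)
v.show_more(2)                 # only A and B are on screen
v.receive("X", position=3)     # fix below the visible rows: no redraw
v.receive("Y", position=1)     # fix inside the visible rows: redraw
print(v.rows, v.redraws)
```

A producer that cannot see `shown` would have to treat both fixes as 
restarts, which is the extra cost described above.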

However, what you seem to imply is something different:

> Where, while git-rev-list is working _without sorting the whole tree
> first_, when it finds an out of order revision it stores it in a fixup-list
> buffer and *at the end* of normal git-rev-list the buffer is flushed
> to receiver, so that the drawing logic does not change and the out of
> order revisions arrive at the end, already packed, sorted and prepared
> by git-rev-list.

But this is exactly what we already do. We flush things *at the end* 
because that's when we actually know the ordering. And that's exactly why 
"git-rev-list --topo-order" has a latency ranging from a few seconds to 
a few minutes for large projects (depending on whether they are packed or 
not).

The "wait for the end" is _not_ good, exactly because the end will take 
some time to arrive. The whole point is to start outputting the data 
early, and that BY DEFINITION means that the order of revisions isn't 
guaranteed to be in topological order.
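
To make the trade-off concrete, here is a toy Python model of streaming 
commits by date as soon as they are reached. The HISTORY graph, its 
timestamps, and the function names are invented for illustration; this 
is a sketch of the idea, not git's traversal code.

```python
import heapq

# Hypothetical history: commit -> (timestamp, parents).
# C carries a skewed clock: it is dated earlier than its parent B.
HISTORY = {
    "E": (50, ["D", "C"]),
    "D": (40, ["B"]),
    "C": (10, ["B"]),
    "B": (30, ["A"]),
    "A": (5, []),
}

def stream_by_date(history, tip):
    """Yield commits newest-first as they are discovered, without waiting."""
    heap = [(-history[tip][0], tip)]
    seen = {tip}
    while heap:
        _, commit = heapq.heappop(heap)
        yield commit                      # output starts immediately
        for p in history[commit][1]:
            if p not in seen:
                seen.add(p)
                heapq.heappush(heap, (-history[p][0], p))

def is_topological(order, history):
    """True if every commit appears before all of its listed parents."""
    pos = {c: i for i, c in enumerate(order)}
    return all(pos[c] < pos[p]
               for c in order for p in history[c][1] if p in pos)

order = list(stream_by_date(HISTORY, "E"))
print(order, is_topological(order, HISTORY))
```

Here B is printed before its child C, because C had not yet reached the 
front of the queue when B was the newest commit known. Avoiding that 
requires knowing the rest of the graph, which is the wait described above.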

		Linus
Received on Sun Sep 10 03:33:38 2006
