Re: The merge from hell...

From: Linus Torvalds <torvalds@osdl.org>
Date: 2006-02-05 07:59:59
On Sat, 4 Feb 2006, Linus Torvalds wrote:
> 
> Doing a 
> 
> 	git-rev-list --parents HEAD |
> 		egrep '^.{90}' |
> 		cut -d' ' -f1 | 
> 		git-diff-tree --pretty --cc --stdin |
> 		less -S
> 
> on the kernel is actually interesting. It's interesting because it shows 
> that out of 1391 merges, in the kernel, only _19_ actually had these close 
> calls. Some - but certainly not all - of them actually did need manual 
> fixup.
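
The `egrep '^.{90}'` stage in that pipeline keeps only merge commits: with 40-character commit ids, a line holding a commit and a single parent is 81 characters long, so any line of 90 or more characters has at least two parents. A minimal sketch of the same selection, assuming a current git (where the command is spelled `git rev-list`) and a made-up throwaway repository whose file and branch names are purely illustrative:

```shell
#!/bin/sh
# Sketch only: throwaway repository, hypothetical file and branch names.
set -e

# Identity for the throwaway commits (made-up values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

cd "$(mktemp -d)"
git init -q
# Pin the branch name so the script does not depend on local defaults.
git symbolic-ref HEAD refs/heads/master

echo "a" > a
echo "b" > b
git add a b
git commit -q -m "Initial"
git branch other

# One commit on each side, touching different files, so the merge is clean.
echo "a2" > a
git commit -q -m "change a" a
git checkout -q other
echo "b2" > b
git commit -q -m "change b" b
git checkout -q master
git merge -q -m "Clean merge" other

# Merges are the lines with more than one parent, i.e. more than two fields.
by_fields=$(git rev-list --parents HEAD | awk 'NF > 2 { print $1 }')

# The same set, asking rev-list for merges directly.
by_flag=$(git rev-list --merges HEAD)

echo "merges by field count: $by_fields"
echo "merges by --merges:    $by_flag"
```

Counting fields instead of characters avoids depending on the length of a commit id; either list can be fed to `git diff-tree --cc --stdin` the same way.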

There are some doubly interesting lessons when I looked closer.

In particular, some merges that needed manual fixups do _not_ show up. I 
found that surprising, at first. I expected that if I had to fix something 
manually, it would obviously show up in the "--cc" output.

Not so.  In fact, the one I looked closer at didn't show up even in the 
"long" version, aka "-c".

The reason? A lot of the manual fixups end up selecting one version or the 
other - the clash is because two people fix the same bug slightly 
differently, and the manual merge will end up just selecting one of them. 
So then even "-c" won't show it, because it will notice that the whole 
file was actually the same as one of the branches merged.

That may be a bit non-intuitive (maybe it shouldn't be, and it was just me 
who didn't think about it the right way when I was initially surprised), 
but it was definitely the right thing (both from a merge standpoint _and_ 
from a "what happened" standpoint) in the cases I looked at. The merge may 
have been manual, but the end _result_ was trivial, and thus isn't shown.
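
The "select one side" case can be reproduced in a throwaway repository. This is only a sketch, assuming a current git where the commands are spelled `git init` and `git diff-tree`; the file name `fix.c` and its contents are made up for illustration and are not one of the kernel merges.

```shell
#!/bin/sh
# Sketch only: a manual merge whose result equals one of the parents.
set -e

# Identity for the throwaway commits (made-up values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

cd "$(mktemp -d)"
git init -q
# Pin the branch name so the script does not depend on local defaults.
git symbolic-ref HEAD refs/heads/master

echo "buggy" > fix.c
git add fix.c
git commit -q -m "Initial"
git branch other

# Two people fix the same bug slightly differently.
echo "fixed one way" > fix.c
git commit -q -m "fix A" fix.c
git checkout -q other
echo "fixed another way" > fix.c
git commit -q -m "fix B" fix.c

# The merge clashes and needs a manual fixup...
git checkout -q master
git merge -m "Clashing merge" other >/dev/null 2>&1 || true

# ...but the fixup just selects one of the two versions unchanged.
echo "fixed one way" > fix.c
git add fix.c
git commit -q -m "Manual merge, keep fix A"

# Paths the combined diff considers changed relative to all parents.
combined=$(git diff-tree -c --no-commit-id --name-only HEAD)
echo "paths in combined diff: '$combined'"
```

Because the merged `fix.c` is identical to the version in the first parent, the combined diff should have nothing to list for it, even though the merge itself was done by hand.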

So even after looking at it more, and searching for "interesting" cases 
from the other side, I really like the current git-diff-tree --cc output. 
It sometimes shows you things you wouldn't expect, and it sometimes 
doesn't show you things you'd expect to show up, but in both cases it 
shows/avoids the _right_ things.

However, the point that a "diff" itself isn't totally unambiguous is well 
taken. You're right that the very first hunk of the 12-way merge is really 
not interesting.

However, looking at the other cases, it seems to not really be a huge 
problem - that seems to be the only case in the whole kernel - and the 
git-diff-tree algorithm may show an unnecessary hunk once in a blue moon, 
but that's better than having the heuristics fail the other way around (ie 
not showing a hunk).

That's what the gitk problem was, btw (showing too little, not too much). 
Current gitk fails on this trivial case:

	mkdir test-merge
	cd test-merge/
	git-init-db 

	#
	# Initial silly contents
	#
	echo "Hello" > hello
	echo "Hi" > hi
	git add hello hi
	git commit -m "Initial"

	#
	# Create another branch
	#
	git branch other

	#
	# Edit the contents on the master branch,
	# commit it.
	#
	echo "Hello there" > hello
	git commit -m "first change" hello

	#
	# Edit/commit the other differently
	#
	git checkout other
	echo "Hello differently" > hello
	git commit -m "second change" hello

	#
	# Try to merge - this will fail
	#
	git checkout master
	git merge "Clashing merge" HEAD other

	#
	# Do an evil merge conflict that also edits a 
	# nonconflicting file
	#
	echo "Hello third version" > hello
	echo "Hidden hi change" > hi 
	git commit -m "Evil merge conflict" hello hi

At this point, "git-diff-tree --cc HEAD" shows exactly the right thing, 
but "gitk" doesn't show anything at all for that merge (it shows the 
"hello" file in the file pane, but no diff at all, and certainly not the 
"hi" file which it _should_ show).

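The commands in that test case use the spellings of the time (`git-init-db`, `git merge <msg> HEAD <branch>`, committing named paths in the middle of a merge). As a rough sketch, the same sequence for a current git might look like the following; the temporary directory, the commit identity and the final `--name-only` check are assumptions added here, not part of the original test.

```shell
#!/bin/sh
# Sketch only: the evil-merge test case, respelled for a current git.
set -e

# Identity for the throwaway commits (made-up values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.invalid"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.invalid"

cd "$(mktemp -d)"
git init -q
# Pin the branch name so the script does not depend on local defaults.
git symbolic-ref HEAD refs/heads/master

# Initial silly contents
echo "Hello" > hello
echo "Hi" > hi
git add hello hi
git commit -q -m "Initial"

# Create another branch
git branch other

# Edit the contents on the master branch, commit it.
echo "Hello there" > hello
git commit -q -m "first change" hello

# Edit/commit the other differently
git checkout -q other
echo "Hello differently" > hello
git commit -q -m "second change" hello

# Try to merge - this is expected to stop with a conflict in "hello"
git checkout -q master
git merge -m "Clashing merge" other >/dev/null 2>&1 || true

# Do an evil merge that also edits a nonconflicting file
echo "Hello third version" > hello
echo "Hidden hi change" > hi
git add hello hi
git commit -q -m "Evil merge conflict"

# Paths that the combined diff reports for the merge.
shown=$(git diff-tree --cc --no-commit-id --name-only HEAD)
echo "$shown"
```

Both `hello` and `hi` differ from every parent of the merge, so both should be listed; dropping `--name-only` gives the full combined patch to compare against what gitk displays.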
		Linus
Received on Sun Feb 05 08:00:50 2006