>>>>> "LT" == Linus Torvalds <torvalds@osdl.org> writes:

LT> Namely that read-tree doesn't have a frigging clue about renames, and
LT> shouldn't have.

LT> But a real merge program _could_ have a frigging clue, and might notice
LT> patterns like

LT>  - file got modified in one branch, removed in the other
LT>  - a file got added in the other branch
LT>  - "Hey, that added file looks like the removed one!"
LT>  - Let's merge the modifications from the first branch into the move of
LT>    the second branch!

LT> Now, you can (validly) argue that you could still just look at the
LT> original trees ("git-diff-tree -C $O $M") and grep for copies/movement and
LT> do it by hand _there_ instead of looking at the result of the read-tree,
LT> and you may well be right. So again, this is not a _fundamental_ problem,
LT> although it's a bit more fundamental than the first one.

My knee-jerk reaction was "No, I would refuse to make that
argument, because making the merge mechanism examine trees
itself would take us full-circle back to where we started
[*1*]".

I agree we can, as the zeroth order approximation, run two
"diff-tree -B --find-copies-harder -C" [*3*, *4*] on the (O,A)
and (O,B) pairs, and compare their output to cover the rename
case [*2*] you described. I think we can also write a simple
program that reads an unmerged index file and does the
equivalent of these two diff-tree commands.

However, what I suspect will happen in practice is that the
lines of development leading to A and B may have modified those
renamed or copied files so much since they forked at O that we
may not recognize renames or copies as such by only looking at
(O,A) and (O,B). In order to do a reasonable job while merging,
we may end up needing to run "diff-tree --stdin -B -C" on the
output of "rev-list O A" to fully follow the rename/copy trail
[*5*].
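As a rough illustration of the pattern Linus lists above (modified
in one branch, removed in the other, and a similar-looking file
added in the other), here is a minimal Python sketch. It is not how
"diff-tree -C" scores similarity. The dict-of-contents tree
representation, the name find_rename_candidates, and the 0.5
threshold are all assumptions made up for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two file contents."""
    return SequenceMatcher(None, a, b).ratio()


def find_rename_candidates(base, ours, theirs, threshold=0.5):
    """Guess renames made in `theirs` for files modified in `ours`.

    Each tree is a hypothetical {path: content} dict standing in for
    O (base), A (ours) and B (theirs). The threshold is an arbitrary
    assumption, not a value taken from git.
    """
    # Paths that exist in theirs but not in the merge base were added there.
    added = [path for path in theirs if path not in base]
    candidates = []
    for path, base_text in base.items():
        modified_in_ours = path in ours and ours[path] != base_text
        removed_in_theirs = path not in theirs
        if not (modified_in_ours and removed_in_theirs):
            continue
        # Pick the added file that looks most like the removed one.
        best = max(
            added,
            key=lambda p: similarity(base_text, theirs[p]),
            default=None,
        )
        if best is not None and similarity(base_text, theirs[best]) >= threshold:
            candidates.append((path, best))
    return candidates
```

A merge driver could then apply the (O,A) modification of the old
path on top of the new path in B. Note that this compares only
the three trees, so it inherits the limitation discussed in this
message: if the renamed file changed a lot along the way, the
similarity check against O alone may miss it.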
What all this means is that the simple three-stage information
read-tree -m gives us, which is about only three trees, might
not be enough to handle renames and copies intelligently when
we need to deal with a pair of trees that have diverged for too
long.

Once we go down this path, arguing against making "read-tree
-m" results useless for such an intelligent merge logic (because
it forces the merge logic to look at the trees and commits
involved) ceases to make much sense, because such an intelligent
merge logic needs to look at more than three trees _anyway_.

What "read-tree -m" gives us, while being very efficient,
elegant and effective in the "merge small and merge often" use
pattern we recommend, may not be so useful for implementing such
an intelligent merge logic, and instead we would do better if we
did it the hard way by inspecting individual commits. I do not
have a problem with that approach. It would be a much
longer-term project, though. So, yes, I ended up arguing that
the intelligent merge logic could and probably needs to look at
the trees involved ;-).

Among the three-way cases, the only one I think may make a
practical difference is case #5ALT, which deals with the "a file
added identically in both branches" case. This is what happens
when a widely accepted patch has been applied independently to
both trees recently (eh, "since they forked"). New files tend
to get updated more often, and allowing the file to be locally
modified, instead of failing the merge in the read-tree phase,
would help the workflow.

If the file were modified in the user's repository and checked
in, then the current 3-way merge code cannot help the user that
much, because we would be in the !O && A && B && A!=B situation.
I have a suspicion that we could probably help this case by
looking at not just the merge base but the edge commits as well.

I consider #14ALT an improvement, but at the same time I doubt
that particular one would make much practical difference.
It is more or less a "while we are at it" kind of change.

All others, including the "remove" cases (I botched -u but as
you point out it is correctable), do not contribute to loosening
the index requirements, but I suspect they might help me later
unify two-way fast forward and three-way merge. Yes, I am still
looking at "read-tree -m H I-mixed-with-H M" that emulates
"read-tree H M".

[Footnotes]

*1* Remember the merge-trees Perl script, which I did before you
invented the multi-stage read-tree? Boy, it feels like such a
distant past...

*2* A casual reader may notice that we are arguing about renames
after both of us publicly stated that "renames do not matter".
Here is a clarification. We both consider that "recording
renames at commit time" does not matter, but we do take
"tracking and handling the renames" seriously. There is a
difference.

*3* Oops. There is no --find-copies-harder yet ;-).

*4* This would be further helped if we had a --show-rename-only
diffcore filter. The operation is similar to the pickaxe, but
it would prune changesets down to only renames and copies. I
actually wrote and threw away such a filter back when I was
trying to find good test cases in the linux-2.6 repository.

*5* And the development line leading to A or B may not even be
linear, in which case it may be easier to first decompose the
chain between O and A into individual epochs. Jon Seymour's
"rev-list --merge-order O A" would be very handy for this.

Received on Fri Jun 10 06:55:27 2005