Re: Unexpected behavior in git-rev-list

From: Peter Eriksen <s022018@student.dtu.dk>
Date: 2005-09-22 02:49:48
On Sun, Sep 18, 2005 at 07:58:50PM +0200, Peter Eriksen wrote:
...
> So my new challenge to myself: Given two commit objects A and B list all
> the tree and blob objects which are not in both A and B. 

$ git-diff-tree -t A B

> 
> After that I think writing a command which does the same as 
> 'cvs annotate' would be a good exercise.

Ok, I have a prototype.  The algorithm has three steps:

1) traverse the commit DAG in breadth first order
2) for each commit in 1) find the diff against HEAD
3) for each diff from 2) accumulate the lines and
   commit ids that latest affected the current HEAD
4) print the commit ids found in 3), one for each
   line in HEAD.   

The command is used like this:
$ git-annotate-script.sh Documentation/git.txt >blame
$ cat blame
51017101c7a308745ba3c04944457f1dc6a55780
51017101c7a308745ba3c04944457f1dc6a55780
3db6b224cf36748b969acdd96b9fb2de641cd641
51017101c7a308745ba3c04944457f1dc6a55780
51017101c7a308745ba3c04944457f1dc6a55780
...
51017101c7a308745ba3c04944457f1dc6a55780
51017101c7a308745ba3c04944457f1dc6a55780
51017101c7a308745ba3c04944457f1dc6a55780
51017101c7a308745ba3c04944457f1dc6a55780
$ paste -d ' ' blame - <Documentation/git.txt

The script runs in about 6 seconds on my machine.

Any comments? 

Regards,

Peter



diff --git a/git-annotate-bfs.pl b/git-annotate-bfs.pl
new file mode 100755
--- /dev/null
+++ b/git-annotate-bfs.pl
@@ -0,0 +1,35 @@
+#!/usr/bin/env perl
+
+# 1) Bredde-først-søgning
+# 2) For hvert commit A i 1) find diff(parent(A), HEAD)
+# 3) Byg anklage-tabellen og skriv den ud.
+
+use strict;
+use warnings;
+
+my $v0 = $ARGV[0];
+
+my @Q;     # BFS helper queue of commit ids.
+my %C;     # BFS helper colours table. C[commit id] = colour.  1=grey,
2=black.
+my $v;
+
+$C{v0} = 1;
+push @Q, $v0;
+while (@Q) {
+    $v = shift @Q;
+    #print "$v\n"; ### DEBUG
+    open(PARENTS, "git-rev-list --parents --max-count=1 $v |");
+    chomp(my $commits = <PARENTS>);
+    close PARENTS;
+    my @parents = split(' ', $commits);
+    shift @parents;
+    my $v1;
+    foreach $v1 (@parents) {
+        #print ".";  ## DEBUG
+        if (not defined($C{$v1})) {
+            $C{$v1} = 1;
+            push @Q, $v1;
+        }
+    }
+    print "$commits\n"
+}
diff --git a/git-annotate-diff.sh b/git-annotate-diff.sh
new file mode 100755
--- /dev/null
+++ b/git-annotate-diff.sh
@@ -0,0 +1,21 @@
+#!/bin/sh
+
+# 1) Bredde-først-søgning
+# 2) For hvert commit A i 1) find diff(parent(A), HEAD)
+# 3) Byg anklage-tabellen og skriv den ud.
+
+FILEPATH=$1
+
+while read COMMITS; do
+    COMMIT=${COMMITS:0:40}
+    echo blame $COMMIT
+    for PARENT in ${COMMITS#* }; do
+       # Hvis HEAD har ændret sig i forhold til $PARENT,
+       # skyldes det ikke $PARENT, men et senere commit,
+       # nemlig ${COMMITS:0:40}
+       DIFF=`git-diff-tree -r -m $PARENT $COMMIT $FILEPATH`
+       if [ -n "$DIFF" ]; then
+           git diff $PARENT HEAD $FILEPATH
+       fi
+    done
+done
diff --git a/git-annotate-script.sh b/git-annotate-script.sh
new file mode 100755
--- /dev/null
+++ b/git-annotate-script.sh
@@ -0,0 +1,11 @@
+#!/bin/sh
+
+FILEPATH=$1
+
+BLOB=`git-ls-files --stage | grep $FILEPATH | cut -c8-47`
+LENGTH=`git-cat-file blob $BLOB | wc -l`
+
+git-annotate-bfs.pl HEAD |   \
+#cut -c1-40 | git-diff-tree -r -m --stdin rev-list.c | grep "^[^:]" \
+git-annotate-diff.sh $FILEPATH | \
+git-annotate-table.pl $LENGTH
diff --git a/git-annotate-table.pl b/git-annotate-table.pl
new file mode 100755
--- /dev/null
+++ b/git-annotate-table.pl
@@ -0,0 +1,41 @@
+#!/usr/bin/env perl
+
+use strict;
+use warnings;
+
+
+# 1) Bredde-først-søgning
+# 2) For hvert commit A i 1) find diff(parent(A), HEAD)
+# 3) Byg anklage-tabellen og skriv den ud.
+
+my $len = $ARGV[0];
+
+my @T;     # Blame table. T[line number] = commit ids
+my $n = 0;
+my $blame;
+my $cln;
+my $state = "header";
+
+while (defined (my $line = <STDIN>) and $n < $len) {
+    if ($line =~ /^blame ([0-9a-fA-F]{40})/) { $blame = $1; $state =
"header"; }
+    elsif ($line =~ /^diff --git/) { $state = "header"; }
+    elsif ($line =~ /^@@ -\d+,\d+ \+(\d+),/) { $cln = $1-1; $state =
"chunks"; }
+    elsif ($state eq "chunks") {
+        if ($line =~ /^\+/) {
+            if (not defined $T[$cln]) { $T[$cln] = $blame; $n++; }
#print "$cln $blame\n"; }
+            $cln++;
+       }
+        elsif ($line =~ /^-/) { }
+        elsif ($line =~ /^ /) { $cln++; }
+        else {
+            print "line = $line\n";
+            print "state = $state\n";
+            print "cln = $cln\n";
+            die "I'm not supposed to read this line.";
+       }
+    }
+}
+
+foreach (@T) {
+    print "$_\n";
+}
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Received on Thu Sep 22 03:02:10 2005

This archive was generated by hypermail 2.1.8 : 2005-09-22 03:02:13 EST