Re: [PATCH] gitweb: use correct mime type even if filename has multiple dots.

From: Junio C Hamano <junkio@cox.net>
Date: 2006-09-17 18:41:40

Martin Waitz <tali@admingilde.org> writes:

>> > -     $filename =~ /\.(.*?)$/;
>> > -     return $mimemap{$1};
>> 
>> Actually, that is a non-greedy match, so the above code insists that
>> the extension starts at the _last_ dot.
>
> hmm, but it didn't work for me.
> I had filenames like "man/program.8.html" which got served as
> "text/html" with the old code.

It based its decision on the '.html' part, which is expected from
a non-greedy match (if I were writing the pattern, I would have
written /\.[^.]+$/ instead, though).  Are you trying to have it
behave differently between "x.8.html" and "x.html"?
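
A quick way to check what each pattern captures is to run both on
the names in question.  This is only a minimal sketch: the helper
names ext_lazy and ext_last_dot are made up for illustration, and the
second pattern adds a capture group to the /\.[^.]+$/ form so that $1
is set.  Perl looks for the leftmost match before laziness comes into
play, so the two patterns need not agree on names with more than one
dot.

```perl
use strict;
use warnings;

# Hypothetical helper: what the old gitweb pattern captures as the extension.
sub ext_lazy {
    my ($name) = @_;
    return $name =~ /\.(.*?)$/ ? $1 : undef;
}

# Hypothetical helper: what a "no dots after the last dot" pattern captures.
sub ext_last_dot {
    my ($name) = @_;
    return $name =~ /\.([^.]+)$/ ? $1 : undef;
}

for my $name ("x.html", "man/program.8.html") {
    printf "%-20s lazy=%-8s last-dot=%s\n",
        $name, ext_lazy($name), ext_last_dot($name);
}
```

For "man/program.8.html" the lazy pattern starts at the first dot and
captures "8.html", while the [^.]+ version captures "html".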

> Looking at /etc/mime.types, it only contains pcf.Z but perhaps
> it should also contain tar.gz or similar.

Probably.  But that makes me think it might be better to:

 - read in mime.types and sort the entries by suffix length
   (longest first);

 - try matching the suffixes from longest to shortest and pick
   the first match.

Without that, you would not be able to cope with a /etc/mime.types
that looks like this, no?

        application/a	a
        application/b	b.a
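
The longest-suffix-first idea can be sketched on its own, outside of
gitweb.  This is a rough illustration only, not the patch itself: the
names parse_mime_types and guess_type are hypothetical, and the
mime.types content is passed in as a string instead of being read
from a file.

```perl
use strict;
use warnings;

# Hypothetical helper: parse mime.types-style text into a suffix => type map.
sub parse_mime_types {
    my ($text) = @_;
    my %type_for;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(#|$)/;          # skip comments and blank lines
        my ($type, @exts) = split ' ', $line;  # ' ' also drops leading whitespace
        $type_for{$_} = $type for @exts;
    }
    return \%type_for;
}

# Hypothetical helper: try suffixes longest first and return the first match.
sub guess_type {
    my ($filename, $type_for) = @_;
    my @by_length = sort { length($b) <=> length($a) or $a cmp $b }
                    keys %$type_for;
    for my $ext (@by_length) {
        return $type_for->{$ext} if $filename =~ /\.\Q$ext\E$/;
    }
    return undef;
}

my $map = parse_mime_types("application/a\ta\napplication/b\tb.a\n");
print guess_type("x.b.a", $map), "\n";   # application/b
print guess_type("x.a",   $map), "\n";   # application/a
```

Sorting once keeps the lookup simple; bucketing the suffixes by length
gives the same ordering without a sort.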

Perhaps something like the attached.

---
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index a81c8d4..4994a33 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1140,23 +1140,42 @@ sub mimetype_guess_file {
 	my $filename = shift;
 	my $mimemap = shift;
 	-r $mimemap or return undef;
+	local($_);
 
-	my %mimemap;
-	open(MIME, $mimemap) or return undef;
+	open MIME, $mimemap
+	    or return undef;
+
+	# Under mod_perl caching this may make a lot of sense...
+	my @mime = ();
+	my $maxlen = 0;
 	while (<MIME>) {
-		next if m/^#/; # skip comments
-		my ($mime, $exts) = split(/\t+/);
-		if (defined $exts) {
-			my @exts = split(/\s+/, $exts);
-			foreach my $ext (@exts) {
-				$mimemap{$ext} = $mime;
+		next if /^#/;
+		chomp;
+		my ($mimetype, @ext) = split(/\s+/);
+		for (@ext) {
+			my $len = length;
+			my $map = $mime[$len];
+			if (!$map) {
+				$mime[$len] = $map = {};
+				$maxlen = $len if ($maxlen < $len);
 			}
+			# We could detect duplicate definition here... i.e.
+			# onetype	ext
+			# anothertype	ext
+			$map->{$_} = $mimetype;
 		}
 	}
-	close(MIME);
+	close MIME;
 
-	$filename =~ /\.(.*?)$/;
-	return $mimemap{$1};
+	for ($filename) {
+		for (my $len = $maxlen; 0 < $len; $len--) {
+			my $map = $mime[$len];
+			while (my ($ext, $type) = each %$map) {
+				return $type if (/\.\Q$ext\E$/);
+			}
+		}
+	}
+	return undef;
 }
 
 sub mimetype_guess {
