Re: [PATCH] Try URI quoting for embedded TAB and LF in pathnames

From: Linus Torvalds <torvalds@osdl.org>
Date: 2005-10-12 01:17:25
On Mon, 10 Oct 2005, Paul Eggert wrote:
> 
> An issue I hadn't really had time to think about is the character
> encoding of file names.

Please don't. Use filenames as if they are just binary blobs of data, 
that's the only thing that has a high chance of success. Yes, it too can 
break in the presence of something _else_ doing character translation 
and/or people moving a patch from one encoding to another, but that's 
just true of anything.

Eventually everybody will hopefully use UTF-8, and nothing else really 
matters, but the thing is, if you see filenames as just blobs of data, 
that works with UTF-8 too, so it's not "wrong" even in the long run. And 
until everybody has one single encoding, you simply won't be able to tell, 
and the likelihood that you'd screw up is pretty high.

The happy part of the "binary blob" approach is that users _understand_ 
it. People who actively use different encoding formats are (painfully) 
aware of conversions, and they may curse you for not doing the random 
encoding format of the day, but they will be able to handle it.

In contrast, if you start doing conversions, I guarantee you that people 
will _not_ be able to handle it when you do something strange - you've 
changed the data.
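
A minimal sketch of the difference, in Python for illustration only. The file name and the choice of Latin-1 versus UTF-8 are assumptions made up for this example; nothing here comes from an actual patch tool.

```python
# Hypothetical file name stored as Latin-1 bytes; 0xE9 is not valid UTF-8 here.
name = b"caf\xe9.txt"

# "Binary blob" handling: the bytes are passed through untouched.
as_blob = bytes(name)

# "Helpful" conversion: decode as UTF-8, replacing what cannot be decoded,
# then encode again. The invalid byte becomes U+FFFD, so the name changes.
converted = name.decode("utf-8", errors="replace").encode("utf-8")

print(as_blob == name)      # the blob round trip keeps the bytes
print(converted == name)    # the conversion does not
```

The second comparison is the case described above: the tool did something on the user's behalf, and the name on disk no longer matches the name in the patch.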

Personally, I'd like the normal C quoting the best. Leave space as-is, and 
quote TAB/NL as \t and \n respectively. It's pretty universally understood 
in programming circles even outside of C, and it's not like a very 
uncommon patch format like that really needs to be well-understood outside 
of those circles.

It also has a very obvious and ASCII-safe format for other characters (ie 
just the normal octal escapes: \377 etc).
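
As a rough sketch of that quoting scheme (Python used for brevity; the function names c_quote and c_unquote are hypothetical, and escaping the backslash itself is an assumption added here so the result can be decoded unambiguously):

```python
def c_quote(name: bytes) -> str:
    """Quote a file name given as raw bytes. Spaces are left as-is."""
    simple = {0x09: "\\t", 0x0A: "\\n", 0x5C: "\\\\"}  # backslash: assumption
    out = []
    for b in name:
        if b in simple:
            out.append(simple[b])
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))          # printable ASCII, including space
        else:
            out.append("\\%03o" % b)    # everything else: 3-digit octal
    return "".join(out)


def c_unquote(quoted: str) -> bytes:
    """Inverse of c_quote; assumes well-formed input produced by c_quote."""
    simple = {"t": 0x09, "n": 0x0A, "\\": 0x5C}
    out = bytearray()
    i = 0
    while i < len(quoted):
        ch = quoted[i]
        if ch != "\\":
            out.append(ord(ch))
            i += 1
        elif quoted[i + 1] in simple:
            out.append(simple[quoted[i + 1]])
            i += 2
        else:
            out.append(int(quoted[i + 1:i + 4], 8))  # three octal digits
            i += 4
    return bytes(out)


print(c_quote(b"my file\twith\nodd\xffname"))
```

The name is treated as bytes throughout, so no character encoding is ever guessed, and the quoted form is plain ASCII.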

That said, I personally don't think it's necessarily even worth it. If 
somebody wants to use names with tabs and newlines, is he really going to 
work with diffs? Or is it just a driver error?

			Linus
Received on Wed Oct 12 01:21:28 2005
