


	by jodrellblank 1639 days ago
	If you're going to move to non-ASCII characters, what about Unicode combining characters? Do you have to care whether your file named é is a single accented e or a pair of e-and-combining-accent before you can move it? If you shouldn't have to care, if there should be a layer of Unicode normalization happening, why is that okay but case normalization is not okay? If you do have to care, then you no longer "know exactly what you get".


hvdijk 1638 days ago

> If you do have to care, then you no longer "know exactly what you get".

You do know you get exactly what you put in, whether that is é (U+00E9) or é (U+0065 U+0301). When you reference files created by yourself, this is not a problem as realistically speaking, your input method will only have a convenient way of forming one of these, and it will consistently generate the same one every time. When you reference files created by someone else, this may be a problem but no more than e.g. the distinction between file.txt (lowercase L) and fiIe.txt (uppercase i): from the user's perspective, the problem is pretty much avoided by selecting the file using tab completion, TUIs, GUIs, whatever you use.
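For anyone who wants to see the two spellings side by side, here is a minimal Python sketch using only the standard `unicodedata` module. The variable names are mine, and this says nothing about what any particular file system does; it only shows that the two forms differ as code point sequences and that normalization maps them together.

```python
import unicodedata

precomposed = "\u00e9"   # é as the single code point U+00E9
decomposed = "e\u0301"   # e (U+0065) followed by combining acute accent (U+0301)

# The two strings render alike but are different code point sequences.
print(precomposed == decomposed)           # False
print(len(precomposed), len(decomposed))   # 1 2

# Normalizing both to the same form (NFC here) makes them compare equal.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)                      # True
```

A file system that stores names byte-for-byte behaves like the first comparison; one that normalizes on lookup behaves like the last.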


bayindirh 1638 days ago

> e.g. the distinction between file.txt (lowercase L) and fiIe.txt (uppercase i)...

Just a reminder of a small but extremely important point: there is absolutely no guarantee that upper("i") is I. At least one language breaks that convention, and you won't believe how big a headache that is.


hvdijk 1638 days ago

Good point: Turkish uppercases i to İ. This adds complexity to the idea of case-insensitive file systems even when only considering ASCII.
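A rough illustration in Python, whose `str.upper()` is locale-independent. The `turkish_upper` helper is hypothetical and only handles the two special letters, so treat it as a sketch of the problem rather than a correct Turkish casing routine.

```python
# Python's built-in case mapping ignores the locale.
print("i".upper())   # I
print("ı".upper())   # I  (dotless ı, U+0131, also uppercases to I)

def turkish_upper(s: str) -> str:
    """Hypothetical helper: uppercase with the Turkish i/İ and ı/I pairs."""
    # Map the dotted and dotless i first, then uppercase everything else.
    return s.replace("i", "\u0130").replace("\u0131", "I").upper()

print("file.txt".upper())         # FILE.TXT
print(turkish_upper("file.txt"))  # FİLE.TXT
```

So two systems that both "ignore case" can still disagree on whether FILE.TXT and file.txt are the same name, depending on which mapping they use.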


jodrellblank 1638 days ago

https://prnt.sc/257kxxa lowercase l and uppercase I are quite distinct. If you're going to select all your filenames with autocomplete, then why argue for restricting me to case-sensitive filenames?

Windows NTFS is sometimes described as case-preserving, case-insensitive; i.e. if you name a file "Test" it will stay in that case, but if you ask for "test" it will find the file "Test". I don't know whether that happens in Win32 or NTFS, but it seems like the best of both worlds; I don't want "case insensitive" where it could show me the name in a different case than I entered.
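To make "case-preserving, case-insensitive" concrete, here is a toy sketch in Python. The `CasePreservingDir` class is hypothetical and is not how NTFS or Win32 implements it; it uses `str.casefold()` as the comparison key, which is an assumption of mine.

```python
class CasePreservingDir:
    """Toy directory: keeps names as entered, matches them ignoring case."""

    def __init__(self):
        # Maps the case-folded name to the name as originally entered.
        self._names = {}

    def create(self, name: str) -> None:
        key = name.casefold()
        if key in self._names:
            # A name differing only by case counts as the same file.
            raise FileExistsError(self._names[key])
        self._names[key] = name

    def lookup(self, name: str) -> str:
        # Returns the stored spelling, whatever case was asked for.
        return self._names[name.casefold()]

d = CasePreservingDir()
d.create("Test")
print(d.lookup("test"))   # Test
print(d.lookup("TEST"))   # Test
```

The name is always shown back in the case it was created with, and asking for any other casing still finds it.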


hvdijk 1637 days ago

For the record, this is how lowercase l and uppercase I show up here: https://prnt.sc/258zzy0


jodrellblank 1637 days ago

And this is how they show up here: https://prnt.sc/259t3qy

I don't override the font; either HN or Firefox or Windows is picking a serif font. I assume you didn't explicitly choose a font where different things look the same, but if it rendered 'a' and 'Z' as the same glyph, you wouldn't say that was a problem with the English alphabet or with case sensitivity, or anything other than bad font design, right?


hvdijk 1637 days ago

You showed a screenshot earlier, I know what they look like for you. :) I don't override the font either. HN specifies a CSS font family "Verdana, Geneva, sans-serif", where Verdana is installed by default on Windows but is not a full sans-serif font, so it's a somewhat odd choice by HN. As for the English alphabet, I didn't say there was a problem with it, just that you cannot reliably tell just from looking at letters which letters they are. In practice it is not an issue.


jodrellblank 1637 days ago

Frankly, the use case where I make a file to send to a customer "Example Ltd DNS details" and want to open it from a command line happens to me infinitely more often than the use case where I have a file called "fiIe.txt" and want to open it from a command line.
