by Xylakant 3488 days ago
> If you allow unicode characters, then you can have two character sequences that look the same but are actually distinct.

That would be a software bug. If you want to compare Unicode strings, you need to normalize them first, following the rules laid out in the standard: https://en.wikipedia.org/wiki/Unicode_equivalence
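
To make that concrete, here is a minimal sketch in Python using the standard library's unicodedata module. The helper name same_text is made up for illustration, and picking NFC over NFD is my assumption; either works as long as both sides get the same form.

```python
import unicodedata

# Two ways to encode the same visible character "ü":
composed = "\u00fc"     # single code point: LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"  # "u" followed by COMBINING DIAERESIS


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A naive comparison sees two different code point sequences.
print(composed == decomposed)           # False
# After normalization the two compare equal.
print(same_text(composed, decomposed))  # True
```

Note that this only covers canonical equivalence. It does nothing for lookalike characters from different scripts, which are a separate problem.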

Git, for example, fails that test. (Try creating a repo with a file named 'ü' and checking it out on both a Mac and a Linux system.)

There are more issues with glyphs that are distinct characters but look the same in a given font, but what's your proposal? Should everyone transliterate everything to ASCII? Display Punycode in URLs?