Hacker News
by hx87 3498 days ago
Chinese people will be fine since all Chinese URLs are either ASCII compliant or use Chinese characters, which can't be confused with any ASCII characters.

Russians would definitely be pissed though.

2 comments

As I understand it, the Unicode standard maps each Unicode symbol to an ASCII transliteration, but what about typographical similarities? Wouldn't that be a hard problem? Perhaps two Unicode characters look exactly the same (in a given typeface) but have different transliterations. Or vice versa: two totally different-looking characters share a transliteration and trigger false alarms.
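To make the look-alike problem concrete, here is a rough Python sketch. It assumes the first word of a character's Unicode name is a good-enough stand-in for its script, which is a heuristic and not how any real browser decides; `script_of` and `scripts_in` are made-up helper names.

```python
import unicodedata

def script_of(ch):
    """Rough script guess: the first word of the Unicode character name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def scripts_in(label):
    """Scripts of the letters in a label (digits and hyphens are ignored)."""
    return {script_of(ch) for ch in label if ch.isalpha()}

latin = "apple"
spoof = "\u0430pple"  # first letter is CYRILLIC SMALL LETTER A, not Latin "a"

print(latin == spoof)        # False: the strings differ by one code point
print(scripts_in(latin))     # only LATIN
print(scripts_in(spoof))     # CYRILLIC mixed with LATIN
print(spoof.encode("idna"))  # the ASCII (Punycode) form of the spoofed label
```

The two strings can render identically in many typefaces, yet their ASCII forms differ, so the hard part is spotting the resemblance, not encoding the name.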
Just handle .рф domains (and the Serbian Cyrillic ccTLD) specially.
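One possible reading of that suggestion, as a sketch only: show Cyrillic names as-is when the whole domain is Cyrillic and sits under a Cyrillic ccTLD, and fall back to the ASCII form otherwise. The allow-list, the `display_form` name, and the policy itself are assumptions for illustration, not any browser's actual rules.

```python
import unicodedata

# Assumed allow-list: the Russian and Serbian Cyrillic ccTLDs.
CYRILLIC_TLDS = {"рф", "срб"}

def is_cyrillic(ch):
    """Heuristic: the Unicode character name starts with CYRILLIC."""
    return unicodedata.name(ch, "").startswith("CYRILLIC")

def display_form(domain):
    """Return the text to show the user under this hypothetical policy."""
    letters = [ch for ch in domain if ch.isalpha()]
    if not any(is_cyrillic(ch) for ch in letters):
        return domain  # ASCII, Chinese, etc. pass through untouched
    tld = domain.lower().split(".")[-1]
    if tld in CYRILLIC_TLDS and all(is_cyrillic(ch) for ch in letters):
        return domain  # fully Cyrillic name under a Cyrillic ccTLD
    # Cyrillic mixed with another script, or under a non-Cyrillic TLD:
    # show the ASCII (Punycode) form instead.
    return domain.encode("idna").decode("ascii")

print(display_form("example.com"))
print(display_form("президент.рф"))
print(display_form("\u0430pple.com"))
```

Under this sketch a .рф site keeps its native spelling, while a Cyrillic "а" smuggled into a .com name gets shown in its `xn--` form.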