by jstimpfle 3499 days ago
There will never be agreement on what the set of distinct characters is (or on which characters should be included: a Bitcoin logo? a Facebook logo?). I see Unicode as a necessary evil. Due to its complexity, most applications should treat Unicode text as a black box.

I never rely on Unicode for computation. When receiving Unicode I always make sure it's in the ASCII range. It could be argued that there should never have been Unicode domain names but I guess Western people are very lucky that ASCII includes most of their characters...
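A minimal sketch of what "make sure it's in the ASCII range" could look like in Python. The helper names `is_ascii` and `require_ascii` are made up for illustration; this is one plausible way to do the check, not a description of any particular codebase.

```python
def is_ascii(text: str) -> bool:
    """Return True if every code point in text is in the ASCII range (0-127)."""
    return all(ord(ch) < 128 for ch in text)


def require_ascii(text: str) -> str:
    """Return text unchanged, or raise ValueError if it contains non-ASCII."""
    if not is_ascii(text):
        raise ValueError(f"non-ASCII input rejected: {text!r}")
    return text


print(is_ascii("example.com"))   # True
print(is_ascii("münchen.de"))    # False
```

Note that this rejects the input rather than silently altering it, so nothing is lost without the caller noticing.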

1 comment

> When receiving Unicode I always make sure it's in the ASCII range. [...] Western people are very lucky that ASCII includes most of their characters...

Please don't spread the myth that Western languages are encodable in ASCII, and don't pretend to support Unicode (or anything other than English) if you filter everything down to ASCII.

The _only_ Western language that is encodable in ASCII is English.

Corollary: English is the only language that can be encoded in ASCII.

The other Western languages (e.g. French, Spanish, Portuguese, German...) have endless issues with text being encoded in, or stripped down to, ASCII.
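To make the kind of damage concrete, here is a small Python sketch of a common "strip to ASCII" approach: decompose with NFKD, then drop whatever does not fit. The function name `strip_to_ascii` is hypothetical, and real systems may strip differently, but the failure modes are typical.

```python
import unicodedata


def strip_to_ascii(text: str) -> str:
    """Decompose accented letters, then silently drop every non-ASCII code point."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", errors="ignore").decode("ascii")


print(strip_to_ascii("garçon"))   # garcon  (cedilla lost)
print(strip_to_ascii("año"))      # ano     (a different Spanish word)
print(strip_to_ascii("Straße"))   # Strae   (ß has no decomposition, so it vanishes)
```

The accent cases lose information quietly, and the ß case drops a whole letter.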

As a German, I can attest that I can converse perfectly well (e.g. by email) in ASCII, although it's convenient to use umlauts, which I do. I also agree that French or Spanish might be less convenient.

But that was not my point. The point was about identifiers, such as DNS names.
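For context on DNS names specifically: internationalized domain names are carried in DNS as ASCII anyway, via a Punycode-based "xn--" form. A rough Python sketch using the standard library's built-in `idna` codec (which follows the older IDNA 2003 rules); the helper names `to_dns_ascii` and `from_dns_ascii` are invented for this example.

```python
def to_dns_ascii(hostname: str) -> str:
    """Convert a Unicode hostname to its ASCII-compatible (xn--) form."""
    # The stdlib "idna" codec applies the IDNA 2003 ToASCII step label by label.
    return hostname.encode("idna").decode("ascii")


def from_dns_ascii(hostname: str) -> str:
    """Convert an ASCII-compatible hostname back to its Unicode form."""
    return hostname.encode("ascii").decode("idna")


print(to_dns_ascii("münchen.de"))           # xn--mnchen-3ya.de
print(from_dns_ascii("xn--mnchen-3ya.de"))  # münchen.de
```

So an application that only accepts ASCII identifiers can still store and compare the encoded form, at the cost of it being unreadable to people.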