| HN Mirror


by squaresmile 2395 days ago

Unicode definitely has flaws, but that doesn't mean we should throw the baby out with the bathwater and go back to "ASCII and other character sets." There's a reason we moved on from that world. However, I bet another encoding will come along eventually (within 30 years) that solves the problems Unicode currently has and introduces a new set of problems of its own. I saw this comment [0] about how that encoding should get started:

> Greek, for example, has a lot of special-casing in Unicode. Korean is devilishly hard to render correctly the way Unicode handles it. And once you get into the right-to-left scripts, scripts that sort-of-sometimes omit vowels, or Devanagari (the script used to write a bunch of widely-spoken languages in India), you start needing very different capabilities than what's involved in Western European writing. _The better approach probably would have been to start with those, and work back to the European scripts_

[0] https://www.reddit.com/r/programming/comments/b09c0j/when_zo...

Funnily enough, URLs still can't do actual Unicode.
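To illustrate the point about URLs: in practice, non-ASCII characters in a URL path travel as percent-encoded UTF-8 bytes rather than as raw Unicode. A minimal Python sketch using the standard library's `urllib.parse`; the path is a made-up example, not one from this thread:

```python
from urllib.parse import quote, unquote

# Hypothetical path containing one non-ASCII character (U+00E9).
path = "/caf\u00e9"

# quote() percent-encodes the UTF-8 bytes of every character outside its
# safe set; "/" is kept as-is by default.
encoded = quote(path)
print(encoded)  # /caf%C3%A9

# unquote() reverses the process, decoding the bytes as UTF-8 by default.
print(unquote(encoded) == path)  # True
```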


BurningFrog 2395 days ago

Unicode URLs have serious security problems.

The canonical example is google.com vs gооgle.com.
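A rough way to see why the two names above are different strings, sketched in Python. This assumes the look-alike uses the Cyrillic letter о (U+043E) in place of the Latin o, which is the usual form of this example; `differing_positions` is a hypothetical helper written for this illustration.

```python
import unicodedata

def differing_positions(a: str, b: str) -> list[int]:
    """Return the indexes at which two equal-length strings differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

genuine = "google.com"
# Assumption: the spoofed name swaps in Cyrillic U+043E for both o's.
spoof = "g\u043e\u043egle.com"

print(genuine == spoof)  # False, even though they can render identically

# Name the code points that differ.
for i in differing_positions(genuine, spoof):
    print(i, unicodedata.name(genuine[i]), "vs", unicodedata.name(spoof[i]))
# 1 LATIN SMALL LETTER O vs CYRILLIC SMALL LETTER O
# 2 LATIN SMALL LETTER O vs CYRILLIC SMALL LETTER O
```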


riquito 2395 days ago

That was solved years ago by IDN/Punycode (implemented by any browser worth its salt).
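For reference, a small sketch of what the Punycode form looks like, using Python's built-in `idna` codec. That codec implements the older IDNA 2003 rules, and whether a browser actually displays this form depends on its own policy, so treat this as an illustration rather than a description of any browser's behavior. `to_ascii` is a hypothetical helper, and the spoofed name assumes Cyrillic о (U+043E).

```python
def to_ascii(hostname: str) -> str:
    """Convert a hostname to its ASCII form with the built-in idna codec."""
    return hostname.encode("idna").decode("ascii")

genuine = "google.com"
# Assumption: the look-alike uses Cyrillic U+043E for both o's.
spoof = "g\u043e\u043egle.com"

print(to_ascii(genuine))  # google.com (pure ASCII names pass through)
print(to_ascii(spoof))    # a label starting with "xn--", visibly different
```

Once converted, the two names no longer look alike, which is the property the Punycode display relies on.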


jagracey 2394 days ago

Commented above, but to follow up from yesterday, here is the next post.

"Hacking GitHub with Unicode" https://news.ycombinator.com/item?id=21693550


jagracey 2395 days ago

I agree. I'll be releasing an article about this tomorrow. There are, in fact, many security ramifications that have not been solved in practice.
