by estebank 2881 days ago
This is a very anglocentric view of the internet.

Most of the problems with the full unicode set can be sidestepped by a combination of UAX #31[1], NFKC[2], ignoring ligatures and digraphs[3], and following UTR #39[4].

Cyrillic apple.com is one of the few cases where it is still problematic and extra UI feedback would be needed.
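A rough sketch of why that case survives the usual checks, using only Python's standard library. The helper names (`scripts_of`, `looks_mixed_script`) are made up for illustration, and guessing a script from the first word of a character's Unicode name is a crude stand-in for the real Script property data that UTR #39 relies on, so treat it as an approximation, not an implementation of the report.

```python
import unicodedata


def scripts_of(label):
    """Approximate the set of scripts in a label.

    Assumption: the first word of a character's Unicode name
    (e.g. "LATIN", "CYRILLIC") stands in for its Script property.
    """
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def looks_mixed_script(label):
    """Flag labels that draw letters from more than one script."""
    return len(scripts_of(label)) > 1


latin = "apple"
# Five Cyrillic letters that render much like Latin "apple" in many fonts.
cyrillic = "\u0430\u0440\u0440\u04cf\u0435"
# One Cyrillic letter followed by four Latin ones.
mixed = "\u0430pple"

# NFKC folds compatibility forms, such as fullwidth letters, into plain ones...
fullwidth = "\uff41\uff50\uff50\uff4c\uff45"
folded = unicodedata.normalize("NFKC", fullwidth)

# ...but it leaves the Cyrillic lookalike untouched.
cyrillic_nfkc = unicodedata.normalize("NFKC", cyrillic)

print(folded == latin)                # fullwidth spoof collapses to "apple"
print(looks_mixed_script(mixed))      # mixed Latin/Cyrillic label is flagged
print(looks_mixed_script(cyrillic))   # all-Cyrillic label is not flagged
print(cyrillic_nfkc == latin)         # and normalization does not unify it
```

The mixed label is caught and the fullwidth one is normalized away, but the all-Cyrillic label is a single-script string, which is why it needs something beyond these checks.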

[1]: http://unicode.org/reports/tr31/

[2]: http://unicode.org/reports/tr15/

[3]: https://www.unicode.org/faq/ligature_digraph.html

[4]: http://unicode.org/reports/tr39/tr39-1.html

3 comments

> This is a very anglocentric view of the internet.

Yes, it was. The grandparent is literally saying that, as an American, punycode is primarily a risk to them, not a feature.

> Most of the problems with the full unicode set can be sidestepped by a combination of...

By a combination of 4 different, complicated things that most technical users know little about and non-technical users know nothing about? And problems still remain? That doesn't bode well.

> Yes, it was. The grandparent is literally saying that, as an American, punycode is primarily a risk to them, not a feature.

...arriving at the conclusion that six billion people[1] having a degraded (sometimes severely degraded) experience is a good trade-off. As somebody else down-thread mentioned, maybe browsers targeted at anglophones should always make Cyrillic characters obvious, but that doesn't mean this should be the default for everyone. Where I disagree with the gp is the claim that "no one wants it".

> By a combination of 4 different, complicated things that most technical users know little about and non-technical users know nothing about? And problems still remain? That doesn't bode well.

I don't see how "most technical users[...] and non-technical users" have any need to learn about those "4 different, complicated things"; only people directly working on user agents and networking need to understand those documents.

[1]: People that speak some level of English total ~1 billion https://blog.esl-languages.com/blog/learn-languages/most-spo...

I'm no anglocentrist, I read and write in Japanese and I'm acutely aware of the technical and social problems and motivations for punycode.

>Cyrillic apple.com is one of the few cases where it is still problematic

One of the many, you mean. Punycode has actually been implemented in several places and rolled back because the problems are so severe.

It is anglocentric and valid. I live in a country of over 300 million people, almost all of whom speak English. Almost half the people on the internet at present speak English, and virtually all the content I could possibly want to consume is available on a non-Unicode domain.

For me and 300 million other users, avoiding malicious spoofing attempts is important. Setting show punycode to true lets me view pages on international domains when I need to, while protecting me from being exploited.
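In Firefox the setting is the `network.IDN_show_punycode` preference in about:config. A minimal sketch of what the URL bar shows with it enabled, using Python's built-in `punycode` codec: the function name `to_display` is hypothetical, and the sketch skips the mapping and validation steps a real IDNA implementation performs before encoding.

```python
def to_display(hostname):
    """Return the hostname with every non-ASCII label in its xn-- form.

    Simplification: labels are encoded as-is, with no IDNA mapping,
    normalization, or validity checks.
    """
    shown = []
    for label in hostname.split("."):
        if label.isascii():
            shown.append(label)
        else:
            shown.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(shown)


# All-Cyrillic lookalike of "apple", written with escapes to make it visible.
spoof_host = "\u0430\u0440\u0440\u04cf\u0435.com"

print(to_display("apple.com"))   # ASCII hostnames pass through unchanged
print(to_display(spoof_host))    # the lookalike shows up as an xn-- label
```

The genuine hostname is displayed as typed, while the lookalike becomes an obviously different `xn--` string, which is the whole appeal of the setting for users who rarely visit non-ASCII domains.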

It's at present the best option for hundreds of millions of people.

It's not shortsighted or racist to acknowledge that different populations of users have different needs.

Edit: In case people can't be bothered to read, it should be obvious that I am advocating for shipping with show punycode set to true for the English (US) version of Firefox.

I think people are downvoting you because it seems like you are suggesting that, because it's OK for you, billions of internet users should have their functionality downgraded.

If you feel like punycode is a security issue then you should disable it. Perhaps browsers could do this automatically for people like you. But that's on you. Saying 'nobody wants it because I speak English' doesn't give you much of a leg to stand on.

People overwhelmingly use defaults. The default shouldn't be problematic. Since we already ship a US English version among many others, THAT version should have this feature disabled or show the actual punycode in the URL bar.

I'm honestly unsure how you can possibly make a browser that allows lookalike characters secure against phishing, but at least it's a different sort of trade-off when you are talking about populations of users that might actually encounter non-phishing sites using these domains.