by Uplink 2845 days ago
Who do I tell my brand new, bright idea about IDN attack prevention?

It goes like this:

Display characters that are out of range of your selected language's character set in a different colour than the characters of your language.

That way, when you go to раураӏ.com that last character shows up in red.

Homework: select two languages (e.g. Chinese and English), and use three colours. Make the colour scheme colour-blind-friendly.

(it just came to me, I haven't thought it through; I'd rather read different coloured characters than punycode)
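A rough sketch of the colouring idea, in case it helps the discussion. Everything here is an assumption rather than how any browser works: the script of a character is guessed from the first word of its Unicode name (Python's standard library has no real script property), the ANSI colours are placeholders that have not been checked for colour-blind safety, and `colourise` and `foreign_chars` are made-up names.

```python
import unicodedata

RESET = "\033[0m"
NEUTRAL = set("0123456789-.")  # digits, hyphen and dot are treated as script-neutral

def script_of(ch):
    """Rough script guess: the first word of the Unicode character name."""
    if ch in NEUTRAL:
        return "COMMON"
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def colourise(host, palette, other="\033[31m"):
    """Wrap each character in the colour of its script; unlisted scripts get `other`."""
    out = []
    for ch in host:
        script = script_of(ch)
        if script == "COMMON":
            out.append(ch)
        else:
            out.append(palette.get(script, other) + ch + RESET)
    return "".join(out)

def foreign_chars(host, expected):
    """Characters whose guessed script is not one of the user's expected scripts."""
    return [ch for ch in host if script_of(ch) not in expected | {"COMMON"}]

# Two languages, three colours: one per expected script, plus `other` for the rest.
PALETTE = {"LATIN": "\033[34m", "CJK": "\033[33m"}  # placeholder colours

SPOOF = "\u0440\u0430\u0443\u0440\u0430\u04cf.com"  # all-Cyrillic lookalike label
MIXED = "paypa\u04cf.com"                           # Latin letters plus one Cyrillic palochka

print(colourise(SPOOF, PALETTE))
print(foreign_chars(MIXED, {"LATIN"}))
```

With Latin as the only expected script, the mixed example flags just the final character, while the all-Cyrillic example flags every letter of the first label.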

4 comments

Colouring does not work universally, so it is generally not a good choice. Browsers have taken different approaches to this; two popular ones are:

* Identify TLD registry operators who have a sane approach that prohibits or otherwise effectively controls homographs, whitelist their TLDs, and default to showing punycode (the A-labels used by the DNS, which are always plain ASCII) for everything else. This has the effect that if your name looks "wrong", that's a problem to take up with your TLD registry. Note that com doesn't have such policies at all. It's a vast sleazy market, and it remains interesting to me that huge global brands would rather be in that market, trying to shout over the crowd, than leave it to rot.

* Identify cases like you've described with "confusing" mixtures of scripts and display those as punycode.

Both have problems. The former requires that you effectively police TLD registry operators: find out what their policies are, check that they actually implement those policies effectively, and take action if this changes. The latter requires that you figure out how all the world's language communities use different scripts, and how that interacts with Unicode, in order to avoid penalising combinations lots of people want while still detecting attacks.
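To make the two approaches concrete, here is a minimal sketch of each as a display decision. It is not what any browser actually ships: `TRUSTED_TLDS` is a hypothetical placeholder (no real registry policy is implied), the script of a character is approximated by the first word of its Unicode name, and the function names are invented for illustration.

```python
import unicodedata

NEUTRAL = set("0123456789-")       # digits and hyphen belong to no particular script
TRUSTED_TLDS = {"example"}         # hypothetical whitelist of well-policed TLDs

def scripts_in(label):
    """Set of rough script names (first word of the Unicode name) used in a label."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch not in NEUTRAL
    }

def display_by_registry(host, trusted=TRUSTED_TLDS):
    """Approach 1: show Unicode only under whitelisted TLDs, punycode otherwise."""
    tld = host.rsplit(".", 1)[-1]
    return "unicode" if tld in trusted else "punycode"

def display_by_script_mix(host):
    """Approach 2: fall back to punycode when any single label mixes scripts."""
    for label in host.split("."):
        if len(scripts_in(label)) > 1:
            return "punycode"
    return "unicode"

MIXED = "paypa\u04cf.com"                           # Latin plus one Cyrillic letter
SPOOF = "\u0440\u0430\u0443\u0440\u0430\u04cf.com"  # every letter of the label is Cyrillic

print(display_by_script_mix(MIXED))   # the label mixes LATIN and CYRILLIC
print(display_by_script_mix(SPOOF))   # each label uses one script only
print(display_by_registry(SPOOF))     # "com" is not in the placeholder whitelist
```

The hard parts described above are exactly what this sketch leaves out: keeping the whitelist honest, and knowing which script combinations are legitimate for real language communities.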

> when you go to раураӏ.com that last character shows up in red.

All of them would show up in red, because all of them are Cyrillic. If all but one character were Cyrillic, Firefox would detect this and render the URL in punycode. As implemented now, Firefox renders the URL as shown because all characters are from the same script (~character set). Chrome is more suspicious and renders it in punycode (https://xn--80aa0cbo65f.com/), though it would presumably render the confusing version if my locale was Russian.

Browsers are supposed to display characters outside of your locale as punycode.

Maybe whitelisting specific characters instead of only locale would be best.
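A small sketch of the difference between the two rules, under loud assumptions: the locale rule is approximated by a set of expected script names (guessed from Unicode character names), and `ALLOWED` is a made-up per-user character whitelist for a hypothetical German-speaking user. Neither reflects a real browser setting.

```python
import string
import unicodedata

NEUTRAL = set("0123456789-.")

def outside_locale(host, locale_scripts):
    """Locale rule: characters whose rough script is not expected for this user."""
    return [
        ch for ch in host
        if ch not in NEUTRAL
        and unicodedata.name(ch, "UNKNOWN").split()[0] not in locale_scripts
    ]

def outside_whitelist(host, allowed_chars):
    """Whitelist rule: any character not explicitly allowed, whatever its script."""
    return [ch for ch in host if ch not in allowed_chars]

# Hypothetical whitelist: ASCII letters, digits, hyphen, dot, and German extras.
ALLOWED = set(string.ascii_lowercase + string.digits + "-." + "\u00e4\u00f6\u00fc\u00df")

SPOOF = "\u0440\u0430\u0443\u0440\u0430\u04cf.com"  # all-Cyrillic lookalike label
MIXED = "paypa\u04cf.com"                           # Latin plus one Cyrillic letter

print(outside_whitelist("m\u00fcller.com", ALLOWED))    # nothing to flag
print(outside_whitelist(MIXED, ALLOWED))                # only the Cyrillic letter
print(outside_locale(SPOOF, {"LATIN", "CYRILLIC"}))     # a bilingual user sees no warning
```

The last line shows the weakness of a locale-only rule: a user whose locale includes Cyrillic gets no warning for the all-Cyrillic lookalike, whereas a character whitelist can be made as narrow as the user wants.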

Your color idea is neat, but look at it from the perspective of people who actually use IDN domains and don't speak English. What if раураӏ.com was раураӏ.fans-ùøê.com, where ùøê was part of the same charset as ӏ and the user speaks the languages of both locales? Even if you color it, it would be difficult to train users to pick up on that.

Punycode renders as ASCII in Chrome (not sure about Firefox), so people will notice the malformed URL and realise that something is wrong.
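For anyone who wants to see the ASCII form of a name, here is a minimal sketch using Python's built-in `punycode` codec. It is a simplification: it only adds the `xn--` prefix per label and skips the normalisation and validity checks that full IDNA processing performs, and the function names are invented for illustration.

```python
def to_a_label(label):
    """Encode one label as an A-label; pure-ASCII labels are returned unchanged."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")

def to_ascii_host(host):
    """Encode every dot-separated label of a hostname."""
    return ".".join(to_a_label(label) for label in host.split("."))

def to_unicode_label(a_label):
    """Decode an A-label back to Unicode; other labels are returned unchanged."""
    if a_label.startswith("xn--"):
        return a_label[4:].encode("ascii").decode("punycode")
    return a_label

SPOOF = "\u0440\u0430\u0443\u0440\u0430\u04cf.com"  # all-Cyrillic lookalike label

print(to_ascii_host("b\u00fccher.example"))  # xn--bcher-kva.example
print(to_ascii_host(SPOOF))                  # an ASCII-only name starting with xn--
```

Whatever the exact output, the encoded name is plain ASCII and visibly unlike the brand it imitates, which is the property this comment relies on.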