Hacker News
by donquichotte 3499 days ago
Some time ago I registered http://www.goolge.io/. Still haven't done anything with it; I guess at some point I'll just redirect it to DuckDuckGo. [EDIT: it now redirects to DuckDuckGo.]

This can of course be used in a malicious way. I thought about rebuilding the homepage of the bank Credit Suisse on www.credit-siusse.ch, but that's probably illegal.

3 comments

Most browsers should forcibly transcribe this to Punycode[1]:

    https://www.𝙿𝙰𝚈𝙿𝙰𝙻.com/

And yet when I paste this into the latest Firefox it redirects to https://www.paypal.com/

No 301 redirects or anything; the browser just treats it like ASCII, which it clearly is not. It actually happens to be Fullwidth:

https://en.wikipedia.org/wiki/Fullwidth_form

Serious phishing opportunity if you ask me!

[1] https://en.wikipedia.org/wiki/Punycode

Nope. The browser is behaving sensibly, since you can't register that domain. It's applying the same rules that the registrars do.

ICANN require that registries follow RFC 3491 (Nameprep) and related RFCs before allowing a name to be registered: https://www.icann.org/resources/unthemed-pages/idn-guideline... . What Nameprep does is (among other things) NFKC normalization and case-folding:

    irb(main):016:0> "\ufeff\uff30\uff21\uff39\uff30\uff21\uff2c"
    => "ＰＡＹＰＡＬ"
    irb(main):017:0> "\ufeff\uff30\uff21\uff39\uff30\uff21\uff2c".unicode_normalize(:nfkc).downcase
    => "paypal"
Interesting. So, out of interest, why is the same not being applied for ɢ? (When I ran it through Python's unidecode I got the roman symbol all the same).
Because 'small capital g' doesn't have a compatibility decomposition to G, but wide letter P does have a compatibility decomposition to 'normal' P. Unicode normalization kills large classes of homograph attacks but by no means all. Conventions against mixing scripts from different languages stop some more, but there's no single answer.
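
If you want to check this yourself, Python's `unicodedata` exposes the decomposition field from the Unicode Character Database. A small sketch (the helper name `compat_info` is just for illustration):

```python
import unicodedata

def compat_info(ch: str) -> tuple:
    """Return (decomposition field, NFKC result) for a single character."""
    return unicodedata.decomposition(ch), unicodedata.normalize("NFKC", ch)

# U+FF30 fullwidth P, U+1D67F mathematical monospace P, U+0262 small capital G
for ch in ("\uff30", "\U0001d67f", "\u0262"):
    decomp, nfkc = compat_info(ch)
    print(f"U+{ord(ch):04X}: decomposition={decomp or '(none)'} NFKC={nfkc!r}")
```

The first two carry a `<wide>` and a `<font>` compatibility mapping to U+0050, so NFKC turns them into a plain P. U+0262 has an empty decomposition field, so normalization leaves it alone.
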
Doing the "ɢ" conversion here[0], I get

  xn--1na

[0] https://www.punycoder.com/
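
The same conversion can be reproduced offline with Python's built-in `punycode` codec. This is the raw Punycode step only, with the `xn--` prefix added by hand; it does not run Nameprep first:

```python
label = "\u0262"  # LATIN LETTER SMALL CAPITAL G

# Raw Punycode encoding of the label, then the ACE prefix used in DNS.
ace = "xn--" + label.encode("punycode").decode("ascii")
print(ace)  # xn--1na

# Decoding goes the other way and recovers the original character.
decoded = ace[len("xn--"):].encode("ascii").decode("punycode")
print(decoded == label)  # True
```
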
The problem is that the RFCs aren't restrictive enough, partly because the IETF doesn't have much authority over registrars. The domain name rules really ought to be something like "one script, plus numbers, in a domain name part". But this runs into such things as the tendency in Japan to mix kanji with English words. Then there's the whole right-to-left mark business, which has to coexist with left-to-right TLDs.
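
A very crude sketch of what a "one script, plus numbers" check could look like. Python's standard library has no script property, so this guesses the script from the first word of each letter's Unicode name, which is an assumption and not how a real registry would do it. The function names are made up:

```python
import unicodedata

def scripts_in(label: str) -> set:
    """Guess the scripts used by the letters of a label (digits and hyphens ignored)."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }

def is_single_script(label: str) -> bool:
    """True if all letters in the label appear to come from one script."""
    return len(scripts_in(label)) <= 1

print(is_single_script("paypal-123"))        # all Latin plus digits
print(is_single_script("p\u0430ypal"))       # Latin with a Cyrillic "a" (U+0430)
print(is_single_script("\u65e5\u672cshop"))  # kanji mixed with English
```

The last case is exactly the Japanese problem: the check rejects it even though such a name may be perfectly legitimate. It would also reject kanji mixed with hiragana, so a real rule needs script sets rather than a single script.
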
So if I mix ASCII with obscure UTF8 characters like the domain in OP's post I can register it then?

Something like www.paypal.com --> www.n--pal-n76secrc.com

No. When you apply NFKC normalization to that string, you get just 'paypal', so PayPal have already registered the result. You can try that here: http://mct.verisign-grs.com/. Notice how the output is not the same as some online converters based on punycode.js, because that library doesn't have Nameprep support: https://github.com/bestiejs/punycode.js/issues/40
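
You can see the difference between "Nameprep, then Punycode" and "Punycode only" with Python's standard codecs. The built-in `idna` codec implements the older IDNA 2003 rules including Nameprep, while the `punycode` codec is the bare encoding. The host below uses fullwidth letters as a stand-in, since I can't tell exactly which characters were in the question:

```python
# "www." + fullwidth PAYPAL + ".com"
host = "www.\uff30\uff21\uff39\uff30\uff21\uff2c.com"

# IDNA 2003 ToASCII: Nameprep (mapping, NFKC, case folding) runs first,
# so the label collapses to plain ASCII and no xn-- form is needed.
prepared = host.encode("idna")
print(prepared)  # b'www.paypal.com'

# Bare Punycode on the same label skips Nameprep and yields an
# unrelated-looking string that no registry would accept as-is.
raw = "\uff30\uff21\uff39\uff30\uff21\uff2c".encode("punycode")
print(raw)
```
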
Those characters are not fullwidth.

This:

www.ｐａｙｐａｌ.com

or this:

www.ＰＡＹＰＡＬ.com

would be fullwidth.

What you actually posted are characters in the Mathematical Alphanumeric Symbols block. Specifically:

𝙿 — U+1D67F MATHEMATICAL MONOSPACE CAPITAL P

𝙰 — U+1D670 MATHEMATICAL MONOSPACE CAPITAL A

𝚈 — U+1D688 MATHEMATICAL MONOSPACE CAPITAL Y

𝙻 — U+1D67B MATHEMATICAL MONOSPACE CAPITAL L
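
An easy way to check which block a pasted character comes from is to print its code point and Unicode name. A small sketch with Python's `unicodedata` (the helper name `describe` is just for illustration):

```python
import unicodedata

def describe(text: str) -> list:
    """Return one 'U+XXXX NAME' string per character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]

# The six letters from the parent comment, written as escapes.
pasted = "\U0001d67f\U0001d670\U0001d688\U0001d67f\U0001d670\U0001d67b"
for line in describe(pasted):
    print(line)

# An actual fullwidth letter for comparison.
print(describe("\uff30")[0])  # U+FF30 FULLWIDTH LATIN CAPITAL LETTER P
```
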

How is that a phishing opportunity if it redirects you to the real website?
That looks a lot like the sort of trademark use that authorities have deemed infringing. I'd expect the registrar to "recover" that unless you've got a clear explanation (like Goolge being your name, and even then ...; remember Nissan, Mike Rowe, etc.).
I registered http://www.4ppl3.com a while back. No potential for abuse really, but I just thought it was fun to have a l33t-speak version of the domain name of one of the world's most litigious companies.

That said, I haven't done anything with it, and I'm not a domain squatter, so if anyone wants it I can hook you up!

l33t-speak is so far off my radar that I was wondering if there was some hot startup called '4 people 3' or somesuch. I doubt anyone at Apple remotely cares about l33t-speak from a branding perspective.
To be fair, I don't think I know anyone who "cares" about l33t speak.

H4xx0r j0k3s is all it is.

I've never seen anyone go out of their way to defend or use leet speak all day long.

In some countries, it's the best that can be done for vanity car registration plates.

e.g. a site selling them in the UK is promoting "JO66 ERX", which is probably supposed to be read as "Jogger X". Current bid £750, for some reason.