by jesprenj 1053 days ago
Yeah. Though they have a potential bug with domains that end in a dot.

For example, my website translated with their service:

https://splet-4a-si.translate.goog/?_x_tr_sch=http&_x_tr_sl=...

Clicking on the link "version control" leads to the page

https://ni-xn----ijanec--9jb-eu-.translate.goog/?_x_tr_sch=h...

(note the second label ending with -), and according to Google this page can't be reached. On the real site (http://splet.4a.si), though, the link is valid: https://ni.xn--ijanec-9jb.eu/
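A rough sketch of the mapping, guessed purely from the two URLs above and not from anything Google documents: every "-" is doubled and every "." becomes "-". The function name encode_host is made up for illustration, and the trailing dot on the second input is an assumption about what the link on my page looked like.

```python
def encode_host(host: str) -> str:
    """Flatten a hostname into a single label.

    Inferred rule (an assumption, not a documented algorithm):
    '-' is escaped as '--', then '.' is replaced with '-'.
    """
    return host.replace("-", "--").replace(".", "-")


print(encode_host("splet.4a.si"))            # splet-4a-si
print(encode_host("ni.xn--ijanec-9jb.eu."))  # ni-xn----ijanec--9jb-eu-
```

With that rule a trailing dot turns into a trailing hyphen, which matches the broken hostname above.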

Another pretty bad aspect of their domain handling (replacing dots with -) is that they can't support really long domains: the whole hostname has to fit into a single label, and a DNS label is limited to 63 octets. Also, this breaks punycode parsers, so instead of "šijanec" as a label you get some garbled, unreadable punycode.

Label length problem:

Website http://splet.sijanec.sijanec.sijanec.sijanec.sijanec.sijanec... cannot be unambiguously page-translated with their service: they remove the first s in the hostname, yielding:

https://plet-sijanec-sijanec-sijanec-sijanec-sijanec-sijanec...

EDIT: they seem to add a GET parameter _x_tr_hp=s, so this is probably how they work around the limitation. Nice!
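To make the length problem concrete, here is a small check, assuming the same guessed dot-to-hyphen mapping as in the URLs above. The 63-octet limit is the DNS label limit. The long hostname is a made-up stand-in, not my real test domain, and the _x_tr_hp workaround is not modeled because I only know what the URL shows.

```python
MAX_LABEL_OCTETS = 63  # maximum length of a single DNS label


def encode_host(host: str) -> str:
    # Guessed mapping (assumption): '-' -> '--', then '.' -> '-'.
    return host.replace("-", "--").replace(".", "-")


def fits_in_one_label(host: str) -> bool:
    """True if the flattened hostname still fits in one DNS label.

    Expects the host in its ASCII (punycode) form.
    """
    return len(encode_host(host).encode("ascii")) <= MAX_LABEL_OCTETS


# Hypothetical long hostname, built only for this example.
long_host = ".".join(["splet"] + ["sijanec"] * 10 + ["example"])

print(fits_in_one_label("splet.4a.si"))  # True
print(len(encode_host(long_host)))       # 93
print(fits_in_one_label(long_host))      # False
```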

They could've used www.website.example.translate.goog, but that would cause issues with TLS, since a wildcard subject alternative name only matches one label. But by getting a CA-signed cert limited to translate.goog (which is possible), they could just issue certs on the fly.
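The one-label rule for wildcards can be sketched like this. It's a simplified matcher that only handles a whole-label "*" in the leftmost position. Real TLS libraries do more checks, and wildcard_matches is a name I made up for the example.

```python
def wildcard_matches(pattern: str, host: str) -> bool:
    """Simplified wildcard SAN check: '*' stands for exactly one label."""
    p = pattern.lower().rstrip(".").split(".")
    h = host.lower().rstrip(".").split(".")
    if len(p) != len(h):
        # A wildcard never spans several labels, so the counts must agree.
        return False
    if p[0] == "*":
        return bool(h[0]) and p[1:] == h[1:]
    return p == h


print(wildcard_matches("*.translate.goog", "splet-4a-si.translate.goog"))          # True
print(wildcard_matches("*.translate.goog", "www.website.example.translate.goog"))  # False
```

That's why flattening everything into one label lets a single *.translate.goog certificate cover every translated site.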

1 comments

The AMP version gets around the length limit by hashing the hostname if it's too long:

https://amp.dev/documentation/guides-and-tutorials/learn/amp...

Interesting, but that means they have to store the hostname somewhere.
It's repeated in the URL path, not just the hostname.
Ah, I see!
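For illustration, a hash fallback along those lines might look like the sketch below. The choice of SHA-256 and base32 is an assumption made for this example, so check the linked AMP documentation for the actual algorithm. The function names and the long hostname are hypothetical too.

```python
import base64
import hashlib

MAX_LABEL_OCTETS = 63  # maximum length of a single DNS label


def hashed_label(host: str) -> str:
    # Assumed scheme: SHA-256 digest, base32-encoded, padding stripped, lowercased.
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def label_for(host: str) -> str:
    """Use the readable flattened form if it fits, otherwise fall back to the hash."""
    flat = host.replace("-", "--").replace(".", "-")
    return flat if len(flat) <= MAX_LABEL_OCTETS else hashed_label(host)


long_host = ".".join(["splet"] + ["sijanec"] * 10 + ["example"])  # hypothetical

print(label_for("splet.4a.si"))    # splet-4a-si
print(len(label_for(long_host)))   # 52
```

A hash like this has a fixed length but can't be reversed, which is why the original hostname has to be carried somewhere else, such as the URL path.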