by maaaats 2407 days ago
Ouch. I have a ü in my last name from family in a different country way back somewhere. It doesn't crash systems as much as an apostrophe would, but it's very good at exposing encoding issues between systems.

It's not as big an issue as it used to be, at least. In the past I've had online transactions fail because of a mismatch between my name (with ü) and the name on the card (with u). The systems seem more forgiving now, perhaps because they now handle that case. I also remember being a bit scared traveling to Japan many years ago, as we were told it was SOoo important that the names and everything matched to gain entry. And then the name on my ticket was completely mangled. But no one cared.

Here's a SO post about someone with the last name Null: https://stackoverflow.com/q/4456438/923847

The Japanese are quite used to mojibake [1], so they would've understood immediately that the mismatch between your ticket and passport was caused by encoding issues.

[1] https://en.wikipedia.org/wiki/Mojibake
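For anyone who hasn't seen how this kind of mangling arises, here is a minimal Python sketch. It assumes the sender wrote UTF-8 and the receiver decoded it as Latin-1, which is only one of many possible mismatches, and the name is made up for illustration.

```python
# Hypothetical name; assumes a UTF-8 writer and a Latin-1 reader.
name = "Müller"

# The sender encodes as UTF-8, the receiver wrongly decodes as Latin-1.
mangled = name.encode("utf-8").decode("latin-1")
print(mangled)  # MÃ¼ller

# The damage can be undone only if no bytes were dropped or replaced on the way.
repaired = mangled.encode("latin-1").decode("utf-8")
print(repaired == name)  # True
```

With CJK text the same mistake turns every character into several unrelated ones, which is probably why it is so easy to spot.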

Interestingly, I've had problems in Korea (Gimpo Airport) because my name contains an "ö", and the canonical spelling in the passport for this is "oe". This caused much confusion among the airport staff.

I would have thought that people from CJK countries were more understanding of encoding-to-Latin weirdness than most, but apparently not.

I think their understanding would be focused on the encoding for their language and a relatively narrow set of problems. I've encountered name issues in CJK countries that keep names in native encoding due to an assumption that full names fit within a couple of characters with no need for any spaces or punctuation. Some systems might be designed to be "accommodating" and take even up to 8 or 10 characters! There was one train system where my name had at least four different iterations through the tickets I collected, with different ordering of first and last names and truncation.

In defense of the Korean airport staff, they might have been more accommodating if the "ö" were completely and obviously broken, like "£‡�". Spelling it as "oe" makes it look like there are no encoding issues, in which case strict checking makes more sense.

It's much easier to identify mojibake (it tends to be extremely obvious in CJK encodings) than to remember canonical spellings and other variations in a whole bunch of different languages. Airport staff probably know that "oe" and "œ" are interchangeable, but that's about it.

Diacritics are usually stripped in air travel. In Hungarian we have many letters with diacritics, but it is never a problem that the passport has them and the system doesn't.

> Diacritics are usually stripped

Not in all cases. In Germany and Finland (maybe all EU passports???) ä is spelled ae, ö is spelled oe in the machine readable part (umlauts shown in the "human-readable" part). This is important to know when you need a visa.

For Germans this is not a big problem, because this has always been the fallback spelling when the umlaut is not available for technical reasons. For Finns this is a problem, because this "transcription" is completely unknown in Finnish. For a couple of weeks now it has been possible to get an electronic visa for Russia on the internet. Reportedly many Finns with an ä in their name (that's not uncommon) dropped the dots when applying for their visa, because an ä is not accepted. At the border they were not allowed to enter, because the machine-readable part of the passport has ae instead.
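The mismatch described above can be sketched in a few lines of Python. This is only an illustration: the table covers just the ä → ae and ö → oe substitutions mentioned here, and the function names and the sample name are made up.

```python
import unicodedata

# Only the substitutions mentioned above; real passport rules cover more letters.
MRZ_SUBSTITUTIONS = {"ä": "ae", "ö": "oe"}

def strip_diacritics(name: str) -> str:
    """Drop combining marks, which is what 'dropping the dots' amounts to."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def mrz_style(name: str) -> str:
    """Apply the ae/oe substitutions, then strip whatever diacritics remain."""
    composed = unicodedata.normalize("NFC", name).lower()
    substituted = "".join(MRZ_SUBSTITUTIONS.get(ch, ch) for ch in composed)
    return strip_diacritics(substituted).upper()

name = "Hämäläinen"  # hypothetical example
on_visa = strip_diacritics(name).upper()
in_passport = mrz_style(name)
print(on_visa)                 # HAMALAINEN
print(in_passport)             # HAEMAELAEINEN
print(on_visa == in_passport)  # False
```

Both spellings look perfectly clean on their own, so nothing hints at an encoding problem; an exact string comparison simply fails.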

Good point, I don't know any Hungarians with ü or ö in their name, just á and é.

I do wonder what happens to ű and ő though.

There is an ICAO recommendation. However, it is not unambiguous and of course it's not legally binding. So in the end every country decides what it does. (Possibly there are more multinational agreements, e.g. inside the EU, but I doubt there is anything truly worldwide.)

https://www.icao.int/publications/Documents/9303_p3_cons_en....

Ü is written as UE, UXX or U

Ű is written as U

According to https://en.wikipedia.org/wiki/Machine-readable_passport#Name... Hungary uses UE for Ü, but there is no reference given. According to the same article, Russia even uses two different transliteration systems depending on the type of document.
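Since one letter can legitimately come out several ways, a matcher would have to accept every variant. Here is a rough Python sketch of that idea. It encodes only the two rules quoted above (Ü as UE, UXX or U; Ű as U), not the full ICAO table, and `accepted_spellings` is a made-up helper name.

```python
from itertools import product

# Only the two entries quoted above; everything else passes through unchanged.
VARIANTS = {"Ü": ("UE", "UXX", "U"), "Ű": ("U",)}

def accepted_spellings(name: str) -> set[str]:
    """Return every machine-readable spelling the quoted rules would allow."""
    options = [VARIANTS.get(ch, (ch,)) for ch in name.upper()]
    return {"".join(combo) for combo in product(*options)}

print(sorted(accepted_spellings("Müller")))  # ['MUELLER', 'MULLER', 'MUXXLLER']
print(sorted(accepted_spellings("Szűcs")))   # ['SZUCS']
```

The number of candidates multiplies with each ambiguous letter, so this only stays manageable for names with a few of them.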

For German names, this is a problem. I have an ü in my name and this is transcribed as a "ue" in my passport. Transcribing it as u would produce a different name (which AFAIK actually exists).

In Hungarian the diacritics are also important, for example Szilasi and Szilási are different and are pronounced differently. Still, it won't be an issue when flying or in similar situations.

German is more complicated though with all the substitution rules.

Not to mention Germans who actually have an ue in their name, still pronounced as ü, but written as ue only, never as ü. Or someone may be called Gross, but it would be incorrect to write it as Groß, while someone else's name may be Groß with the acceptable alternative spelling Gross when ß is unavailable.
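To make that last point concrete, here is a small Python sketch of why the fallback spelling cannot be reversed automatically. The two-entry table and the sample names are assumptions for illustration, not an official rule set.

```python
# Hypothetical fallback table for when ü and ß are unavailable.
FALLBACK = {"ü": "ue", "ß": "ss"}

def fallback_spelling(name: str) -> str:
    """Replace ü and ß with their usual ASCII fallbacks."""
    return "".join(FALLBACK.get(ch, ch) for ch in name)

# Two different legal names collapse into the same fallback spelling,
# so nothing in "Gross" or "Mueller" says which original was meant.
print(fallback_spelling("Groß") == fallback_spelling("Gross"))      # True
print(fallback_spelling("Müller") == fallback_spelling("Mueller"))  # True
```

Any system that tries to "restore" the ü or ß from the fallback form will get some people's names wrong.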