Hacker News
by travisjungroth 1744 days ago
I think the issue is many translation databases just hold the English text and then all the translations. So the entry is “Open ticket” and then you just drop in the translation anywhere that phrase shows up. But sometimes “open” is a verb, sometimes a noun.

The actual identifier should be something like “Open a ticket (imperative, button)” and then that phrase has translations, including the English “Open ticket”.

12 comments

That's actually a nice idea in forcing even the default language to be handled by the same workflows, processes, and tools as the other languages. I've found that in a lot of those cases there simply is lots of context missing from the strings that should be translated. If all you get is the English text without any indication of how and where it's used in the UI, you're bound to make such mistakes in the translation.

For example, it took me a while to figure out why Word 2007 in its German version used the word »Gliederung« for the stroke of a shape. But translating »outline« in a word processor to mean »document outline« instead of »shape outline« is actually quite understandable.

Back then I tried thinking about automatic or semi-automatic solutions to get a bit more context for the translator. The trouble is that most UI toolkits make it very hard to impossible to solve this, unless the developer actually knows enough about the problem to always include context and a description. Qt has (had? That was pre-QML, I think) a nice mode in its translator UI where the XML UI description could be used to show the string in its UI context. Windows Forms had a way of changing the form's language and simply replacing all strings directly in the designer (which has the problem that the translator might accidentally destroy all layout). Most things that are used just from source code have no visual way of relating strings to UI at all.

In most places I've worked that used translation systems, all languages were translations, including the default one. The code used message keys like "thing.title", "thing.add_action", "thing.on_save_error", and so on.

I really like this approach because it makes the code, and especially templates, much more readable. You usually don't care about the verbose form of the text that should be displayed, and those types of keys give you just enough information to understand what it is.

Problem is, it makes it harder to outsource the translations, and well, as it is known, naming things is hard.

Oh, cool. You just reminded me of a feature I had built into my web app many years ago when we implemented translations. We accepted internal commands in our search box, and one of the commands told the app to display the language text identifiers alongside the language text. It was a great mode for developers, QA, and translators. Developers and QA could easily locate text that needed to be put into the language system, and translators could work page by page to find the identifiers they needed to translate.
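A rough sketch of what such a "show identifiers" mode could look like. The `STRINGS` table, the `render` function, and the `show_keys` flag are all made up for illustration and are not taken from the app described above. A missing translation falls back to the raw key, which also makes untranslated captions easy to spot.

```python
# Hypothetical string table: language -> {identifier -> text}.
STRINGS = {
    "en": {"ticket.open.button": "Open ticket"},
}

def render(key, lang, show_keys=False):
    """Return the text for `key`, optionally followed by its identifier."""
    # Fall back to the identifier itself when no translation exists.
    text = STRINGS.get(lang, {}).get(key, key)
    return f"{text} [{key}]" if show_keys else text

print(render("ticket.open.button", "en"))
print(render("ticket.open.button", "en", show_keys=True))
```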
Yep, there's a ton of software that wants to use the English text as the translation key, which leads to all sorts of bad results.

Take "Open Ticket": it can be a verb phrase for creating a bug report or for opening an existing one, or "open" can be an adjective indicating that a bug report is still active. It gets worse when "ticket" means admission to transportation or an entertainment event. If your key is the English text, you can't translate those usages differently, which is not good.
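To make the difference concrete, here is a minimal sketch of a catalog keyed by context-specific identifiers, with English stored as just another translation. The key names, the `CATALOG` layout, and the German strings are illustrative assumptions, not any particular tool's format.

```python
# Hypothetical catalog: each identifier names one usage, not one English phrase.
CATALOG = {
    "ticket.open.button": {      # imperative: open a ticket
        "en": "Open ticket",
        "de": "Ticket öffnen",
    },
    "ticket.open.status": {      # adjective: the ticket is still open
        "en": "Open ticket",
        "de": "Offenes Ticket",
    },
}

def translate(key, lang, fallback="en"):
    """Look up `key` in `lang`, falling back to the default language."""
    entry = CATALOG[key]
    return entry.get(lang, entry[fallback])

# Same English text, two different German translations.
print(translate("ticket.open.button", "de"))
print(translate("ticket.open.status", "de"))
```

With the English text as the key, both entries would collapse into one and only one of the German strings could survive.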

But also, minor edits to the English text are hard to manage for the translations. Some systems have a way to suggest an existing translation, but it requires a translator to affirmatively select it. If the key doesn't change, you can keep using the existing translations until the translators review the English change and decide whether to make a similar change.

Of course, the worst thing that people try to do is numbers. There are tools for that, but trying to do Open Ticket vs Open Tickets as singular vs plural falls over with languages that have distinct forms for one, two, three, and more, or even more forms than that.
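As an illustration of why a singular/plural pair is not enough, here is a small sketch using Polish, which needs three forms. The selection rule mirrors the commonly published gettext plural expression for Polish. The function names and the `FORMS_PL` list are hypothetical, and real projects would normally let gettext or a CLDR-based library pick the form.

```python
# Three Polish forms of "ticket" (bilet), indexed by plural category.
FORMS_PL = ["{n} bilet", "{n} bilety", "{n} biletów"]

def polish_plural_index(n):
    """Return 0, 1, or 2 following the usual gettext rule for Polish."""
    if n == 1:
        return 0
    # 2-4, 22-24, 32-34, ... but not 12-14
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2

def tickets_label(n):
    return FORMS_PL[polish_plural_index(n)].format(n=n)

for n in (1, 2, 5, 12, 22):
    print(tickets_label(n))
```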

And then you get people trying to do string math. "Delete this ticket" and "Delete this image" need to be translated as whole units. You can't just append the type name to "delete this": gendered verbs and objects, and sometimes even more complex grammar, make that break.
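A short sketch of the failure mode, using Spanish, where the demonstrative has to agree with the noun's gender ("este ticket" but "esta imagen"). The dictionaries and function names are invented for the example.

```python
# Hypothetical Spanish strings.
NOUNS_ES = {"ticket": "ticket", "image": "imagen"}
MESSAGES_ES = {
    "ticket.delete": "Eliminar este ticket",
    "image.delete": "Eliminar esta imagen",
}

def delete_label_concat(noun_key):
    """String math: a translated prefix glued onto a translated noun."""
    return "Eliminar este " + NOUNS_ES[noun_key]

def delete_label_whole(noun_key):
    """Whole-unit lookup: one message per complete phrase."""
    return MESSAGES_ES[noun_key + ".delete"]

print(delete_label_concat("image"))  # ungrammatical: wrong gender agreement
print(delete_label_whole("image"))
```

Concatenation happens to work for "ticket" and silently breaks for "image", which is why each full phrase needs its own entry.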

I was using Qt recently and saw that example code did that English keying. It's good that they're promoting creating translatable UIs, but I don't know if it's the right thing to do if they're encouraging people to do it by using English text as the key.
I think that approach is quick and easy for an English-only developer to understand and do, but it's hard to get quality results. A synthetic key tied to the context works better, because the same English text can then be translated differently as appropriate.

Tools that show translators the application context are really helpful, too. Bulk translation in a spreadsheet is an OK place to start when there are a lot of new translations to do, but everything needs to be checked where it's used as well. Especially for languages that tend to result in layout issues when added to formerly English-only apps, like German (lots of very long compound words) and RTL languages like Arabic.

>The actual identifier should be something like “Open a ticket (imperative, button)”

Or even better: "ui.ticket.actions.open" — trying to shoehorn linguistic categories into translation files is a painful experience, but dumb specific IDs work great and make untranslated captions apparent.

Or even Create Ticket. From the description, it seems a ticket is being created not opened, despite the slang that people incorrectly use for buttons.
If a ticket is closed upon completion, it stands to reason that creating a ticket is "opening" it.
> If a ticket is closed upon completion, it stands to reason that creating a ticket is "opening" it.

That's faulty reasoning.

At the inception of a ticket, it is first created and then opened. It is common to have these programmed to work as a single button push, but they are two actions, and creation always happens first, even when the ticket is not opened. Later, when the ticket is completed, it gets closed.

When you go into your house, an opening must first be created, either a doorway or some other hole in the wall like a window. Then, after the hole is created, you can enter the house.

Not faulty reasoning, good UX.

What you’re describing is putting a priority on technical correctness instead of how the user will experience it.

Open and Close are natural opposites, and intuitive UX. Does the user care that technically the ticket needed to be created before it could be opened?

Or you have a list of tickets that you have purchased in the app, and may need to select one from the list to show to the controller.
Or maybe the translator should get annotated screenshots of the app. Laid out in a story board.

If you just hand someone a list of strings to translate, there's no way you'll get sensible results.

Also, you should test your translations. Some of the translations that I've seen (even from reputable companies) are so bad that it's pretty obvious no native speaker has ever looked them over.

The tension is that people want to reuse translations. So maybe you have the story board for the first version. Then, someone makes a new button. In Django, they'd put _("Open ticket") as the text, see there's already a translation, and think they're good to go. Sure, having every page looked at by a translator for every language every time you make a change would be ideal, but also a bit costly and slow. I think there are better options in the middle.
I don't think there are any "middle options" that result in a good product. You want a localized app, but you don't want to put the work in.

If you add some new text somewhere in the UI, you need to start the app and make sure it looks right. If you only do that for one language, and don't check other languages, then there's going to be one language that's broken.

So your app is going to look broken in one language. And you probably will never find out, because the people who run into the bug don't speak your language.

> You want X, but you don't want to put the work in.

Yes! That is exactly what people want.

> Also, you should test your translations. Some of the translations that I've seen (even from reputable companies) are so bad that it's pretty obvious no native speaker has ever looked them over.

Testing is a must. Before I fixed + linted it, we often had community-provided translations which would cause the app to crash due to missing/additional format strings.
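A minimal sketch of that kind of lint, assuming the strings use Python `str.format`-style named placeholders such as `{name}`. The helper names are hypothetical, and a real project would need a variant for whatever placeholder syntax it actually uses (printf-style, ICU, and so on).

```python
from string import Formatter

def placeholders(template):
    """Collect the field names used in a str.format-style template."""
    return {
        name
        for _, name, _, _ in Formatter().parse(template)
        if name is not None
    }

def check_translation(source, translated):
    """Return (missing, extra) placeholder names relative to the source."""
    src, dst = placeholders(source), placeholders(translated)
    return src - dst, dst - src

print(check_translation("Hello {name}", "Hallo {name}"))
print(check_translation("{count} tickets", "Tickets"))
```

Running a check like this over every submitted translation catches the mismatches before they can crash the app at format time.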

Yup. The “key” for the translation is the English phrase itself. It also makes English text changes weird because you either have to change the key across all languages to match the new English or you leave the key alone and change/add an English “translation” that is the new text.

Personally I think it is better to use a “surrogate key” that isn’t the English text itself.

I think a bit more context would be even better.

The usage of "discover" and "find out" in English and Portuguese comes to mind.

Words from the "discover" family in English are generally used when talking about discovering something that nobody or few people knew (somebody discovers a cure for some disease), while "find out" is generally used at a more personal level (somebody finds out that someone else bumped his car).

In Portuguese you can only "find" (encontrar) physical things. You can't "find out" information.

This makes me think that in some instances some sort of descriptive context on the meaning might be necessary.

Obj-C and Swift allow comments to go along with translation strings, so you would add

    let buttonStr = NSLocalizedString("Open ticket", comment: "For user to open a ticket")
And the comment would make its way into the eventual XLIFF file sent to translators.
This looks like the right approach: giving context to the translation strings.

Many translation tools give the locations in the code where a string is used. It's a first step, though translators are not always able to read code.

In my job I sometimes deal with rollouts of centrally maintained software to subsidiaries in other countries. Translation is often done by simply sending an Excel file with all string identifiers and their English values to a key user in that country, and maybe they translate it themselves, maybe they give it to some agency. So we can be 100% sure that there will be several additional rounds for them to figure out what each string is really supposed to mean versus what they initially thought it meant.

Yes, we improved on this somewhat. In the last rounds they got access to the software in English beforehand, and there are now also access keys to press to see the string id for any label in the app. It is still a very time-consuming process, and I love it when a rollout is done to a country where we can just say: consumer-facing texts get translated, and our employees all speak English well enough to use this as is.

> The actual identifier should be something like “Open a ticket (imperative, button)”

The identifier should be a GUID with an option for tagging and comments for developers and translators. Other languages depend on other contexts which are not represented here, like multiple forms of "present" tense, or the current time of day. Keying translations on English-language concepts is a bad idea, as a lot of languages are unlike English. Treating English as one of the translations (and not the reference) is a good idea that will prevent problems with future translations and avoid re-architecting your whole localization pipeline (and code base).

That sounds like a nice setup.
This is what the gettext contexts and the pgettext macros are for: https://www.gnu.org/software/gettext/manual/html_node/Contex...
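For a feel of how context lookups behave, here is a small Python sketch. Python's standard `gettext` module exposes `pgettext(context, message)` (since 3.8). To keep the example self-contained, the `DictTranslations` class below is a hypothetical in-memory stand-in for a compiled .mo catalog, and the German strings are illustrative.

```python
import gettext

class DictTranslations(gettext.NullTranslations):
    """Stand-in catalog keyed by (context, msgid); not a real .mo file."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def pgettext(self, context, message):
        # Same msgid, different context -> different translation.
        return self._catalog.get((context, message), message)

de = DictTranslations({
    ("button", "Open ticket"): "Ticket öffnen",
    ("status", "Open ticket"): "Offenes Ticket",
})

print(de.pgettext("button", "Open ticket"))
print(de.pgettext("status", "Open ticket"))
```

An untranslated (context, msgid) pair falls back to the msgid itself, which matches how gettext behaves when no catalog entry is found.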
Another example I saw was "(N) seconds ago" (e.g., "30 seconds ago") vs. just "seconds ago" (i.e., someone just posted this). They were translated into a single phrase. Hilarity ensued.
I think Mozilla's translation system called Fluent can handle that.

https://projectfluent.org/