by Groxx 4456 days ago
Hmmm. I like the idea of a shared database, but on the flip side you "only" need your developers to precisely understand the language they're using. The vast majority of people have no need for a deep understanding of their native language, so I'd be worried about subtly wrong translations. Something is better than nothing, but weird is still bad. For example, you use "you" in "You have {count||message}": there's no gender or other meaning encoded, so it'll probably be a bit off for some languages.

More technically: how does this translate?

>This is indeed [bold] a very [italic: powerful] framework [/bold]

To make a hypothetical worst-case scenario, say one of the target languages combines "this" and "powerful" into a single word, splits "framework" into three, and reorders it so that conceptually you flip between bold, italic, and plain practically at random. There's (probably) an element of aesthetics combined with the text's meaning; can the crowd detect that aesthetic? What if it varies all over your site because it's sourcing from different crowds?

Similarly, say you refer to "this" thing on the website/app, which is up and a little to the right, but not this other thing to the left. RTL probably switches that, and does "up and a little to the right" have an encoding in TML so that the meaning is retained if a language has a specific word for that?


Groxx, you bring up very good points. Let me see if I can address them all.

In the case of "You have {count||message}", as with pretty much any translation key, an implied "viewing_user" token is automatically added by the Tr8n SDK to provide the gender of the viewing user. So in languages like Hebrew, where "You" depends on the gender of the person you are addressing, there would actually be 2 translations for the key:

"Yesh leha {count||hodaa, hodaot}" for {viewing_user: {gender: male}}

"Yesh lah {count||hodaa, hodaot}" for {viewing_user: {gender: female}}

The SDK would know how to pull the right translation and do the substitution at the time of the expression evaluation.
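
To make that lookup concrete, here is a minimal Python sketch of picking a translation by context. It is not the Tr8n SDK's actual code: the data layout and the names `TRANSLATIONS`, `matches`, and `pick_translation` are made up for illustration.

```python
# Hypothetical layout: each translation carries the context it applies to.
TRANSLATIONS = [
    {"label": "Yesh leha {count||hodaa, hodaot}",
     "context": {"viewing_user": {"gender": "male"}}},
    {"label": "Yesh lah {count||hodaa, hodaot}",
     "context": {"viewing_user": {"gender": "female"}}},
]


def matches(context, tokens):
    """True when every attribute the context requires equals the token's attribute."""
    return all(
        tokens.get(name, {}).get(attr) == expected
        for name, rules in context.items()
        for attr, expected in rules.items()
    )


def pick_translation(translations, tokens):
    """Return the first label whose context fits the tokens, or None."""
    for translation in translations:
        if matches(translation["context"], tokens):
            return translation["label"]
    return None


label = pick_translation(TRANSLATIONS, {"viewing_user": {"gender": "female"}})
```

When nothing matches, the sketch returns None; a real SDK would presumably fall back to the original key, but that behaviour is my assumption.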

Here is what the inline translator tool looks like when you deal with such a situation:

Notice there is a link below the text area to generate context rules for the phrase:

http://grab.by/vMTk

When you select that option, you can generate all rule permutations for the key:

http://grab.by/vMTo

http://grab.by/vMUc

Alternatively, you can use TML and provide a single translation:

"Yesh {viewing_user| male: leha, female: lah} {count|| one: hodaa, other: hodaot}"

Or using the shorthand notation:

"Yesh {viewing_user| leha, lah} {count|| hodaa, hodaot}"

Single pipe means "use the context, but don't display the value". Double pipe means "use context and display the value".
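
If it helps, here is a rough Python sketch of how single and double pipes could be evaluated. It is a toy, not the real TML parser: the rule keys (male/female, one/other), the positional order used for the shorthand, and placing the displayed value in front of the word are all assumptions for illustration.

```python
import re

# Matches {name| options} and {name|| options}.
TOKEN = re.compile(r"\{(\w+)\s*(\|\|?)\s*([^}]*)\}")


def rule_key(value):
    """Toy rule: dicts resolve by gender, numbers resolve to one/other."""
    if isinstance(value, dict):
        return value.get("gender", "other")
    return "one" if value == 1 else "other"


def positional_keys(value):
    """Assumed key order for the shorthand notation."""
    return ["male", "female"] if isinstance(value, dict) else ["one", "other"]


def parse_options(raw, value):
    parts = [part.strip() for part in raw.split(",")]
    if all(":" in part for part in parts):  # explicit "key: value" form
        pairs = (part.split(":", 1) for part in parts)
        return {key.strip(): word.strip() for key, word in pairs}
    return dict(zip(positional_keys(value), parts))  # shorthand form


def render(label, tokens):
    def substitute(match):
        name, pipe, raw = match.groups()
        value = tokens[name]
        options = parse_options(raw, value)
        word = options.get(rule_key(value), options.get("other", ""))
        # "||" displays the value next to the chosen word, "|" only the word.
        return f"{value} {word}" if pipe == "||" else word
    return TOKEN.sub(substitute, label)


tokens = {"viewing_user": {"gender": "female"}, "count": 5}
print(render("Yesh {viewing_user| leha, lah} {count|| hodaa, hodaot}", tokens))
```

With a female viewing user and a count of 5, both the long form and the shorthand come out as "Yesh lah 5 hodaot".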

Looks a bit cryptic, but once you get the hang of it, it becomes second nature. Try doing this in I18n.

Even in English you can do things like:

"{user| Born on:}" - in other languages depends on user gender

"{user| He, She} likes this"

"{users|| loves, love} this"

Context Rules and Language Case Rules are defined per language, and the conversion happens on the fly when needed.

For example, in Hebrew, the last example would need many more options:

"{users|| male: ohev, female: ohevet, males: ohavim, females: ohavot, other: ohavim} ze"

Because in Hebrew it matters whether all users in the list are male, female or mixed.
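
As a sketch of what such a list rule might look like (my own simplification in Python, not the actual Tr8n rule engine; `list_gender_key` and the user dicts are hypothetical):

```python
HEBREW_LIKES = {
    "male": "ohev", "female": "ohevet",
    "males": "ohavim", "females": "ohavot", "other": "ohavim",
}


def list_gender_key(users):
    """Classify a list of users as male/female/males/females/other."""
    genders = [user["gender"] for user in users]
    if not genders:
        return "other"
    if len(genders) == 1:
        return genders[0] if genders[0] in ("male", "female") else "other"
    if all(gender == "male" for gender in genders):
        return "males"
    if all(gender == "female" for gender in genders):
        return "females"
    return "other"  # mixed or unknown


def likes_this(users):
    names = ", ".join(user["name"] for user in users)
    return f"{names} {HEBREW_LIKES[list_gender_key(users)]} ze"


print(likes_this([{"name": "Dana", "gender": "female"}]))
```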

In Russian, it needs 2 cases - same as English (but only in present tense):

"{users|| lubit, lubyat} eto"

But if the key happened to be in past tense like:

"{users} liked this"

In English it is the same whether there is one person in the list or many.

In Russian, we get 3 cases:

"{users|| lubil, lubila, lubili} eto"

Etc...
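
The per-language part can be pictured as a small table of rule functions, one per locale. This Python sketch is purely illustrative: the registry, the function names, and treating a single user of unknown gender as masculine are my assumptions, not how Tr8n stores its rules.

```python
def english_past(users):
    """English past tense has one form regardless of who is in the list."""
    return "liked"


def russian_past(users):
    """Russian past tense: masculine singular, feminine singular, or plural."""
    if len(users) != 1:
        return "lubili"
    # Assumption: a single user of unknown gender falls back to the masculine form.
    return "lubila" if users[0]["gender"] == "female" else "lubil"


# Hypothetical registry mapping a locale to its rule for the key "{users} liked this".
PAST_TENSE_LIKED = {"en": english_past, "ru": russian_past}


def liked_form(locale, users):
    return PAST_TENSE_LIKED[locale](users)


anna = {"name": "Anna", "gender": "female"}
ivan = {"name": "Ivan", "gender": "male"}
print(liked_form("ru", [anna]), liked_form("ru", [anna, ivan]), liked_form("en", [anna, ivan]))
```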

For the decoration tokens in a sentence like:

"This is indeed [bold] a very [italic: powerful] framework [/bold]"

It is up to the translator to shift things around in whatever way best conveys the meaning of the original key. The only thing the SDK does is try to ensure that you don't inject anything ugly into the sentence. So let's say the translation of the above in Elbonian would be:

"Munchu punchu [bold: munchuchu] punchu"

The Elbonian language doesn't even have a separate word for "powerful", only a single word for "a very powerful framework", so the translator had to omit the italic part altogether. And that is perfectly fine. The translator's job is to provide the translation that conveys the most meaning of the original sentence, and to massage the decoration tokens as best they can, if they can at all.

Since they could use the inline translator tools, they should be able to see their result right away and try to make it look the best they can in the space where the key appears on the site.

The beauty of the decoration tokens is that they are replaceable and reconfigurable even after you have hundreds of translations for the key. As I mentioned earlier, in Rails the tokens will be replaced with HTML, but in iOS you can create an NSAttributedString and reuse all the translations, etc...
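
Here is a small Python sketch of why that reuse works: the translation only names the decoration, and each platform plugs in its own rendering. It handles only the short "[name: text]" form, does no HTML escaping, and the template tables are hypothetical (the Markdown one just stands in for "some other platform").

```python
import re

# Innermost "[name: text]" token, i.e. one with no brackets inside the text.
DECORATION = re.compile(r"\[(\w+):\s*([^\[\]]*)\]")

HTML = {"bold": "<strong>{}</strong>", "italic": "<em>{}</em>"}
MARKDOWN = {"bold": "**{}**", "italic": "*{}*"}


def render_decorations(label, templates):
    """Replace decoration tokens from the inside out until none are left."""
    def substitute(match):
        name, text = match.groups()
        # Unknown decorations are dropped and only their text is kept.
        return templates.get(name, "{}").format(text)

    previous = None
    while previous != label:
        previous = label
        label = DECORATION.sub(substitute, label)
    return label


elbonian = "Munchu punchu [bold: munchuchu] punchu"
print(render_decorations(elbonian, HTML))
print(render_decorations(elbonian, MARKDOWN))
```

The same stored translation renders as "Munchu punchu <strong>munchuchu</strong> punchu" with the first table and "Munchu punchu **munchuchu** punchu" with the second, without touching the translation itself.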

I didn't fully understand your "up and a little to the right" example. If you can give me an actual example, that would be great.

TML works in RTL and LTR cases equally well. And since you are the one controlling the container of the app, you can use CSS to make it look like whatever you want.