by hprotagonist
3182 days ago
This misses the point. There do exist untranslatable idioms, and puns are an easy example. They rely on language-specific features like rhymes (or sight-rhymes) or homophones that are not preserved across languages. You can absolutely make a "direct" word-for-word translation of a pun in English to, say, Russian. It's just not a pun any more when you're done. Often there are no "pun equivalents" with totally different words, because puns usually hinge on culturally specific references that also don't translate well. Basically none of this matters when what you're interested in is subway directions or ordering food or whatever, but it becomes intractable really fast whenever you're interested in talking about something more meaningful.
|
Translate words and it's gibberish. Pairs of words and you start to get slang. Triples and you can distinguish words that are different parts of speech in different contexts. Quads and grammar is mostly in the bag. 5-grams and most puns are handled. 6-grams and you've taken care of all simple sentences. Etc.
No need for semantics when n-gram counts do just as well.
With enough people talking, we'll eventually have taught Google all the translations for all possible sentences. (joking, but only halfway)
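
For anyone who hasn't seen "n-gram counts" spelled out, here is a minimal Python sketch of the counting step only. The function name `ngram_counts` and the toy corpus are made up for illustration; this is not how any real translation system is implemented, just the basic idea of tallying runs of n adjacent words.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Toy corpus, invented for this example.
corpus = "time flies like an arrow and fruit flies like a banana".split()

unigrams = ngram_counts(corpus, 1)
bigrams = ngram_counts(corpus, 2)
trigrams = ngram_counts(corpus, 3)

# Alone, "flies" is ambiguous between verb and noun; both uses share one count.
print(unigrams[("flies",)])

# With three words of context the two uses fall into separate n-grams.
print(trigrams[("time", "flies", "like")])
print(trigrams[("fruit", "flies", "like")])
```

The catch is that the number of distinct n-grams grows quickly with n, so the counts for longer runs get sparse, which is why the "enough people talking" part is doing a lot of work.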