Hacker News
by xapata 3182 days ago
That depends on how often that pun appears in the corpus. If you observe the translation enough times, it'll be easy for the computer.

This misses the point. There do exist untranslatable idioms, and puns are an easy example. They rely on language-specific features like rhymes (or sight rhymes) or homophones that are not preserved across languages.

You can absolutely make a "direct" word-for-word translation of a pun in English to, say, Russian. It's just not a pun any more when you're done. Often there are no "pun equivalents" with totally different words, because usually they hinge on culturally specific references that also don't translate well.

Basically none of this matters when what you're interested in is subway directions or ordering food or whatever, but it becomes intractable really fast whenever you're interested in talking about something more meaningful.

Au contraire, it doesn't matter if the pun is translated directly. Heck, Google might be doing whole paragraph translation for all we know. It's certainly not at the level of individual words.

Translate words and it's gibberish. Pairs of words and you start to get slang. Triples and you can distinguish words that are different parts of speech in different contexts. Quads and grammar is mostly in the bag. 5-grams and most puns are handled. 6-grams and you've taken care of all simple sentences. Etc.

No need for semantics when n-gram counts do just as well.
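For anyone unsure what "n-gram counts" means here, a minimal sketch in Python. The toy sentence and the helper name `ngram_counts` are made up for illustration, and this is nowhere near how a production translation system is built; it only shows that longer n-grams carry more context around an ambiguous word like "flies".

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens in the list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Toy corpus, invented for illustration only.
tokens = "time flies like an arrow and fruit flies like a banana".split()

unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)

# As a unigram, "flies" is one ambiguous entry seen twice.
print(unigrams[("flies",)])
# The bigram "flies like" is still shared by both readings.
print(bigrams[("flies", "like")])
# The preceding word splits them: "time flies" vs "fruit flies".
print(bigrams[("time", "flies")], bigrams[("fruit", "flies")])
```

The counts only record which words appear next to each other. Whether that is enough for puns is the point under dispute in this thread.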

With enough people talking, we'll eventually have taught Google all the translations for all possible sentences. (joking, but only halfway)

>it doesn't matter if the pun is translated directly.

You can't do this anyway.

>No need for semantics when n-gram counts do just as well.

They don't. Neither do bag-of-words, word2vec, or whatever.

Simple imperative language? Absolutely, this all works pretty well. Anything else? Ha.
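A small illustration of one known limit of bag-of-words, sketched in Python. The helper name `bag_of_words` and the two sentences are assumptions for the example. It shows only that word order is discarded, not the broader claim about puns.

```python
from collections import Counter

def bag_of_words(sentence):
    """Represent a sentence as unordered word counts (order is thrown away)."""
    return Counter(sentence.lower().split())

# Two sentences with opposite meanings, chosen for illustration.
a = "dog bites man"
b = "man bites dog"

# The bags compare equal even though the sentences differ.
same_bag = bag_of_words(a) == bag_of_words(b)
print(same_bag, a == b)
```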