by YeGoblynQueenne
2144 days ago
>> I think in general we pack a concept in a word and lose some information this way, so when you want to be precise with what you are saying you have to bring your definitions with you. Essentially with translation you take a concept and "pack" it in a word, then look for an equivalent packing in different language, then unpack. Naturally, this process is prone to losing information.

This is a plausible description of how humans perform translation, but it does not apply to machine translation, because we have no good way to represent the meaning of a word other than with the word itself. Consequently, machine translation systems can't distinguish between different meanings of the same word and instead try to produce a correct translation by relying on frequency-based heuristics: faced with two likely translations of a word, a system will try to determine the context of the word (in terms of its collocation with other words) and then assign to the word the meaning it has in the context that happens to be the most common according to its training dataset.

Clearly, that is like "flying blind": sometimes it will work, sometimes it will fail, and there's no way to predict beforehand which. The comment above gave the "spring" example; my routine example is asking Google Translate to translate Greek "χελιδόνι" (the bird, swallow) to French and getting "avaler" (the verb, to swallow) instead of the correct "hirondelle", again because translation goes from Greek to French via English, introducing ambiguity about the intended meaning of "swallow" that does not exist in either Greek or French.

Note that this doesn't happen when the word "χελιδόνι" is used in a sentence (e.g. "ένα το χελιδόνι" translates to "un l'hirondelle", which is ungrammatical and nonsensical but at least gets the right noun), but it's a good test to show that Google Translate is really incapable of recognising the meaning of words and so cannot use such information to make translations.

Note that the same goes for machine translation in general, since Google Translate is a state-of-the-art system.
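To make the heuristic concrete, here is a toy sketch in Python of translation through an English pivot with a frequency-based choice between candidates. Everything in it is an assumption made up for illustration: the dictionaries, the co-occurrence counts, and the function names `pick_french` and `translate_via_pivot` are hypothetical, and this is not a description of how Google Translate is actually implemented. It only shows how a system that stores words and counts, but no meanings, can end up with "avaler" for an isolated word and "hirondelle" once there is some context.

```python
from collections import Counter

# Toy pivot dictionary (hypothetical data, for illustration only).
# The Greek noun is unambiguous, but its English rendering is not.
GREEK_TO_ENGLISH = {"χελιδόνι": "swallow"}

# For each English word, the candidate French translations together with
# invented counts of the English context words seen next to them in an
# imaginary training set. No meaning is stored anywhere, only counts.
ENGLISH_TO_FRENCH = {
    "swallow": {
        "avaler": Counter({"pill": 40, "food": 35, "hard": 25}),
        "hirondelle": Counter({"bird": 30, "nest": 10, "flew": 10}),
    }
}


def pick_french(english_word, context=()):
    """Pick the candidate with the highest count for the given context.

    With no context, fall back to the overall frequency of each candidate.
    """
    candidates = ENGLISH_TO_FRENCH[english_word]

    def score(counts):
        if not context:
            return sum(counts.values())
        # Counter returns 0 for context words it has never seen.
        return sum(counts[word] for word in context)

    return max(candidates, key=lambda french: score(candidates[french]))


def translate_via_pivot(greek_word, context=()):
    """Greek -> English -> French; the Greek sense is lost at the pivot."""
    english_word = GREEK_TO_ENGLISH[greek_word]
    return pick_french(english_word, context)


if __name__ == "__main__":
    print(translate_via_pivot("χελιδόνι"))
    print(translate_via_pivot("χελιδόνι", context=("bird",)))
```

With these invented counts, the isolated word comes out as the more frequent verb, while adding a context word such as "bird" tips the choice to the noun. Nothing in the code knows that "χελιδόνι" is a bird; change the counts and the answer changes with them.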