by kurtosis
5570 days ago
Are you really sure about this? It sounds equally plausible to me that not knowing both the source and target languages would give one the advantage of not relying on ad-hoc, hard-to-model human judgements. Being monolingual is more likely to enforce a discipline where one develops an algorithm that works effectively on all natural languages. IIRC the 'Candide' group was (not intentionally) composed of scientists who did not know both English and French: http://www.cs.cmu.edu/~aberger/mt.html
Contrary to your point, early statistical machine translation works well only for relatively close language pairs, like English-French or English-Spanish. It totally fails for more distant languages such as Chinese, Arabic, or even German, which is why you have so many Chinese-speaking people (including English-Chinese bilinguals) in machine translation these days.
Parallel corpora are full of ad-hoc, hard-to-model human judgements (from people called "translators"). The advantage is that the translators don't come up to you to criticize your translation model. However, doing error analysis for an MT system (i.e., the key to actually improving things and not producing garbage) requires at least minimal knowledge of the source language and relatively good knowledge of the target language.