by sshine
1480 days ago
I used to keep track of the state of machine translation some years back. I think the way you measure the success of an automated translation is edit distance, i.e. how many manual edits you need to make to a translated text before it reaches some acceptable state. I suppose it's somewhat subjective, but it is possible to construct a benchmark that allows for multiple correct results.

The best resources I knew back then were:

VISL's CG-3, which self-reported a competitively low edit distance compared to Google Translate: https://visl.sdu.dk/constraint_grammar.html. It makes a convincing argument that in order to beat Google Translate, you want less fuzzy machine learning and more structural analysis. Unfortunately, the abstraction requires a rather deep knowledge of each particular language's grammar; having a PhD in computational linguistics helps.

Apertium, which has an open-source pipeline: https://apertium.org/. It seems to be much more of an open-source approach, with a quality similar to Google Translate (although I don't know if it's better or worse; probably slightly worse in most cases, and with slightly lower coverage).
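To make the edit-distance idea concrete, here is a minimal sketch of scoring one translation against several acceptable references. It assumes word-level Levenshtein distance with whitespace tokenization and takes the smallest distance over all references; the function names are made up for illustration, and this is not how CG-3, Apertium, or any particular benchmark computes its scores.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j tokens of the reference.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(
                prev[j] + 1,         # delete h from the hypothesis
                curr[j - 1] + 1,     # insert r into the hypothesis
                prev[j - 1] + cost,  # keep (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


def best_edit_distance(hypothesis, references):
    """Smallest number of word edits needed to reach any acceptable reference."""
    hyp = hypothesis.split()
    return min(edit_distance(hyp, ref.split()) for ref in references)


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from any real benchmark.
    refs = ["the cat sat on the mat", "a cat was sitting on the mat"]
    print(best_edit_distance("the cat sat on mat", refs))
```

Taking the minimum over references is what lets a benchmark accept multiple correct results: a translation is only penalized for its distance to the closest acceptable wording.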
Both GramTrans and Apertium are rule-based. Very similar technology.
(I wrote CG-3, and work for both GrammarSoft and Apertium.)