by TaupeRanger 1445 days ago
So they have a system that can translate to languages for which there isn't as much data as English, Spanish, etc. Waiting for a Twitter thread from a native speaker of one of these "low resource languages" to let us know how good the actual translations are. Cynically, I'd venture that they hired some native speakers to cherry-pick their best translations for the storybooks. But mostly this just seems like a nice bit of PR (calling it a "breakthrough", etc.). I can't imagine this is going to help anyone who actually speaks, e.g., some random Nilo-Saharan language.
3 comments

If you're curious to try the system yourself, it's actually being used to help Wikipedia editors write articles for low-resource language Wikipedias: https://twitter.com/Wikimedia/status/1544699850960281601
How is the license of the models (CC NC) compatible with the licenses used in Wikipedia? Did you sign a special agreement with the Wikimedia Foundation?
Twitter may not be representative, imho, because the texts are so short. The first problem to solve is reliable language detection, and Twitter quite often gets that wrong.
In this work we tried to rely not only on automated evaluation scores but also on human evaluation for exactly this reason: we wanted a better understanding of how our model actually performs and how that correlates with automated scores.