by Groxx 1445 days ago
>REAL-WORLD APPLICATION

>Translating Wikipedia for everyone

Hmmm.

While there is very definitely utility in doing things like this, I do kinda fear "poisoning the well"-like effects of feeding (even partially) AI-generated data into extremely common AI data sources.

There's some info on it in a blog post[1] and the MediaWiki "Content translation" page[2], but does anyone know of any studies on the quality of the translations produced? I can absolutely see it being a huge time-saver for people who are essentially fluent in both (there's a lot of semi-mechanical drudgery in translating stuff like this that could be mostly eliminated)... but people are pretty darn good at choosing the easy option of trusting whatever they're given rather than being as careful as they should be. It kinda feels like it runs the risk of passively encouraging people to trust the machine's choice over their own, as long as it isn't obviously nonsense, and the cumulative effect could be rather large after a while.

[1]: https://diff.wikimedia.org/2021/11/16/content-translation-to...

[2]: https://www.mediawiki.org/wiki/Content_translation

2 comments

Yeah, I really hope they don't do this. I live in a country where I don't speak the language well, so I am using Google Translate and DeepL [0] all day every day. The quality of translations of real-world text is so incredibly variable. There is literally no way to know when it will suddenly reverse the meaning of a sentence, or produce something that sounds like it makes sense, but in terms of meaning bears no relation to the input at all.

A machine-translated Wikipedia would not be a trustworthy source of information at all, yet would look like one. I think that does significantly more harm than good.

[0] Suggestions for better alternatives welcomed.

On top of that, a lot of language-specific content has to include sources in that same language.

(As an example, it would be absurd for Lithuanian Wikipedia to include sources in Japanese: that would be neither usable nor useful for Wikipedia readers, editors...)