Hacker News
by unhammer 1187 days ago
And then you let a human check it, I hope? It does very well for the top 1% of languages (in terms of text online), but quality quickly degrades where there is less training material.

I asked a speaker of Northern Sámi, a language with relatively small corpora available, to comment on GPT-4's translations into her language. She said: "The translation is completely incomprehensible. Lots of non-existent and completely incomprehensible words, and the words that are understandable do not fit into the context. Besides, it's the wrong subject, it's Russia's report instead of the UN report etc." Knowing only a tiny bit of the language, I could easily have been fooled by the output.

1 comment

Yeah, it manages to produce intelligible output in Hungarian, but I've given the output to some native Hungarian speakers, and they're constantly telling me that it's making up words or using strange archaic words that they've barely ever heard used in regular speech.