by lowefk 1186 days ago
I have been using GPT-4 to generate i18n files, and it is great. You can see this post to check GPT-4's translation capabilities: https://www.reddit.com/r/visualnovels/comments/11rty62/gpt4_...

I can simply feed in an en.i18n.json file, and it will generate i18n.json files for as many languages as I want. I don't use a specific prompt, but I occasionally include general information about the software in it.

Edit: I do verify the output by translating it back to English using Google Translate, but it seems I need to be more careful.
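
For what it's worth, a cheap structural check can sit alongside the back-translation step. The sketch below is only an illustration under some assumptions: the i18n files are nested JSON objects with string leaves, placeholders look like `{name}`, and the function names (`flatten`, `structural_problems`) are made up for this example. It only catches dropped keys and mangled placeholders; it says nothing about whether the translation reads well, so it does not replace a native speaker.

```python
import json
import re

# Assumed placeholder syntax, e.g. "Hello, {name}!". Adjust for your i18n library.
PLACEHOLDER = re.compile(r"\{[^{}]+\}")


def flatten(node, prefix=""):
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def structural_problems(source, translated):
    """List keys that are missing or unexpected, and strings whose placeholders differ."""
    src, dst = flatten(source), flatten(translated)
    problems = []
    for key in sorted(src.keys() - dst.keys()):
        problems.append(f"missing key: {key}")
    for key in sorted(dst.keys() - src.keys()):
        problems.append(f"unexpected key: {key}")
    for key in sorted(src.keys() & dst.keys()):
        src_slots = sorted(PLACEHOLDER.findall(str(src[key])))
        dst_slots = sorted(PLACEHOLDER.findall(str(dst[key])))
        if src_slots != dst_slots:
            problems.append(f"placeholder mismatch: {key}")
    return problems


if __name__ == "__main__":
    # Inline sample data standing in for en.i18n.json and a generated file.
    en = json.loads('{"greeting": "Hello, {name}!", "menu": {"save": "Save", "quit": "Quit"}}')
    hu = json.loads('{"greeting": "Szia!", "menu": {"save": "Mentés"}}')
    for problem in structural_problems(en, hu):
        print(problem)
```

In practice you would load the real files with `json.load` and run this once per generated language before handing the result to a reviewer.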

1 comments

And then you let a human check it, I hope? It does very well for the top 1% of languages (in terms of text online), but quality quickly degrades where there is less training material.

I asked a speaker of Northern Sámi, a language without large corpora available, to comment on GPT-4's translations into her language. She said "The translation is completely incomprehensible. Lots of non-existent and completely incomprehensible words, and the words that are understandable do not fit into the context. Besides, it's the wrong subject, it's Russia's report instead of the UN report etc." Knowing only a tiny bit of the language, I could easily have been fooled by the output.

Yeah, it manages to produce intelligible output in Hungarian, but I've given the output to some native Hungarian speakers, and they're constantly telling me that it's making up words or using strange archaic words that they've barely ever heard used in regular speech.