Hacker News
by vintermann 1172 days ago
Yeah, its language skills are through the roof. There's no reason to talk to it in English. From what I can tell, it does a decent job of translating even out of languages like Southern Sami, which has ~300 speakers and an utterly negligible training corpus. It seems to know enough grammar from related languages, and can infer enough from context (and maybe even etymology), that it does an OK job.

I tested it by giving it some news articles from NRK Sápmi and comparing its output with the Norwegian translations they publish.

Edit: Seems I may have gotten lucky that time; it's being a lot more, um, creative in its translations now. Or, for all I know, the model could have changed.

2 comments

Looking at basic ChatGPT (not GPT-4): while it can do reasonable translations for smaller languages and answer questions in them, the quality of the answers suffers significantly in my experience. If I ask the same factual question in two languages, I often see the English one get a correct answer while the small-language one gets a coherent hallucination. For big languages (French, Japanese, Spanish, etc.) that's not an issue, but for the smaller ones it clearly is.
> There's no reason to talk to it in English.

Depends on what you're doing. I haven't managed to make it continue after it stopped in the middle of a sentence in Japanese, but giving it that instruction in English works. In some other cases, prompting in English (and asking for an answer in Japanese) can produce better results than giving the same prompt in Japanese.

Replying "続けて" (Japanese for "continue") or "continue" works.

Generating Japanese is slower than generating English (it's annoying on GPT-4), which is my reason to prefer English sometimes (especially for tech topics). ChatGPT web users don't pay per token, but API users do, so they would make a different decision.

In my experience, while "continue" can work, "続けて" doesn't. At least not when making it rewrite large texts, which is when I hit the limit. With "continue", it continues rewriting. With "続けて", it tends to make up new text that, yes, is a continuation of what it was writing, but has no connection to the original text it was in the middle of rewriting.