by vidarh
1161 days ago
I posted on another thread that GPT4 not only handles Norwegian just fine (0.1% of training data for GPT3), it also handles Nynorsk just fine. Norway has two official languages that are mutually intelligible and close enough that some would consider them dialects, and Nynorsk is the smaller of the two (Bokmål being the other). Going one step further, I asked it to "translate" into both "Riksmål", an artificial conservative variant of Bokmål that basically rejects most of the last few decades' worth of language reforms, and Romeriksdialect (a dialect from the eastern part of Norway). For the latter it gave me a lecture about how the dialect varies within the region (which is correct) and presented a "translation" of a test sentence that is recognisably one of the variants from the northern part of the region. Of course, for these variants competency definitely bleeds over: they share an almost identical grammar and most of their orthography. But I'm impressed enough that it can handle Norwegian that well at all, much less that it knows the distinctions between the variants.
|
I tested it by giving it some news articles from NRK Sápmi and comparing the result with the Norwegian translation they have.
Edit: Seems I may have gotten lucky that time; it's being a lot more, um, creative in its translation now. Or for all I know it could be changes in the model.