Idk which models you're referring to, but I tested a bunch recently, and they performed well on Dutch. Only the smallest, such as Qwen 3.6 27B, made up words and switched languages.
There's a large gap between making up words and an actually native text distribution. LLMs have a clear pattern, clear tells, a "feel" in English, and it's normally even more pronounced in non-English languages.
Lots of bias towards English sentence structure, idioms, etiquette, etc.
I didn't notice any of that. Such a bias would be strange, because smaller models in particular don't have the luxury of learning grammar independently: it's still word sequences, and languages are quite well separated.
There would be a bunch of value in having, say, a good 30B-class model that used my local language as well as it does English. There are lots of cases, especially in the government sphere, where local processing is a requirement and frontier-level capabilities aren't. Making those cheap to run seems like a fine goal.
Yes, but what's the point of a support bot that writes good Dutch when it can't follow instructions, doesn't understand the questions or can't solve problems? I might be wrong, but I don't think atm these models have the cognitive ability to perform any task in a satisfactory manner.
As for accessing PII, I imagine the value here is in the fact they're local, which has nothing to do with the "sovereignty" of these models. If anything, a model is more likely to be tricked by a malicious prompt the farther it is from the SOTA.
A good harness and good engineering are important no matter which model you use.
But sovereignty of hosting is also important, because without it all PII is being leaked.