by curioussquirrel
60 days ago
One more thing: we're working on a multilingual benchmark that will evaluate core linguistic proficiency in 30 languages. We already have a lot of data internally and I can tell you that:

- Gemini 3 Pro is a multilingual monster.
- GPT-5.4 is a really good translation model, with big improvements over previous versions in the 5 family.
- Opus 4.6 is good but usually third place.
- Somehow, Grok 4.20 is surprisingly good at some long-tail languages? Its performance profile is really odd, unlike all the other models.

EDIT: layout