|
|
|
|
|
by bogtog
226 days ago
|
|
They report benchmarks on the huggingface page (https://huggingface.co/utter-project/EuroLLM-9B) They almost exclusively compare their model to prior models from 2024 or older and brag about "results comparable to Gemma-2-9B". I'm not sure what I expected. The eurollm.io homepage states "EuroLLM outperforms similar-sized models", which just seems like a lie for all practical purposes An overly charitable interpretation is that EuroLLM isn't a reasoning model and has minimal post-training, so they sought out comparisons to such models (they're still ignoring reasoning models that have non-reasoning modes) |
|
As another comment here noted, the title is missing (2024) - this model was released almost a year ago, last December, so it's not surprising that that's the models they compare to.