by bogtog 226 days ago
They report benchmarks on the Hugging Face page (https://huggingface.co/utter-project/EuroLLM-9B).

They almost exclusively compare their model to prior models from 2024 or older and brag about "results comparable to Gemma-2-9B". I'm not sure what I expected. The eurollm.io homepage states "EuroLLM outperforms similar-sized models", which just seems like a lie for all practical purposes.

An overly charitable interpretation is that EuroLLM isn't a reasoning model and has minimal post-training, so they sought out comparisons to similar models (though they're still ignoring reasoning models that have non-reasoning modes).

1 comment

> They almost exclusively compare their model to prior models from 2024

As another comment here noted, the title is missing (2024). This model was released almost a year ago, last December, so it's not surprising that those are the models they compare against.