


	by Eridrus 27 days ago
	Nobody releases numbers that show them to be worse than competitors lol. This even applies to OpenAI & Anthropic who don't even eval on the same datasets a lot of the time.

1 comments

Ey7NFZ3P0nzAe 27 days ago

I do recall Mistral doing this. It's not always about being the best, but also the fastest or smallest. Users should have all the information for their own use case.


Eridrus 26 days ago

If your model doesn't actually show the tradeoff you're getting for speed, you're doing marketing and not benchmarking.

Which is fine, we all have to make money, but it is disingenuous. It's just unfortunate that running some of these benchmarks is so expensive that it's not really realistic for most companies to actually run them.
