HN Mirror



	by gbickford 551 days ago
	Small models don't "know" as much, so they hallucinate more. They are better suited to generation that is grounded in a source of truth, as in a RAG setup. A better comparison might be Flash 2.0 vs 4o-mini. Even then, these models aren't meant to have vast world knowledge, so benchmarking them on that isn't a great indicator of how they would perform in real-world use.

1 comment

ipsum2 551 days ago

Yes, it's not an apples-to-apples comparison. My point is that its position on the lmarena leaderboard is misplaced because of the hallucination issues.
