HN Mirror

	by int_19h 604 days ago
	Benchmarks are way too easy to game. There's no shortage of models that "beat GPT-4" according to some benchmark or other, yet are obviously nowhere close when you try them on novel tasks.