HN Mirror

	by discobot 556 days ago
	The problem is that the last generation of the largest models failed to beat smaller models on the benchmarks; see the lack of a new Claude Opus or GPT-5. The problem is probably in the benchmarks, but anyway.