HN Mirror



	by maeil 517 days ago
	This just isn't accurate. On the overwhelming majority of real-world tasks (>90%), 3.5 Sonnet beats 4o. FWIW, I've spoken with a friend at OpenAI and they fully agree in private.