


	by boroboro4 437 days ago
	They updated the paper to include Gemini 2.5. It's the only model that got a non-trivial score, 10/42 (it mostly solved one problem).