| HN Mirror



	by bernardjhuang 24 days ago
	I just finished the 100x benchmark across 4 frontier models. Gemini 3.1 Pro, GPT 5.5, and Opus 4.7 are all quite similar, but Grok is an oddball: https://zonted.com/posts/three-of-four-ais-same-person/