by baigy 19 days ago
Have you tried Big 5 too?
1 comment

I just finished the 100x benchmark across 4 frontier models. Gemini 3.1 Pro, GPT 5.5, and Opus 4.7 are all quite similar, but Grok is an oddball: https://zonted.com/posts/three-of-four-ais-same-person/