

by gertlabs 18 days ago

GPT 5.5 does significantly outperform Opus 4.7 in the coding parts of our evals.

We also incorporate live decision-making in social games (where GPT 5.5 has actually regressed from earlier models, which shouldn't be a huge surprise if you've ever tried talking it out of some of its nits).

We are still looking for a less arbitrary way to integrate "logical" intelligence with social intelligence, so for now I'd look at the ranking for the use case that applies to you (probably coding): https://gertlabs.com/rankings?mode=agentic_coding