Hacker News
by Karrot_Kream 3 days ago
Seems like Fable is doing a lot better on SWE-Bench-Pro and FrontierCode than GPT-5.5. Given that most folks I talk to and people online keep mentioning that GPT-5.5 was better than Opus, I'm curious what the experience is like now.
1 comment

It's a very nice bump, but it is in no way worth all the hype of the past month.