by DanMcInerney 214 days ago
A 50% increase over ChatGPT 5.1 on ARC-AGI2 is astonishing. If that's true and representative (a big if), it lends credence to this being the first of the very consistent agentically-inclined models because it's able to follow a deep tree of reasoning to solve problems accurately. I've been building agents for a while and thus far have had to add many, many explicit instructions and hardcoded functions to help guide the agents in how to complete simple tasks to achieve 85-90% consistency.

I think it's basically due to improvements in vision; ARC-AGI-2 is very visual.
Vision is very far from solved IMO. Simple modifications to inputs still result in large differences in output, lines aren't recognized, etc.
Where is this figure taken from?