HN Mirror


	by DanMcInerney 214 days ago
	A 50% increase over ChatGPT 5.1 on ARC-AGI-2 is astonishing. If that's true and representative (a big if), it lends credence to this being the first of the very consistent agentically-inclined models, because it's able to follow a deep tree of reasoning to solve problems accurately. I've been building agents for a while, and thus far I've had to add many, many explicit instructions and hardcoded functions to guide the agents through simple tasks just to reach 85-90% consistency.

2 comments

I think it's basically due to improvements in vision; ARC-AGI-2 is very visual.

Vision is very far from solved IMO. Simple modifications to inputs still result in large differences in output, lines aren't recognized, etc.

Where is this figure taken from?