by whiplash451 128 days ago
We can really look at it both ways. It is actually concerning that a model that won the IMO last summer would still fail 15% of ARC-AGI-2 tasks.