by anthonypasq 59 days ago
ARC-AGI isn't testing a model's ability to store files and code things. It's testing its ability to reason through puzzles given the same information as a human.
2 comments

But that's the thing: as a human faced with a problem, I'd often say "Sure, just let me get a pen, some paper and a calculator". Why shouldn't we make it easy for AIs to use their tools of choice?
If you tested my ability to reason and gave me some challenging problems that involved arithmetic, it might be a better test if you also gave me a scratch pad, so I don't mess up the reasoning parts by failing the arithmetic.