|
|
|
|
|
by ndm000
552 days ago
|
|
One thing I have not seen commented on is that ARC-AGI is a visual benchmark but LLMs are primarily text. For instance when I see one of the ARC-AGI puzzles, I have a visual representation in my brain and apply some sort of visual reasoning solve it. I can "see" in my mind's eye the solution to the puzzle. If I didn't have that capability, I don't think I could reason through words how to go about solving it - it would certainly be much more difficult. I hypothesize that something similar is going on here. OpenAI has not published (or I have not seen) the number of reasoning tokens it took to solve these - we do know that each tasks was thoussands of dollars. If "a picture is worth a thousand words", could we make AI systems that can reason visually with much better performance? |
|