by riku_iki 514 days ago
> If they had, surely they could have gotten themselves an even higher mark than 25%.

There may be some limit on how well LLMs can memorize such complex proofs.


They aren't proofs, they're just numbers. All the questions have numerical answers. That's how they're evaluated.
I think those reasoning models are smart enough not to emit a memorized answer if they can't come up with a CoT proof.

But OAI could have reported any result, since no one was checking; they probably just weren't brave enough to declare math a solved topic.