Hacker News
by jeffbee 435 days ago
Odd that the ETHZ authors published this less than a week ago and excluded Gemini 2.5:

"PROOF OR BLUFF? EVALUATING LLMS ON 2025 USA MATH OLYMPIAD"

https://files.sri.inf.ethz.ch/matharena/usamo_report.pdf

1 comment

Gemini 2.5 Pro was only released on the 25th of March ;)
They updated the paper and included Gemini 2.5. It's the only model that got a non-trivial score, 10/42 (it mostly solved one problem).