Hacker News
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad (arxiv.org)
6 points by mauriziocalo 448 days ago
1 comment

> Our results reveal that all tested models struggled significantly, achieving less than 5% on average.