Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad

	Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad (arxiv.org)
	6 points by mauriziocalo 448 days ago

1 comment

> Our results reveal that all tested models struggled significantly, achieving less than 5% on average.