by n4r9 582 days ago
There's the FrontierMath benchmark [0], which demonstrates that AI is currently quite far from human performance at research-level mathematics.

[0] https://arxiv.org/abs/2411.04872

They didn't demonstrate anything. They haven't even released their dataset, nor mentioned how big it is.

It's just hot air, just like the AlphaProof announcement, where very little is known about their system.

They won't publish the problem set for obvious reasons. And I doubt it's hot air, given the mathematicians involved in creating it.