Hacker News new | ask | show | jobs
by zozbot234 42 days ago
> After reading another post about the most recent advances LLMs have made in finding and writing up novel, correct proofs, it sounds like the frontier models are now at the point of PhD student level.

This is somewhat misleading, the LLMs' contributions are in a limited niche of highly technical problem solving. They're neat but they're not the first mathematical theorem that gets automatically solved by a computer, that was done already in the 1990s.

> Maybe by using LLMs as a mighty tool and providing skilled usage and oversight?

Yes, even in the areas where LLMs are at their best, we'll still need a lot of human effort to make the results cleanly understandable. LLMs cannot do this well, even their generated papers have to be rewritten by human experts to surface the important bits.

1 comments

> done already in the 1990s by human-written programs that iterated through the finite casework that human thought had reduced the theorems to (four-colour theorem, FLT, etc.), which recent developments (eg. LLMs autonomously resolving Erdős problems) seem meaningfully distinct from. > human effort to make the results cleanly understandable well, perhaps loops of "derive proof through reasoning in English, formalise in Lean, use AST size of formal proof as a metric to optimise (via an LLM-guided search), translate back into English" could improve this? a lot of resources are being spent to make frontier LLMs more resistant to hallucinations via Lean, perhaps cogency will increase as a byproduct.