Hacker News
by Schlagbohrer 43 days ago
After reading another post about the most recent advances LLMs have made in finding and writing up novel, correct proofs, it sounds like the frontier models are now at the point of PhD student level. I wonder how a math student could contribute today, if they're just starting on the PhD track? Maybe by using LLMs as a mighty tool and providing skilled usage and oversight?

It must feel similar to what those who wanted to become chess or Go masters felt after computers surpassed humanity in those games.

5 comments

> After reading another post about the most recent advances LLMs have made in finding and writing up novel, correct proofs, it sounds like the frontier models are now at the point of PhD student level.

This is somewhat misleading: the LLMs' contributions are in a limited niche of highly technical problem solving. They're neat, but they're not the first mathematical theorems to be proved automatically by a computer; that was done already in the 1990s.

> Maybe by using LLMs as a mighty tool and providing skilled usage and oversight?

Yes, even in the areas where LLMs are at their best, we'll still need a lot of human effort to make the results cleanly understandable. LLMs cannot do this well; even their generated papers have to be rewritten by human experts to surface the important bits.

> done already in the 1990s

by human-written programs that iterated through the finite casework that human thought had reduced the theorems to (four-colour theorem, FLT, etc.), which recent developments (e.g. LLMs autonomously resolving Erdős problems) seem meaningfully distinct from.

> human effort to make the results cleanly understandable

well, perhaps loops of "derive proof through reasoning in English, formalise in Lean, use AST size of formal proof as a metric to optimise (via an LLM-guided search), translate back into English" could improve this? a lot of resources are being spent to make frontier LLMs more resistant to hallucinations via Lean; perhaps cogency will increase as a byproduct.
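
A rough sketch of the "use AST size as a metric" step, in Python with a toy term representation. Everything here is an assumption for illustration: `ast_size`, `shrink_search` and `drop_one_id` are made-up names, the nested tuples stand in for a Lean proof term, `checks` stands in for the Lean kernel accepting a candidate, and `propose` stands in for an LLM suggesting a rewrite. Nothing below calls Lean or any model.

```python
import random


def ast_size(term):
    """Count nodes in a toy proof term (nested tuples with string leaves)."""
    if isinstance(term, str):
        return 1
    return 1 + sum(ast_size(child) for child in term)


def shrink_search(initial, propose, checks, rounds=100, seed=0):
    """Greedy search: keep a candidate only if it still checks and is smaller."""
    rng = random.Random(seed)
    best = initial
    for _ in range(rounds):
        candidate = propose(best, rng)
        if checks(candidate) and ast_size(candidate) < ast_size(best):
            best = candidate
    return best


def drop_one_id(term, rng):
    """Hypothetical rewrite: remove the first ("id", x) wrapper found.

    The rng argument is unused here; a real proposer would be stochastic.
    """
    if isinstance(term, str):
        return term
    if len(term) == 2 and term[0] == "id":
        return term[1]
    children = list(term)
    for i, child in enumerate(children):
        new_child = drop_one_id(child, rng)
        if new_child != child:
            children[i] = new_child
            break
    return tuple(children)


# Toy "proof" padded with redundant identity wrappers.
proof = ("app", ("id", ("id", "h")), ("id", "x"))

# Stand-in for "Lean accepts this term": always true in this sketch.
always_checks = lambda term: True

smaller = shrink_search(proof, drop_one_id, always_checks)
print(ast_size(proof), "->", ast_size(smaller), smaller)
```

The greedy accept rule is the simplest possible choice; whether a smaller AST actually reads as a more understandable proof once translated back into English is exactly the open question.
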
If your motivation is being recognized as the best of the best, winning the competition, then yes, it’s probably a bleak world. But if your motivation is improving your own capabilities, with the metric being whether you’re better now than you were last month, then it’s not a bleak world: there are many more tools available to help you learn and improve now than there were in the past.
LLMs can only predict the next token.

They can't predict the consequences of an action by predicting one token after another. They can't solve a Rubik's Cube, unlike a 7-year-old human who can learn to do it in a weekend. They can't imagine the perspective of being a human being, unlike a 7-year-old human asked to imagine they were in the position of another human.

Those are very strong claims. Do you really believe an LLM can't be trained to solve Rubik's Cubes?

Can you imagine what it feels like to be an LLM?

Can one LLM have a better sense of what it feels like to be a different LLM (say, one that scores a little better)?

You design circularly defined criteria...

honestly I'm pretty sure Opus could solve a Rubik's cube if you just gave it the layout of the sides and looped until it solved it

or even just take a picture of the thing, since they can digest visual input now

I wonder if AI is one means to overcome the natural limits of human knowledge aggregation [0].

On the other hand, in the very long run, what does it mean if a talented human being does not have enough years of life to fully analyze and understand an extremely advanced proof created by AI?

[0]: https://slatestarcodex.com/2017/11/09/ars-longa-vita-brevis/

Perhaps it will become like those cathedrals that took centuries and many generations of humans to build.
Yes, but you (as a human) can still understand the cathedral (the building). This is not guaranteed for advanced AI work in mathematics in the future. If it isn't, are we/they really still adding to human knowledge at that stage?
Mathematics as an aggregate already is that cathedral. It is grander and more beautiful than any earthly cathedral.
The MathOverflow question was asked 15 years ago. The top answer says that the human community part is very important and that spreading knowledge and critical thinking is valuable.

The most recent advances are stunts by a handful of famous prompters who are funded in various ways by the LLM industrial complex.

How many theorems are proven by mathematicians each year? Let's guess 10,000. Then the Erdős toy proofs, with unknown token and resource usage, are less than 1% of that.

...And in 1900, how many carriages were horseless?
In 2026, how many people X-ray their feet at the shoe store or have watches with radium paint?

Ironically, there is a shoe company pivoting to AI. My taxi driver told me to buy the stock:

https://www.bbc.com/news/articles/c98mrepzgj7o