Hacker News new | ask | show | jobs
by vakkermans 930 days ago
Finding a proof is a search in a large space, akin to searching from abstractions to concretions. LLMs don’t do anything like this, and so you’re looking at the planning problem again. It’s not clear to me how framing this particular problem as a language problem is helpful in any case.
3 comments

Perhaps an LLM could produce a plausible-looking proof that is completely and utterly incorrect.

As a party trick.

Wouldn't that be the point of pairing it with Lean? You wouldn't get false positives.
Doesn’t mean you’d get true positives either. Garbage in, garbage out.
IIUC, any sensible way of "pairing up" these things will mean that anything you get out will be true. But the search might take millennia, and the outcome might be nothing (equivalently, "the LLM's conjecture is false").
This is equivalent to saying it's not going to work.
Playing chess & go is also search in a large tree of moves leading to particular game states
But AlphaGo etc don’t use any kind of language-based AI, so LLMs (which this thread was about) are no good.
The next step seems to be applying past advances in reinforcement learning with modern transformer based models
Which multiple teams are working on - OpenAI (Q*), and Meta just released a reinforcement learning framework
Could you point me towards Meta's reinforcement learning framework? I'd like to see how it stacks up against the OpenAI gym.
The final state in chess is a single* state which yes, then branches out to N checkmate configurations and then N*M one-move-from-checkmates, and so on. (*Technically it's won/lost/draw.)

The equivalent final state in theorem proving is unique to each theorem so such a system would need to handle an additional layer-of-generalization.

Is this how some of the more advanced chess engines work, or even the not so advanced ones, where there's a point at which it stops searching the forward move tree in greatest depth, and instead starts searching backwards from a handful of plausible (gross move limit-bound) checkmate states looking for an intersection with a shallow forward search state?
Kind of, but it's calculated offline and then just accessed during the game: https://www.chessprogramming.org/Endgame_Tablebases
Language models are currently the best AI systems that solve general reasoning problems, including mathematical ones. So it seems more than obvious to model the problem of finding proofs as a language problem.