|
|
|
|
|
by jupitr
160 days ago
|
|
This is quite seriously missing the point: computational theorem-provers and contributions to mathlib (for Lean, anyway) allow for checking "correctness". The LLM is one of the many ways to search for tactics that help complete a proof for a theorem not yet in mathlib. The LLM is not done until the theorem-prover says it is done - at which point we can be sure of the correctness of the assertion in the context. AFAIK there is no debate about inaccuracy with theorem-provers. Their real limitation continues to be converging in realistic timescales and efficient search over all combinations of proof tactics. |
|
Your "in the context" is doing a lot of heavy lifting here, see:
- The Math Is Haunted : https://news.ycombinator.com/item?id=44739315 : https://overreacted.io/the-math-is-haunted/
- Three ways formally verified code can go wrong in practice : https://news.ycombinator.com/item?id=45555727 : https://buttondown.com/hillelwayne/archive/three-ways-formal...
- Breaking “provably correct” Leftpad : https://news.ycombinator.com/item?id=45492274 : https://lukeplant.me.uk/blog/posts/breaking-provably-correct...
- what does a lean proof prove? : https://news.ycombinator.com/item?id=46286605 : https://supaiku.com/what-does-a-lean-proof-prove
> Their real limitation continues to be converging in realistic timescales and efficient search over all combinations of proof tactics.
I agree "efficient search over all combination of proof tactics" will be a defining moment in the field. I disagree these LLMs are it.