Hacker News new | ask | show | jobs
by seanwilson 320 days ago
Human readable (informal) proofs are full of gaps that all have to be traced back to axioms e.g. gaps that rely on shared intuition, background knowledge and other informal proofs.

It's somewhat like taking rough pseudo code (the informal proof, a mixture of maths and English) and translating that into a bullet-proof production app (the formal proof, in Lean), where you're going to have to specify every step precisely traced back to axioms, handle all the edge causes, fix incorrect assumptions, and fill in the missing parts that were assumed to be straightforward but might not be.

A major part is you also have to formalise all the proofs your informal proof relied on so everything is traced back to the initial axioms e.g. you can't just cite Pythagorus theorem, you have to formalise that too.

So it's an order of magnitude more difficult to write a formal proof compared to an informal one, and even when you have the informal proof it can teams many years of effort.

2 comments

I’m almost certain this is ignorance on my part, but it seems like this would mean the proof is… possibly wrong? I mean if there are gaps and other informal proofs in there?

But I thought it was a widely celebrated result.

Kevin Buzzard told me that the worry that it might in fact be wrong is a huge motivator for him. I also once asked Serge Lang why so much mathematics is correct (which surprised me coming from programming where everything has bugs), and he said; “people do a large number of consistency checks beyond what is in the published proofs, which makes the chances the claimed results are correct much, much higher.” Another related quote Bryan Birch told me once: “it is always a good idea to prove true theorems.”
I suspect he is massively overestimating the reliability of obscure mathematics.
> Kevin Buzzard told me that the worry that it might in fact be wrong is a huge motivator for him.

And then, when I raised concerns in Zulip about Lean's metaprogramming facilities being used to trick the pipeline into accepting false proofs, he said the opposite. He even emphasized that the formalization efforts are not for checking proof correctness, but for cataloguing truths we believe in.

This kind of equivocation turned me away from that community, to be honest. That was an extremely frustrating experience.

For what it's worth, I don't think that Kevin Buzzard is the person you should talk to if you are interested in proof assistant design. As far as I know, Buzzard does not consider himself to be an expert in type theory or in proof assistants, and claims to be a mere user.
> “people do a large number of consistency checks beyond what is in the published proofs, which makes the chances the claimed results are correct much, much higher.”

I imagine one bias is because formal verification is such a huge effort, you're only going to do it for really interesting and impactful proofs, which means the proofs that get formal verified will already have been reviewed and critiqued a lot, and will be less likely to have glaring critical errors.

> I’m almost certain this is ignorance on my part, but it seems like this would mean the proof is… possibly wrong?

That's part of the motivation to formalise it. When a proof gets really long and complex, relies on lots of other complex proofs, and there's barely any single expert who has enough knowledge to understand all the branches of maths it covers, there's more chance there's a mistake.

There's a few examples here of errors/gaps/flaws found while formalizing proofs that were patched up:

https://mathoverflow.net/questions/291158/proofs-shown-to-be...

My understanding is it's common to find a few problems that then need to be patched or worked around. It's a little like wanting to know if a huge codebase is bug-free or not: you might find some bugs if you formalized the code, but you can probably fix the bugs during the process because it's generally correct. There can be cases where it's not fixable though.

I think it was Terrance Tao on the Lex Friedman podcast recently that said that there are very often little mistakes in big proofs, but they are almost always able to be patched around. Its like mathematicians' intuition is tracking some underlying reality and the actual formalization is flexible. Yes sometimes digging down into a small mistake leads to an unbridgeable gap and that route has to be abandoned, but uncannily often such issues have nearby solutions.
Also a lot of errors would be called "typos", not errors. Such as some edge cases missing in the theorem statement which technically makes the theorem false. As long as there's a similar theorem in the same spirit that can be proven, that's what the original was all along.
The level of rigor used in math if sometimes characterized as "sufficient to convince other mathematicians of correctness." So, yeah possibly, but not in a willy nilly way. It's not a proof sketch, it's a proof. It just isn't written in human language designed for communication.
Yes and when it was first published it was wrong (made leap of logic).

It takes thorough review by advanced mathematicians to verify correctness.

This is not unlike a code review.

Most people vastly underestimate how complex and esoteric modern research mathematics are.

The thing is though that MANY bugs slip through even the most thorough code reviews. As a security researcher, I can tell you there is literally no system out there that doesn't have such a bug in it.

The systems we deal with in software are massive compared with your typical mathematical framework though. But FLT is probably on similar scope.

> It's somewhat like taking rough pseudo code and translating that into a bullet-proof production app

That's actually where LLMs are already quite good at.

/s