| > One major benefit that proof assistants have over other formal methods is that they can be used to prove essentially arbitrary properties of programs, which is not true of any automated methods. It is true of model checkers. > There are plenty of other examples, like the seL4 microkernel. Yep, there are plenty of such examples, I'd say 2 or 4, and they're all less than 10KLOC, i.e. 1/5 of jQuery. > I also do not understand what bearing computational complexity results have on humans manually formalizing their informal intuitions about program correctness in a proof assistant. Because humans can't outperform Turing machines. > Humans do not work on arbitrary problems where they have no intuition at all about why a program is correct, hence referring to worst-case computational complexity results does not make much sense here. For one, the results aren't just worst-case. One of the most important ones is on parameterized complexity. Second, problems that are very hard in the worst case have rarely been proven easy in the "real world" case, unless the real world case was somehow categorized in a neat subset, and this isn't the case here. It's possible that we're usually far from the worst case, but if so, automated methods could also exploit that, and still be better than manual ones in most cases. > In fact, if you have no intuition about why a program works, then how did you manage to write it in the first place? You surely did not write random garbage in the hopes that you would get a working program. This is a common assumption, but a wrong one: https://pron.github.io/posts/people-dont-write-programs > In this sense, proof assistants subsume all other formal methods, since they can be "scripted" to include them. The same can be said about all other formal methods. Indeed, various sound static analysis tools like Frama-C or optionally fall back to manual proofs when needed. In the end, though, in practice deductive proofs don't scale as well as other methods. It's a good tool to have in your toolbox, but you don't want it as your default. Your model checker could just as easily ask for help from a proof assistant. So it's not the case that any one method subsumes all others, but you could combine them. Which is why dependent types are problematic, as they make these combinations harder than, say, contract systems, that are flexible in the choice of verification method. > You may be thinking of a programming style where the type of a program encodes all the properties you are interested in, but you can avoid this style where necessary, by separating your program and its correctness proof. True, but that's not the style discussed here, of dependent Haskell or Idris. |
Humans regularly do "outperform Turing machines." Most of the complicated mathematical proofs that have been produced by humans, cannot currently be automatically produced using an algorithm in reasonable time. Pick almost any non-trivial mathematical theorem, and you would be hard pressed to write a program that could prove it in a reasonable amount of time given just a list of the axioms. In theory, humans might not be able to outperform TMs, but in practice they regularly do, as attested to by the large amounts of mathematical proofs that we have produced by hand.
Model checkers also outperform humans for some tasks, but then again this is because human intuition has been used to write the model checker in the first place. My point is that using theoretical computer science results to talk about the difficulty of finding mathematical proofs in real-world cases is a bit of a non-sequitur. Yes, the Linux kernel would be practically impossible to verify completely using semi-automated tools, because it would take too much time and effort. But is this because of some theoretical result about the model checking problem? I don't think so.
One difference between having a model checker call a proof assistant and using a proof assistant to implement a verified model checker is that if you write a verified model checker implementation, you can be almost 100% certain that it's bug-free, because it creates proof terms that can be independently checked by the (ideally small and simple) proof kernel of the proof assistant. You don't have to trust that the model checker has been implemented correctly. This difference does not matter that much in practice, however, since model checker implementations probably receive a lot of testing and are quite trustworthy.
You're right that I am mostly talking about Coq, and that a different style tends to be used in Haskell/Idris. Note that dependent Haskell is actually logically inconsistent for backwards-compatibility reasons anyway, so there's not much to lose there by cutting a few corners for pragmatic reasons. It is not necessary to verify every single part of the program. Dependent types in Haskell gives you the freedom to do more. You can choose not to use that freedom where it doesn't make sense.