|
First of all, I would agree with you that full formalization of program correctness using dependent types is generally extremely costly in terms of time and money, and that there are often other formal methods which might get you results that are "good enough" while requiring much less effort and expertise. However, I do think that there are some inaccuracies in some of the statements you are making about proof assistants based on dependent types. One major benefit that proof assistants have over other formal methods is that they can be used to prove essentially arbitrary properties of programs, which is not true of any automated methods. Basically, if we as humans are able to convince ourselves that a program has a certain property, then we could also show this using a (dependently-typed) proof assistant. (Yes, theoretically all proof systems have limitations due to Gödel’s incompleteness theorem, but this limitation applies just the same to ALL formal methods. Furthermore, you’re not that likely to run into it in practice unless you are doing metatheory, and even there many interesting statements CAN still be shown.) Consider the formally verified C compiler CompCert, where the assembly output of the compiler has been proven to have the same behavior as the input source code. A proof like this simply cannot be done using automated methods currently, and requires human assistance. This kind of correctness proof is also highly non-compositional, since the correctness proof of a compiler does not mirror the structure of the compiler very closely. Hence this also invalidates your point about dependent types not being able to handle non-compositional properties very well. There are plenty of other examples, like the seL4 microkernel. I also do not understand what bearing computational complexity results have on humans manually formalizing their informal intuitions about program correctness in a proof assistant. Humans do not work on arbitrary problems where they have no intuition at all about why a program is correct, hence referring to worst-case computational complexity results does not make much sense here. In fact, if you have no intuition about why a program works, then how did you manage to write it in the first place? You surely did not write random garbage in the hopes that you would get a working program. You might say that there are programs you can check using a model checker, but which humans cannot reasonably keep in their heads or form an intuition about, and therefore these proofs cannot formalized in a proof assistant. But what you can do is implement a model checker in your proof assistant and then show that the model checker is correct. After you've done that, you can simply use the model checker to do your proofs for you. And of course it should be possible to formalize the model checker itself, since the people that wrote the model checker must have some intuition about why it works and produces the correct results.
In this sense, proof assistants subsume all other formal methods, since they can be "scripted" to include them. Furthermore, it is generally not the case that you have to rewrite your program in case you want to prove some more properties about it. You may be thinking of a programming style where the type of a program encodes all the properties you are interested in, but you can avoid this style where necessary, by separating your program and its correctness proof. |
It is true of model checkers.
> There are plenty of other examples, like the seL4 microkernel.
Yep, there are plenty of such examples, I'd say 2 or 4, and they're all less than 10KLOC, i.e. 1/5 of jQuery.
> I also do not understand what bearing computational complexity results have on humans manually formalizing their informal intuitions about program correctness in a proof assistant.
Because humans can't outperform Turing machines.
> Humans do not work on arbitrary problems where they have no intuition at all about why a program is correct, hence referring to worst-case computational complexity results does not make much sense here.
For one, the results aren't just worst-case. One of the most important ones is on parameterized complexity. Second, problems that are very hard in the worst case have rarely been proven easy in the "real world" case, unless the real world case was somehow categorized in a neat subset, and this isn't the case here. It's possible that we're usually far from the worst case, but if so, automated methods could also exploit that, and still be better than manual ones in most cases.
> In fact, if you have no intuition about why a program works, then how did you manage to write it in the first place? You surely did not write random garbage in the hopes that you would get a working program.
This is a common assumption, but a wrong one: https://pron.github.io/posts/people-dont-write-programs
> In this sense, proof assistants subsume all other formal methods, since they can be "scripted" to include them.
The same can be said about all other formal methods. Indeed, various sound static analysis tools like Frama-C or optionally fall back to manual proofs when needed. In the end, though, in practice deductive proofs don't scale as well as other methods. It's a good tool to have in your toolbox, but you don't want it as your default. Your model checker could just as easily ask for help from a proof assistant. So it's not the case that any one method subsumes all others, but you could combine them. Which is why dependent types are problematic, as they make these combinations harder than, say, contract systems, that are flexible in the choice of verification method.
> You may be thinking of a programming style where the type of a program encodes all the properties you are interested in, but you can avoid this style where necessary, by separating your program and its correctness proof.
True, but that's not the style discussed here, of dependent Haskell or Idris.