|
Software gets more complicated, and so bugs become more common. Hardware gets faster, so speed pressures are reduced on software, so it gets more layered and hence slower; and also more complicated. More layers of abstraction means more things to get wrong, so correctness becomes a serious problem. The world is increasingly reliant on software, so correctness becomes a serious concern. That sums up the "software crisis" as it relates to correctness. Most theorem provers have been swept up in this. They are written in high level languages and frameworks made by people who paid no thought to verification and only slightly more than most to correctness. What's more, even some theorem prover languages have dubious correctness or semantics; Rust and Dafny are both languages that make a big deal about formal correctness but have no formal spec. (Rust is working on it.) Coq is not known to be consistent (the literature is full of "here is a proof that some subsystem of Coq has this and that property" but no one can handle the whole thing), not to mention that it has had a few soundness bugs in its history, one of which (Reals violate EM) wasn't even clearly a bug because no one knew what the model was supposed to be. Agda is a free for all as far as I can tell, and Isabelle has some strange thing with axiomatic typeclasses that takes them out of the HOL world. HOL Light is pretty decent metatheory-wise, and the logic at least was formalized inside itself, but the implementation has not, and OCaml's Obj.magic is still there, ready to cause inconsistency if you feel like using it. Lean has not had any soundness bugs to date, and the theory is well understood, but it is a massive C++ application and there are doubtless dozens of implementation bugs in there waiting to be found. None of the big theorem provers are really appropriate for trying to prove correct, because either it is not known what this even means or it's trivially false and we all try to pretend otherwise, or it's an intractable open question. These are the paragons of correctness in our world. Can we do better? What does it even mean to do better? This paper aims to find out. |
(btw I couldn't find the source code for the proof checker in the paper, is it available?)