|
> What is the most complex property that has been established in TLA fully automatically with no hand tweaking etc for more than 50,000 LoCs? And how does that complexity compare to for example the complexity of the formula that has been used in the verification of SeL4? TLC, a model checker for TLA+, is commonly used to verify arbitrary properties -- at least as "deep" as those used in seL4, and often deeper -- of TLA+ specifications of a few thousand lines. Those properties can informally correspond to codebases with hundreds of thousands of lines, but there is currently no method for formally verifying arbitrary properties of 50KLOC, be it automatic or manual. Automated methods, however, are used either for deep properties (quantifier alterations) of few k-lines of spec/code or shallower (though still non-local and non-compositional) properties of 50-100KLOC. Manual deductive methods are virtually used for nothing outside of research projects/teams, except for local, compositional properties, of the kind expressible by simple type systems. I clarified further here: https://news.ycombinator.com/item?id=22146186 > So first-order logic is not deductive, because it doesn't yield a direct proof of FALSE? In any logic you can work deductively in the proof theory, or in the model theory. DPLL SAT solvers work in the model theory of first-order logic. Also, SAT solvers work on propositional logic, not first-order logic. Because of completeness, we know that every model-theoretic proof corresponds to a proof-theoretic proof, but that's not how most SAT algorithms work (except for resolution solvers). > There is an interesting question here: what is the precise meaning of "deductive"? It means using the syntactic inference rules of a proof theory, rather than the semantic rules of a model theory (possibly relying on completeness meta-theorems, as in the case of propositional logic, although that's not relevant for most interesting first-order theories). SMT solvers employ SAT, and can work on some first-order theories. I am not familiar with all the techniques SMT solvers use, but I believe they employ both proof-theoretic and model-theoretic techniques. In any event, as I mentioned before, SMT solvers are rarely used alone, because they're very limited. They are used as automated helpers for specific steps in proof assistants (TLAPS, the TLA+ proof assistant makes heavy use of SMT solvers for proof steps), for local properties of code (e.g. OpenJML for Java, in Dafny and in SPARK) or as components in CEGAR model checkers and in concolic test systems. |
One problem with model-checking is that you handle loops by replacing loops with approximations (unrolling the loop a few times) and in effect only verifying properties that are not affected by such approximations. In other words extremely shallow properties. (You may use different words, like "timeout" or "unsound technique" but from a logical POV, it's all the same ...)
All mathematics is deductive. ZFC is a deductive theory, HOL is a deductive theory, HoTT is a deductive theory. MLTT is a deductive theory, Quine's NF is a deductive theory.With that, what mathematicians call a model is but another deductive theory, e.g. the model theory of Peano Arithmetic happens in ZFC, another deductive theory. The deductive/non-deductive distinction is really a discussion of different kinds of algorithms. Deduction somehow involves building up proof trees from axioms and rules, using unification. It could be fair to say that concrete DPLL implementations (as opposed to textbook presentations) that are based on model enumeration, non-chronological backtracking, random restarts, clause learning, watched literals etc don't quite fit this algorithmic pattern. I am not sure exactly how to delineate deductive from non-deductive algorithms, that's why I think it's an interesting question.
I agree, but model checkers, type checkers for dependent types , modern testing technques, and (interactive) provers all tend to off-load at least parts of their work to SAT/SMT solvers which makes the opposition between deductive and non-deductive methods unclear. BTW I am not arguing against fuzzing, concolic, model checking testing etc. All I'm saying is that they too have scalability limits, just that the scale involved here is not lines of code.