Hacker News new | ask | show | jobs
by Cladode 2335 days ago

   better scalability than 
   deductive
This is misleading. There are two notions of scalability:

- Scalable to large code bases.

- Scalable to deep properties.

Deductive methods are currently the only ones that scale to deep properties. Model checking et al are currently the much better at scaling to large code bases. Note however two things: First, Facebook's infer tool which scales quite well to large code bases, is partly based on deductive methods. Secondly, and most importantly, under the hood in most current approaches, SAT/SMT solvers do much of the heavy lifting. Existing SAT/SMT solvers are invariably based on DPLL, i.e. resolution, i.e. a deductive method.

1 comments

> Deductive methods are currently the only ones that scale to deep properties.

Model checkers check deep properties with overall significantly less effort. It is true that they don't always succeed, but neither do deductive proofs. In industry, deductive methods are used as a last resort, when they're used at all, to tie together results from automated techniques.

> Note however two things: ...

That's not what I meant by "deductive methods;" I meant verification by semi-automated deductive proofs. Also, SMT solvers aren't used in isolation to verify significant pieces of software, they're used as components of model checkers, concolic testing and proof assistants (or to verify local properties of code units), so they're used in both semi-automated deductive methods as well as in automated methods. I also believe that DPLL is based on assignment and backtracking, not resolution.

   Model checkers check deep 
Which deep properties have you got in mind?

   DPLL is based
DPLL is based on a form of resolution, in real implementations you mostly simply enumerate models, and backtrack (maybe with some learning) if you decided to abandon a specific model.
> Which deep properties have you got in mind?

TLC is a model checker can check most properties expressible in TLA+, which are more expressive than any base lambda-calculus theory (e.g. it includes temporal refinement properties). See my overview on TLA+'s theory: https://pron.github.io/tlaplus

> in real implementations you mostly simply enumerate models, and backtrack (maybe with some learning) if you decided to abandon a specific model

So not deductive. You know that because DPLL does not directly yield a proof of UNSAT.

   can check most properties 
   expressible in TLA+
Lamport's TLA contains ZF set theory. That makes TLA super expressive. Unless a major breakthrough has happened in logic that I have not been informed about, model checkers cannot verify complex TLA properties for large code bases fully autom atically. Let's be quantitative in our two dimensions of scalability:

- Assume we 'measure' the complexity of a property by the number of quantifier alternations in the logical formula.

- Assume we 'measure' the complexity of a program by lines of code.

(Yes, both measures are simplistic.) What is the most complex property that has been established in TLA fully automatically with no hand tweaking etc for more than 50,000 LoCs? And how does that complexity compare to for example the complexity of the formula that has been used in the verification of SeL4?

   So not deductive. 
So first-order logic is not deductive, because it doesn't yield a direct proof of FALSE?

There is an interesting question here: what is the precise meaning of "deductive"? Informally it means: building proof trees by instantiating axiom and rule scheme. But that is vague. What does that mean exactly? Are modern SAT/SMT solvers doing this? The field of proof complexity thinks of (propositional) proof systems simply as poly-time functions onto propositional tautologies.

> What is the most complex property that has been established in TLA fully automatically with no hand tweaking etc for more than 50,000 LoCs? And how does that complexity compare to for example the complexity of the formula that has been used in the verification of SeL4?

TLC, a model checker for TLA+, is commonly used to verify arbitrary properties -- at least as "deep" as those used in seL4, and often deeper -- of TLA+ specifications of a few thousand lines. Those properties can informally correspond to codebases with hundreds of thousands of lines, but there is currently no method for formally verifying arbitrary properties of 50KLOC, be it automatic or manual. Automated methods, however, are used either for deep properties (quantifier alterations) of few k-lines of spec/code or shallower (though still non-local and non-compositional) properties of 50-100KLOC. Manual deductive methods are virtually used for nothing outside of research projects/teams, except for local, compositional properties, of the kind expressible by simple type systems.

I clarified further here: https://news.ycombinator.com/item?id=22146186

> So first-order logic is not deductive, because it doesn't yield a direct proof of FALSE?

In any logic you can work deductively in the proof theory, or in the model theory. DPLL SAT solvers work in the model theory of first-order logic. Also, SAT solvers work on propositional logic, not first-order logic. Because of completeness, we know that every model-theoretic proof corresponds to a proof-theoretic proof, but that's not how most SAT algorithms work (except for resolution solvers).

> There is an interesting question here: what is the precise meaning of "deductive"?

It means using the syntactic inference rules of a proof theory, rather than the semantic rules of a model theory (possibly relying on completeness meta-theorems, as in the case of propositional logic, although that's not relevant for most interesting first-order theories). SMT solvers employ SAT, and can work on some first-order theories. I am not familiar with all the techniques SMT solvers use, but I believe they employ both proof-theoretic and model-theoretic techniques. In any event, as I mentioned before, SMT solvers are rarely used alone, because they're very limited. They are used as automated helpers for specific steps in proof assistants (TLAPS, the TLA+ proof assistant makes heavy use of SMT solvers for proof steps), for local properties of code (e.g. OpenJML for Java, in Dafny and in SPARK) or as components in CEGAR model checkers and in concolic test systems.

   TLC ... is commonly used to ... 
   at least as "deep" as those 
   used in seL4, and often deeper
What you are in essence implying here is that the SeL4 verification can be handled fully automatically by TLC. I do not believe that without a demonstration ... and please don't forget to collect your Turing Award!

One problem with model-checking is that you handle loops by replacing loops with approximations (unrolling the loop a few times) and in effect only verifying properties that are not affected by such approximations. In other words extremely shallow properties. (You may use different words, like "timeout" or "unsound technique" but from a logical POV, it's all the same ...)

    the model theory ...
    rather than the semantic 
    rules of a model theory
All mathematics is deductive. ZFC is a deductive theory, HOL is a deductive theory, HoTT is a deductive theory. MLTT is a deductive theory, Quine's NF is a deductive theory.

With that, what mathematicians call a model is but another deductive theory, e.g. the model theory of Peano Arithmetic happens in ZFC, another deductive theory. The deductive/non-deductive distinction is really a discussion of different kinds of algorithms. Deduction somehow involves building up proof trees from axioms and rules, using unification. It could be fair to say that concrete DPLL implementations (as opposed to textbook presentations) that are based on model enumeration, non-chronological backtracking, random restarts, clause learning, watched literals etc don't quite fit this algorithmic pattern. I am not sure exactly how to delineate deductive from non-deductive algorithms, that's why I think it's an interesting question.

   SMT solvers are rarely used alone,
I agree, but model checkers, type checkers for dependent types , modern testing technques, and (interactive) provers all tend to off-load at least parts of their work to SAT/SMT solvers which makes the opposition between deductive and non-deductive methods unclear.

         * * * 
BTW I am not arguing against fuzzing, concolic, model checking testing etc. All I'm saying is that they too have scalability limits, just that the scale involved here is not lines of code.