|
HN seems to have eaten my first reply, so here's an abbreviated version. It's easy to separate specification from verification with dependent types. You can (and I did) show students how to take ordinary simply-typed functional programs, give a specification as a different type, and then do a verification by producing a term of the appropriate specification type. Then, you can use this to motivate intrinsically-typed strategies, and (a) show the students that intrinsic/extrinsic is a gradient rather than a dichotomy, and (b) show them how to exploit this gradient systematically. As for liveness, in a purely functional language, basically the only interesting liveness property is termination, and this didn't pose a challenge for the students. Indeed, it was the opposite -- I got a lot of questions about why Haskell lacked a termination checker. (They had in their minds that termination checking was an impossible dream, and so seeing a system with a termination checker that was both usable and useful really fired their imaginations.) Specs were still hard. A proof-based approach is actually desirable for teaching good spec-writing skills, because proofs are more sensitive to the details of the spec. Eg, property-based testcase generation is happy with the equation that the reverse of the reverse of a list is the same list, but it's too weak an induction hypothesis for a proof. So it strongly motivates you to get to the characterizing equation for reversal (that reversing the concatenation of two lists is the concatenation of their reverses in the opposite order). (And it's still good for testcase generation.) Unfortunately, I don't know enough small, self-contained examples where you have to strengthen the induction hypothesis to get the proof to go through. Ideally I'd want about 60 or 70 good examples of this (to set problems for a course) and, alas, I'd be hard-pressed to come up with a tenth that many satisfying examples. |
Sure, but disentangling the program from the spec and the proof is not quite what I meant. By separating specification from verification I mean using the ability to specify (which is a much more important skill), completely separately from how that specification is verified, whether by proof or by other, weaker means.
> Eg, property-based testcase generation is happy with the equation that the reverse of the reverse of a list is the same list, but it's too weak an induction hypothesis for a proof.
Ah, but that's exactly the benefit you gain from separating specification from verification. To write a proof of an algorithm in TLA+ you indeed need to find an inductive invariant. But in real life, when specifying large and complex systems (as I've experienced myself, and as reported by Amazon), finding an inductive invariant can be really, really hard. However, that does not make the specification useless, because it can still be checked with a model-checker. Indeed, most of the time, writing proofs of real-world programs is neither affordable nor required. So, I agree that teaching writing proofs is valuable, but it is no doubt secondary to teaching how to write specs. Being able to separate the two makes teaching and learning easier, and allows to focus on the more important skill first.
It's been reported by Amazon (and my experience was the same), that even an entry-level engineer can learn TLA+ and put it to good use in roughly two weeks, without any help outside the many tutorials. Those weeks are spent entirely on learning mathematical thinking in the context of computation, and make use only of the model-checker. After two weeks of learning TLA+, you're well into specifying complex, real-life systems, while after two weeks of learning, say, Lean, you're barely beyond learning the basics of quantification. Then, those who want can dive more deeply into writing proofs, but even there, the model-checker helps. For example, the model-checker can verify that your inductive invariant is, indeed, an inductive invariant (see below). In fact, Lamport's tutorial (the "hyperbook") is divided into a "principles track", a "specification track" and a "proof track". Inductive invariants are taught in the principles track because, despite being necessary for a proof, they are very useful for insight (which you can get even without a formal proof) as they form the core of why the algorithm works.
> Unfortunately, I don't know enough small, self-contained examples where you have to strengthen the induction hypothesis to get the proof to go through. Ideally I'd want about 60 or 70 good examples of this (to set problems for a course) and, alas, I'd be hard-pressed to come up with a tenth that many satisfying examples.
Yep, coming up with examples is very hard :) But TLA+ makes it easy to get an intuitive feel for the problem of the inductive invariant. In TLA+, the inductive invariant can be formally expressed generally. For a specification written in the "normal form" of:
Where `Init` specifies the initial condition, `Next` specifies the next-state relation (not a function!), and `Liveness` is the fairness condition which is only necessary if you care about liveness, an inductive invariant `I` of a correctness property `P` (i.e., we want to show `Spec ⇒ P`) is a predicate such that: (The logic contains the inference rule: `A ⇒ I, I ∧ B ⇒ I' ⊦ A ∧ □[B] ⇒ □I`)The last conjunct (which states that if `I` holds in the current state, then any legal step would make `I` hold in the next) is the tricky part. Looking at it put in this way clearly shows the difficulty: a weak `I` may encompass too many states, many of which are unreachable, and so might not be an inductive invariant even though `P` is true. It is possible to check the inductive invariant in the model checker, by writing the specification, `I ∧ □[Next]` and checking whether it implies `I`. This, again, gives an intuition, as now `I` specifies the initial condition, and you can actually see (through the generated counterexamples) those "initial" states specified by `I` that are not reachable when starting from `Init`, and so you can see the weakness of the invariant in a very concrete way.
BTW, could you perhaps explain to me what kind of object `id` in Lean is? The problem is that there is no universe that contains all universes, so does that mean that `id` is merely syntax used to generate objects when types can be inferred, or does it represent some actual object in the theory?