Hacker News new | ask | show | jobs
by pron 3119 days ago
> Agda-the-proof-assistant was hard to teach, mostly AFAICT because I don't know how to teach having mathematical ideas well.

One of the (several) problems I find with dependently-typed languages is that they don't easily allow you to separate specification from verification [1]. When using a language like TLA+, it's very easy to separate the two. I actually think it's easier to teach mathematical thinking with TLA+ than without any formal language at all, as you can get feedback as to what has gone wrong, i.e, you can "play" with the math, and use the model checker before writing a proof to get a counterexample. I agree that teaching how to think mathematically is still required and still hard, but there's much less accidental complexity in the way.

Another advantage is that complex concepts are only encountered later [2] (e.g., you can cleanly separate safety from the much more complex liveness). The only downside (for teaching) is that the proofs are declarative, so you don't get to teach actual syntactic manipulation of terms (which is, arguably, not important unless you want to teach low-level proofs). Of course, the computational theory is TLA rather than lambda calculus, so it's irrelevant if you want to teach that formal system in particular. On the other hand, advanced and very powerful concepts such as refinement are very elegant and relatively simple.

[1]: Another is that proof is pretty much the only form of verification (of specification given as types), which is both extremely costly and very rarely required, and yet another is that complicated matters are encountered very early on.

[2] I've been playing with Lean for some weeks, and I still don't understand what object `id` (the identity "function", polymorphic over all types in all universes) is; all I know is that it's not an actual function. I'm not sure whether it's the same or not in Agda.

4 comments

HN seems to have eaten my first reply, so here's an abbreviated version.

It's easy to separate specification from verification with dependent types. You can (and I did) show students how to take ordinary simply-typed functional programs, give a specification as a different type, and then do a verification by producing a term of the appropriate specification type. Then, you can use this to motivate intrinsically-typed strategies, and (a) show the students that intrinsic/extrinsic is a gradient rather than a dichotomy, and (b) show them how to exploit this gradient systematically.

As for liveness, in a purely functional language, basically the only interesting liveness property is termination, and this didn't pose a challenge for the students. Indeed, it was the opposite -- I got a lot of questions about why Haskell lacked a termination checker. (They had in their minds that termination checking was an impossible dream, and so seeing a system with a termination checker that was both usable and useful really fired their imaginations.)

Specs were still hard. A proof-based approach is actually desirable for teaching good spec-writing skills, because proofs are more sensitive to the details of the spec. Eg, property-based testcase generation is happy with the equation that the reverse of the reverse of a list is the same list, but it's too weak an induction hypothesis for a proof. So it strongly motivates you to get to the characterizing equation for reversal (that reversing the concatenation of two lists is the concatenation of their reverses in the opposite order). (And it's still good for testcase generation.)

Unfortunately, I don't know enough small, self-contained examples where you have to strengthen the induction hypothesis to get the proof to go through. Ideally I'd want about 60 or 70 good examples of this (to set problems for a course) and, alas, I'd be hard-pressed to come up with a tenth that many satisfying examples.

> You can (and I did) show students how to take ordinary simply-typed functional programs, give a specification as a different type, and then do a verification by producing a term of the appropriate specification type.

Sure, but disentangling the program from the spec and the proof is not quite what I meant. By separating specification from verification I mean using the ability to specify (which is a much more important skill), completely separately from how that specification is verified, whether by proof or by other, weaker means.

> Eg, property-based testcase generation is happy with the equation that the reverse of the reverse of a list is the same list, but it's too weak an induction hypothesis for a proof.

Ah, but that's exactly the benefit you gain from separating specification from verification. To write a proof of an algorithm in TLA+ you indeed need to find an inductive invariant. But in real life, when specifying large and complex systems (as I've experienced myself, and as reported by Amazon), finding an inductive invariant can be really, really hard. However, that does not make the specification useless, because it can still be checked with a model-checker. Indeed, most of the time, writing proofs of real-world programs is neither affordable nor required. So, I agree that teaching writing proofs is valuable, but it is no doubt secondary to teaching how to write specs. Being able to separate the two makes teaching and learning easier, and allows to focus on the more important skill first.

It's been reported by Amazon (and my experience was the same), that even an entry-level engineer can learn TLA+ and put it to good use in roughly two weeks, without any help outside the many tutorials. Those weeks are spent entirely on learning mathematical thinking in the context of computation, and make use only of the model-checker. After two weeks of learning TLA+, you're well into specifying complex, real-life systems, while after two weeks of learning, say, Lean, you're barely beyond learning the basics of quantification. Then, those who want can dive more deeply into writing proofs, but even there, the model-checker helps. For example, the model-checker can verify that your inductive invariant is, indeed, an inductive invariant (see below). In fact, Lamport's tutorial (the "hyperbook") is divided into a "principles track", a "specification track" and a "proof track". Inductive invariants are taught in the principles track because, despite being necessary for a proof, they are very useful for insight (which you can get even without a formal proof) as they form the core of why the algorithm works.

> Unfortunately, I don't know enough small, self-contained examples where you have to strengthen the induction hypothesis to get the proof to go through. Ideally I'd want about 60 or 70 good examples of this (to set problems for a course) and, alas, I'd be hard-pressed to come up with a tenth that many satisfying examples.

Yep, coming up with examples is very hard :) But TLA+ makes it easy to get an intuitive feel for the problem of the inductive invariant. In TLA+, the inductive invariant can be formally expressed generally. For a specification written in the "normal form" of:

    Spec ≜ Init ∧ □[Next] ∧ Liveness
Where `Init` specifies the initial condition, `Next` specifies the next-state relation (not a function!), and `Liveness` is the fairness condition which is only necessary if you care about liveness, an inductive invariant `I` of a correctness property `P` (i.e., we want to show `Spec ⇒ P`) is a predicate such that:

      I ⇒ P
    ∧ Init ⇒ I
    ∧ I ∧ Next ⇒ I'
(The logic contains the inference rule: `A ⇒ I, I ∧ B ⇒ I' ⊦ A ∧ □[B] ⇒ □I`)

The last conjunct (which states that if `I` holds in the current state, then any legal step would make `I` hold in the next) is the tricky part. Looking at it put in this way clearly shows the difficulty: a weak `I` may encompass too many states, many of which are unreachable, and so might not be an inductive invariant even though `P` is true. It is possible to check the inductive invariant in the model checker, by writing the specification, `I ∧ □[Next]` and checking whether it implies `I`. This, again, gives an intuition, as now `I` specifies the initial condition, and you can actually see (through the generated counterexamples) those "initial" states specified by `I` that are not reachable when starting from `Init`, and so you can see the weakness of the invariant in a very concrete way.

BTW, could you perhaps explain to me what kind of object `id` in Lean is? The problem is that there is no universe that contains all universes, so does that mean that `id` is merely syntax used to generate objects when types can be inferred, or does it represent some actual object in the theory?

I haven't used Lean, but looking at its documentation, it supports universe polymorphism, very much like Agda. This means that there are countably many named universe levels, with each numbered universe containing the universe below it.

All of these numbered universes live inside of a top-most universe which contains all the types and universes. You can't quantify over the topmost universe, for exactly the same reason you can't form the set of all sets in ordinary set theory.

If you do pencil-and-paper mathematics particularly pedantically, when you define something like id, you will say you are working in Tarski-Grothendieck set theory, and then be careful to call it a "definition schema", rather than a "definition", because you are "really" giving a family of definitions, once for each Grothendieck universe. Distinguishing schemas and definitions is a little inconvenient on a computer -- the naive implementation forces you to implement everything twice -- and so one extra top universe is introduced as a home for all of the smaller universes. (Though Mizar apparently chooses to support universe-generic definitions, and to explain them to users as schemas.)

IME, these issues are generally ignorable, both on paper and on computers. Harper and Pollack gave an algorithm for inferring universe levels in the early 90s, and apparently it worked so well that they were worried their implementation was wrong (ie, would always say OK). So they carefully coded up various set-theoretic paradoxes in type theory and were relieved to discover their implementation rejected them.

OK, so `id` is basically syntax (or access to the meta-language) for a schema. Thanks for explaining that! In Lean you can certainly feel this sometimes, as you can't even write the proposition `id = id` or `id id = id`. I guess there is no escaping such mechanisms in consistent formalisms, but I've found myself running against such subtleties much more often in Lean than in TLA+, and certainly far sooner (giving me a persistent feeling that I don't really get it). For example, the entire universe hierarchy is something to be aware of from the start (even though Prop is exempted from it), and is completely accidental complexity (necessitated by the chosen formalism, but still) from the perspective of someone just interested in specifying and verifying programs. I think that in most, though not all, cases, TLA+ has a somewhat better story in that regard. For example, definitions parameterized over the domain of the entire universe are syntactically distinct from functions (even in the use-site), and access to the meta-language, required in order to universally quantify over formulas when writing theorems/axioms that serve as inference rules (e.g. well-founded induction), is done in a way that's syntactically distinct from formulas.

When playing with Lean (and I guess the feeling may go away if I ever invest enough time to become proficient in it) I feel like fighting the formalism, which works hard to maintain consistency while having formulas represent "intensional" objects (meaning objects other than true/false), which is a feature that may be very interesting to logicians, but as a practitioner, it feels like paying a high price for something I absolutely don't need. HOL should be simpler, but the Isabelle/HOL tutorials are seriously lacking, and the whole quoting system there is a lot of accidental complexity that I didn't want to bother with.

> separate safety from the much more complex liveness

True, and with program logics, you can choose partial correctness vs total vs generalised correctness.

Itt's even worse in practise. With many contemporary dependently typed languages you'll have to worry about termination even during programming, even before thinking about verification.

The Curry-Howard prover advocates seem to regard this as a solved problem ("just distinguish programs from proofs by way of a typing system, problem solved") but has this solution been implemented in any mature systems as of 2017?

> use the model checker before writing a proof to get a counterexample.

Lovely, they should have taught me algebraic geometry that way. /s

Trying to write proofs without knowing a priori the truth value of the proposition to be proved builds character, and it's an important part of developing mathematical maturity. It forces you to develop intuition for what could or could not be true.

Programmers on a deadline don't want to spend months building character on every software component.
Replace programmers with mathematicians who already have character and intellectual integrity.
> to separate specification from verification

Exactly.

Program logics are superior in this regard: the encourage rather than inhibit separation of concerns. (Not to mention: program logics are more developed and cover a large class of computing paradigms, while dependent types have not grown much beyond the pure functions ghetto.)

I put forward this point to dependent types luminaries all the time. Its difficult to get more than polite silence in response.

      * * *
It's partly a social problem: The community working on dependently typed languages and tools typically have most of their programming experience with pure functional languages. Exaggerating only a bit, here is the typical trajectory of a formal methods researcher:

- As an undergraduate writes a lambda-calculus interpreter in Lisp/Scheme/Racket/Haskell

- As a PhD Student writes their own pure language with wacky typing system, and demonstrate its usefulness by embedding a lambda-calculus interpreter in it.

- As a postdoc verifies a lambda-calculus interpreter in Coq/Agda.

- As an assistant professor writes their own Curry-Howard-based interactive prover in some dependently typed language and demonstrate its usefulness by verifying a lambda-calculus interpreter in it.

- After tenure, gets their students to do the above.

2018's POPL has a type-theory in type-theory paper, but none on how programming language theory can improve SAT solving ...

It's an echo chamber. I recently reviewed a grant proposal by some of the more famous dependent types researchers. They proposed using HoTT to verify computational effects. The grant proposal claimed (1) that there is currently no known way to verify effectful languages and (2) that embedding generalisations of effect monads in HoTT is the way to overcome this problem. It seems like these people have not ever heard of program logics?

   * * *
A second reason is more serious: the real problem of formal verification is not this-language/that-logic. In day-to-day verification, it doesn't really matter that much whether you are using this/that approach. You need to set up the right definitions and invariants. That's the hard part. Whether you express them in some program logic or dependently typed system doesn't matter much, you can usually transliterate between them.

The real problem is automation.

Better autmation means dramatically better SAT solvers (modern SMT solvers are heavily reliant of SAT solvers). It's just unclear where 10x, 100x, 1000x speedups in SAT solvers should come from. We don't even know how to parallelise SAT solvers well. In the absence of dramatic progress in SAT solvers, what can we do other than tinkering with dependently typed languages and their relatives?

> here is the typical trajectory of a formal methods researcher

That's a programming language theory researcher. The vast majority of formal methods research is done neither using dependent types nor lambda calculus (and the vast majority of formal tools are similarly neither based on types nor on lambda calculus). However, I do agree that practitioners (i.e., programmers) who become interested in formal methods through functional programming, tend to miss out on the vast majority of progress in formal methods, which isn't related to functional programming at all, and indeed, is normally done in formalisms other than the lambda calculus (which introduces problems related to the common use of higher-order functions, which make some of the clever automation methods harder to apply). They also become familiar with equational reasoning, which works well in pure functional programs, and aren't aware of the more generally-applicable (and arguably more useful) assertional reasoning, which works well even in mainstream imperative languages. They may be aware of assertions, but may not know that their use is actually a formal method of sorts and is an application of Hoare logic.

> Better autmation means dramatically better SAT solvers (modern SMT solvers are heavily reliant of SAT solvers). It's just unclear where 10x, 100x, 1000x speedups in SAT solvers should come from. We don't even know how to parallelise SAT solvers well.

The problem is worse: we don't even know why SAT solvers work in practice at all.

> In the absence of dramatic progress in SAT solvers, what can we do other than tinkering with dependently typed languages and their relatives?

Well, much of formal methods research is done in the field of abstract interpretation (which forms the core of most sound static analysis tools). There is still a lot of interesting progress to be made there. Separation logic is an example of such relatively recent (and important!) progress.