Hacker News new | ask | show | jobs
by MaxBarraclough 1888 days ago
This post is about stateful formal methods. Formal methods can also be used with functional programming, right? Is it any easier to compose specifications in that case?
5 comments

For behavioural correctness of terminating functional programs, you can by and large treat your programs as plain mathematical functions. Then reasoning about them is as easy (or more likely, as hard) as ordinary math.

If you want to reason about the performance of functional programs, or about the behaviour of imperative/concurrent programs, then you need more sophisticated techniques.

But even in this case, specification composition (taken in the sense of the article, to mean the ability to state properties and proofs in a way which aligns with program module boundaries) has become a relatively routine technique in the last 5-10 years.

Basically, there are three main ingredients we seem to want: (1) you need to understand the abstract resources your program manipulates and how different parts of the program transfer ownership of these resources, (2) you need to understand the protocols (i.e., state machines) that your program components are participating, and (3) you need to understand the rely/guarantee invariants that the program components are maintaining.

Historically, verification techniques have had good stories for one or two of ownership/protocols/invariants (eg, TLA+ struggles with handling ownership), but the research community has developed techniques in the last few years that can handle all of them. As a result, there don't seem to be many small programs out of the reach of program verification -- compositional machine-checked proofs lock-free data structures over weak memory are a thing people can do these days. These proofs are by no means easy, but that's because the algorithms are fundamentally intricate rather than because our proof techniques suck.

We don't have any kind of proof that this is a complete inventory of the techniques we need to verify these kinds of programs. But there is the empirical fact that there are fewer and fewer small programs we don't know how to verify, so it's not an unthinkable possibility.

But the article doesn't deal with the possibility/impossibility of verification at all. It deals with the fact that (some?) specifications don't compose elegantly.

That is, that a sufficient specification for System A and a sufficient specification for System B can't be trivially composed into a sufficient specification for a System C = System A + System B. That's not to say that System C can't be specified, but that it can't be specified without re-creating the specifications for System A and System B.

If I'm being honest, a comment from /u/neel_k on this is enough to make me doubt my entire premise. He's a giant in the field and knows many, many things I don't.
Now you have to talk about what a specification is.

If it is a relation from some function input to its output, that would be one step. And of course, functions are composable that way.

But say you have a system behaviour that is a stream of function applications, e.g. [f(0), f(f(0)), f(f(f(0)))...] and I forgot the name for this function. This infinite stream is your system behaviour.

Now we have a different behaviour [g(0), g(g(0)), g(g(g(0)))...]

You can (more or less) easily reason about the relation of g^n(0) to f^n(0) where both streams or behaviours are synchronous. But reasoning about the behaviour of all g^n(0) in relation to all f^m(0) is difficult.

All the useful properties for these streams, liveness, fairness, as well as other invariants, are not easily encoded.

Co-inductive methods are useful for that sort of thing.

For example, we can prove two infinite streams are equal by showing that their definitions/generators are "bisimilar".

But the question is not whether their infinite extensions or limit are equal, but whether at each finite combination some property about the stream progress holds.
> and I forgot the name for this function.

I believe this is called `iterate` in Haskell.

The fact that CompCert, seL4, and similar projects exist is proof that a large amount of composition is possible if you have enough time to spend on it. It’s true that you can’t compose arbitrary formalizations, but you can architect your formalizations to be composable. Functional or stateful matters less than the engineering work required.
Not really, the problem described in the article could be rewritten to think of Next as a function World->World.
Though in that case requiring that Next blinks both 'x' and 'y' should obviously mean they are required to blink synchronously. Depending on how you model World you can either compose the two Next functions that blink x and y or you can multiply their respective worlds and use the product map.

None of these feel unnatural.

Sure, that's a natural way of reading this requirement, but is it the right way? How would you specify the system such that it can produce:

[(x, y), (!x, y), (!x, !y), (x, !y), (!x, y), ...]

(that is, any interleaving of the two blinks that preserves the constraint that there is at least one blink in each "step")

If you ask me requiring at least one blink in each step is a bit awkward as it is a global requirement as opposed to a local one.

But you can add the requirement Next(World) <> World if you really want to. Then you just get two sets:

- Next(World) <> World

- Next(World).x in (World.x, !World.x)

and

- Next(World) <> World

- Next(World).y in (World.y, !World.y)

and if you combine them you get

- Next(World) <> World

- Next(World).x in (World.x, !World.x)

- Next(World).y in (World.y, !World.y)

which seems to do what you want.

Adding global constraints seems like an easy way to shoot yourself in the foot though.

What the article calls specifications are exactly programs in a logical paradigm (not exactly FP, but you can translate into it).

My understanding is that the problem the article states is that in a synchronous system with a global clock, everything has to be synchronized to that clock. And that if you do not add cycle-number dependent logic, you can not have cycle-number dependent memory. I fail to see what it new there, or even non-obvious enough to warrant an article.