Hacker News new | ask | show | jobs
by akiarie 801 days ago
> If the annotation is basically just a copy of the function body why have any annotation at all?

This is a profound question. Xr0 annotations seem to be copies of the function bodies, but they aren't actually. What they are is a description of the function's safety semantics, the side effects and circumstances under which the function may be relied upon to exhibit well-defined behaviour (under the C Standard).

They only seem to be copies of the bodies because we're dealing with simple examples in the post here and in the tutorial, but we have a much more significant program ready (documenting) that we'll be sharing soon that shows the difference more clearly.

Xr0 is built in explicit reliance upon the powers of programmers to structure code in such a way that the annotations become simpler and simpler as you move up layers of function abstraction. As a simple example, think about what happens when you allocate memory and free it within a function, e.g.

        #include <stdlib.h>

        void
        foo() ~ [ return malloc(1); ]
        {
                return malloc(1);
        }

        void
        bar()
        {
                void *p = foo();
                free(p);
        }

        void
        baz()
        {
                bar(); /* completely safe */
        }
Note that `foo`'s annotation is like a copy of its body, but `bar`'s annotation is very simple, even though it interacts with `foo`, which is only safe if the pointer it returns is handled. So from `baz`'s perspective there are no safety semantics it has to worry about when it interacts with `bar`. The complexity that could lead to safety bugs in `foo` has been completely handled in `bar`.

We call this simplifying phenomenon denouement, and well-designed programs exhibit it very strongly. Our belief that programmers can design constructs that tend towards denouement as one moves down the call stack towards `main` is one of the basic reasons we're working on Xr0.

> It might not be clear to you, but if you want to allow any annotation with the same semantics as the body of the function you will run into this.

Yes, but we don't. Only well-defined behaviour according to the C Standard. This is the goal for Xr0 v1.0.0. After that the sky is the limit.

1 comments

> Xr0 annotations seem to be copies of the function bodies, but they aren't actually.

You're constantly dodging the question. If annotations aren't copies of function bodies (which is really the only sensible choice), you need to deduce whether a function body matches the annotations. "Structural similarity" between C and your annotation syntax won't make this easier. In fact, it is impossible to deduce whether two C functions are semantically equivalent even though C is extremely structurally similar to C (you could even argue that C and C are structurally identical).

> Our belief that programmers can design constructs that tend towards denouement as one moves down the call stack towards `main` is one of the basic reasons we're working on Xr0.

So your main criticisms of Rust will mostly also apply to Xr0: You need to rewrite existing C code to make annotations practical (and at that point you might as well reimplement it in Rust) and Xr0 will limit the constructs you will use in your code, because annotations will be impractical for code that isn't written for "denouement".

> If annotations aren't copies of function bodies (which is really the only sensible choice), you need to deduce whether a function body matches the annotations. "Structural similarity" between C and your annotation syntax won't make this easier. In fact, it is impossible to deduce whether two C functions are semantically equivalent even though C is extremely structurally similar to C (you could even argue that C and C are structurally identical).

Maybe I don't understand your point. I am not denying that we're deducing that the body matches the annotations. I'm simply saying that for the restricted case of safety semantics alone, this deduction can be done. Yes, it is impossible to deduce semantic equivalence of arbitrary functions. The question is whether it is possible to deduce that annotations capture the safety semantics of the body. If you have a concrete example it would be helpful here – but for the restricted concerns of safety, not for arbitrary constructs.

> So your main criticisms of Rust will mostly also apply to Xr0: You need to rewrite existing C code to make annotations practical (and at that point you might as well reimplement it in Rust) and Xr0 will limit the constructs you will use in your code, because annotations will be impractical for code that isn't written for "denouement".

Your quote omits the first sentence of the paragraph, which states that "well-designed programs exhibit [denouement] very strongly". Denouement in the sense we're referring to is an absolute theoretical necessity for any safe program, because at some level (of functional abstraction) the safety concerns must be handled, otherwise the program would have a safety vulnerability.

Xr0 definitely limits constructs, but our claim is that the limitation we're imposing is one that reflects the structure of all safe programs. The same cannot be said about Rust's ownership semantics, which limit an enormous number of simple, safe constructs. So the C programs to which one would be adding Xr0 annotations wouldn't need to be rewritten unless a bug has been discovered.

> I'm simply saying that for the restricted case of safety semantics alone, this deduction can be done.

And I'm saying it can't be done, at least not in a fundamentally less tedious and hard way than Frama-C.

> Yes, it is impossible to deduce semantic equivalence of arbitrary functions. The question is whether it is possible to deduce that annotations capture the safety semantics of the body.

It isn't in general.

> If you have a concrete example it would be helpful here – but for the restricted concerns of safety, not for arbitrary constructs.

Safety is not a "restricted concern": You can for every property P easily construct a function that is safe if and only if property P holds. I'll be using Python because this example (which I like) requires arbitrary size integers. You could obviously also implement this in C, you'd just need to implement arbitrary size integers (or use a library that implements them):

    def collatz(x):
        if x % 2 == 0:
            return x // 2
        return x * 3 + 1

    def cycle(f, x):
        tortoise = f(x)
        hare = f(f(x))

        while tortoise != hare:
            tortoise = f(tortoise)
            hare = f(f(hare))

        return hare

    buffer = [0, 0, 0, 0, 0]
    x = int(input())
    if x >= 1:
        collatz_cycle_element = cycle(collatz, x)
        print(buffer[collatz_cycle_element]) # this is safe (or is it?)
This takes as an input an arbitrary positive integer x, searches for a cycle in the Collatz sequence beginning with that number and returns an arbitrary element of that cycle. It is conjectured that for every positive integer this cycle will be 4 -> 2 -> 1 -> 4 -> ... and thus this program is safe if and only if the Collatz conjecture holds.

Another example would be a C program that generates random planar graphs, computes their chromatic numbers and then collects statistics about them in an `int statistics[5]`:

    #include <stdio.h>

    void main() {
        int statistics[5] = {0};

        for(int i = 0; i < 1000; i++) {
            struct graph g = generate_random_planar_graph();
            int chromatic_number = compute_chromatic_number(g);
            statistics[chromatic_number] += 1;
        }
        
        printf("%i, %i, %i, %i", statistics[1], statistics[2], statistics[3], statistics[4]);
    }
The safety of this program requires you to prove that `generate_random_planar_graph` always returns a planar graph, that `compute_chromatic_number` correctly identifies the chromatic number and that the chromatic number of every planar graph is less than 5.
Thought provoking stuff! I really appreciate how much effort you've put into this.

However, the main reason why this argument is flawed is it omits the heart of the matter: the annotations. Xr0 empowers programmers to propagate safety semantics. A program that is only safe if the Collatz conjecture holds is (surprise) only safe if the Collatz conjecture holds. So in Xr0 the only requirement we would impose is that this program be augmented with an annotation that communicates that it is only safe if the Collatz conjecture is true.

So the flaw in the reasoning is we haven't claimed that Xr0 can prove arbitrary programs are safe. We've claimed that Xr0 can prove the correspondence between the safety semantics denoted in an annotation and a function body. Above there are no annotations given which would specify "this program is safe only if the Collatz conjecture is true". It shouldn't be hard to prove the correspondence between such an annotation and the program you've written, e.g.:

    def main(): ~ [
        buffer = [0, 0, 0, 0, 0]
        x = int(input())
        if x >= 1:
            collatz_cycle_element = cycle(collatz, x)
            print(buffer[collatz_cycle_element])
    ]

        buffer = [0, 0, 0, 0, 0]
        x = int(input())
        if x >= 1:
            collatz_cycle_element = cycle(collatz, x)
            print(buffer[collatz_cycle_element])

It's the principle of propagating the safety-determining factors of the function that we're stressing, not some kind of almighty power to judge that arbitrary constructs are safe or not.
> It's the principle of propagating the safety-determining factors of the function that we're stressing

That's the main function. It's the function that gets called when the program starts. Where do you want to propagate the "safety-determining factors" to?

If I'm just supposed to read the annotations on the `main` function (and those annotations can basically just be a copy of it's body), then why do I need the annotations at all? I could also read the program to determine whether it's safe.

It's because we're dealing with an extreme example in which a program's safety depends on the resolution of an open problem.

If the program's safety depends on the semantics of C and how they've been used in a function, it will be possible to deal with all the conditions upon which a program could be unsafe (in the sense of the standard safety vulnerabilities – like those listed here [0]). Doing so would lead to denouement, so that the annotation to the main function would be empty (meaning none of those bugs can occur).

Programs that might be unsafe should not be verifiable.

[0]: https://alexgaynor.net/2020/may/27/science-on-memory-unsafet...

One final point – I cannot emphasise strongly enough that we aren't making this for Rust programmers. There's nothing wrong with loving a programming language that has changed the world and what is possible in systems programming. Rust is an advance for systems programming, and it's made incredible strides.

But some programmers (like myself) love working in C, and do not like Rust's restrictions. For such programmers even if the question was one of re-implementation (which we don't think it is) Rust would remain undesirable.