Hacker News new | ask | show | jobs
by shrx 703 days ago
From the article:

> If the verifier finds any unsafe code, the program is rejected and not executed. The verifier is rigorous -- the Linux implementation has over 20,000 lines of code [0] -- with contributions from industry (e.g., Meta, Isovalent, Google) and academia (e.g., Rutgers University, University of Washington).

[0] links to https://github.com/torvalds/linux/blob/master/kernel/bpf/ver... which has this interesting comment at the top:

    /* bpf_check() is a static code analyzer that walks eBPF program
     * instruction by instruction and updates register/stack state.
     * All paths of conditional branches are analyzed until 'bpf_exit' insn.
     *
     * The first pass is depth-first-search to check that the program is a DAG.
     * It rejects the following programs:
     * - larger than BPF_MAXINSNS insns
     * - if loop is present (detected via back-edge)
    ...
I haven't inspected the code, but I thought that checking for infinite loops would imply solving the halting problem. Where's the catch?
13 comments

I'm not able to comment on what this code is doing, but as for the theory:

The halting problem is only unsolvable in the general case. You cannot prove that any arbitrary piece of code will stop, but you can prove that specific types of code will stop and reject anything that you're unable to prove. The trivial case is "no jumps"—if your code executes strictly linearly and is itself finite then you know it will terminate. More advanced cases can also be proven, like a loop over a very specific bound, as long as you can place constraints on how the code can be structured.

As an example, take a look at Dafny, which places a lot of restrictions on loops [0], only allowing the subset that it can effectively analyze.

[0] https://ece.uwaterloo.ca/~agurfink/stqam/rise4fun-Dafny/#h25

Adding on (and it's not terribly relevant to eBPF), it's also worth noting that there are trivial programs you can prove DON'T halt.

A trivial example[1]:

    int main() {
        while (true) {}
        int x = foo();
        return x;
    }
This program trivially runs forever[2], and indeed many static code analyzers will point out that everything after the `while (true) {}` line is unreachable.

I feel like the halting problem is incredibly widely misunderstood to be similar to be about "ANY program" when it really talks about "ALL programs".

[1]: In C++, this is undefined behavior technically, but C and most other programming languages define the behavior of this (or equivalent) function.

[2]: Fun relevant xkcd: https://xkcd.com/1266/

EDIT: I am incorrect, please ignore. (Original text below, for posterity).

Nit: In many languages, doesn't this depend on what foo() does? e.g:

  foo() {
    exit(0);
  }
No? The foo() invocation is never reached because the while loop never terminates.
Apologies; I misread the function call as being inside the loop.
The halting problem cannot be solved in the general case, but in many cases you can prove that a program halts. eBPF only allows verifiably-halting programs to run.
the halting problem is only true for _arbitrary_ programs

but there are always sets of programs for which it is clearly possible to guarantee their termination

e.g. the program `return 1+1;` is guaranteed to halt

e.g. given program like `while condition(&mut state) { ... }` with where `condition()` is guaranteed to halt but otherwise unknown is not guaranteed to halt, but if you turn it into `for _ in 0..1000 { if !condition(&mut state) { break; } ... }` then it is guaranteed to halt after at most 1000 iterations

or in other words eBPF only accepts programs which it can proof will halt in at most maxins "instruction" (through it's more strict then my example, i.e. you would need to unroll the for-loop to make it pass validation)

the thing with programs which are provable halting is that they tend to also not be very convenient to write and/or quite limited in what you can do with them, i.e. they are not suitable as general purpose programming languages at all

Infinite loops are not possible and would get rejected by the verifier since it cannot solve the halting problem. Here is a good overview on the options available: https://ebpf-docs.dylanreimerink.nl/linux/concepts/loops/
The basic logic flags any loop ("back-edge").
This, others have said it less concisely, but a program without loops and arbitrary jumps is guaranteed to halt if we assume the external functions it calls into will halt.
eBPF is not Turing complete. Writing it is very annoying compared to writing normal C code for exactly this reason.
I'm glad to hear that Meta and Google code is "rigorous". I'd prefer INRIA, universities that fund theorem provers, industries where correctness matters like aerospace or semiconductors.
Windows doesn't use the Linux eBPF verifier, they have their own implementation named PREVAIL[0] that is based on an abstract interpretation model that has formal small step semantics. The actual implementation isn't formally proven, however.

0: https://github.com/vbpf/ebpf-verifier

Also that lines of code is a proxy for rigor, something new I learned today. /s
I think they mean that the code base is small enough to be audited thoroughly. Maybe they should reword it to be clearer.
> I think they mean that the code base is small enough to be audited thoroughly.

They wouldn't say it was "over 20,000 lines" in that case. And 20,000 lines of C is far too big to audit.

The halting problem is exhaustive, there isn't an algorithm that is valid for all programs. You can still check for some kinds of infinite loops though!
More specifically, you can accept a set of programs that you are certain do halt, and reject all others, at the expense of rejecting some that will halt. As long as that set is large enough to be practical, the result can be useful. If you eg forbid code paths that jump "backwards", you can't really loop at all. Or require loops to be bounded by constants.
I have no insight into this particular project but you could work around the halting problem by only allowing loops you can proof will not go infinite. That would of course imply rejecting loops that won't go infinite but can't be proven not to.
If the verifier can't determine that the loop will halt, the program is disallowed. Also, if the program gets passed and then runs too long anyway, it's force-halted. So... I guess that solves the halting problem.
It's more accurate to say that in principle, there could be programs that would halt, but that the verifier will deny.
So this "solves" the halting problem by creating a new class "might-not-halt-but-not-sure" and lumping it with "does-not-halt". I find it hard to believe the new class is small enough for this to be useful, in the sense that it will avoid all kernel crashes.

I rather expect useful or needed code would be rejected due to "not-sure-it-halts", and then people will use some kind of exception or not use the verifier at all, and then we are back to square one.

Lots of useful code is rejected due to "not-sure-it-halts". That's the premise.
Well it is useful in practice, there are some pretty useful products based on eBPF on Linux, most notably Cilium (and, shameless plug for the one I’m working on: Parca, an eBPF-based CPU profiler).
Bad wording on my part, and I still don't know how to word it better. I'm sure this thing is useful, I don't think everyone who contributed code was just clueless.

However, the claim "in the future, computers will not crash due to bad software updates, even those updates that involve kernel code" must be false. There is no way it is true. Whatever Cilium is, I cannot believe it generally prevents kernel crashes.

Correct, you will never be able to write any possible arbitrary code and have it run in eBPF. It necessarily constrains the class of programs you can write. But the constrained set is still quite useful and probably includes the crowdstrike agent.

Also, although this isn't the case now, it's possible to imagine that the verifier could be relaxed to allow a Turing-complete subset of C that supports infinite loops while still rejecting sources of UB/crashes like dereferencing an invalid pointer. I suspect from reading this post that that is the future Mr. Gregg has in mind.

> Whatever Cilium is, I cannot believe it generally prevents kernel crashes.

It doesn't magically prevent all kernel crashes from unrelated code. But what we can say is that Cilium itself can't crash the kernel unless there are bugs in the eBPF verifier.

If the verifier allowed a Turing-complete language, it would solve the halting probem, which is impossible.
The claim isn't that eBPF generally prevents kernel crashes. It's that it prevents crashes in the subset of programs it's designed for, in particular for instrumentation, which Crowdstrike is (in this author's conception) an instance of.
I have quoted the claim verbatim from the article. It is obviously the claim of the article.
eBPF is not Turing-complete, I suppose.
It is not, programs that are accepted are proved to terminate. Large and more complex programs are accepted by BPF as of now, which might give the impression that it's now Turing complete, when it is definitely not the case.
In this talk we demo Conway's Game of Life implemented in eBPF: https://www.youtube.com/watch?v=tClsqnZMN6I
I should clarify that individual eBPF programs have to terminate, but more complex problems can be solved with multiple eBPF programs, and can be "scheduled" indefinitely using BPF timers
If you’re wrong about the loop, you’ll still hit BPF_MAXINSNS, so it’s fine to use heuristics that could produce a false negative right?
Unterminated loops might be a better phrasing.