by klodolph 2258 days ago
This can only be done at compile time in very specific cases. The huge problem here is the compiler has no way of knowing which cases of undefined behavior are bugs in the program and which cases of undefined behavior are just examples of unreachable code. If the compiler aborted compilation when it detected undefined behavior, you’d be getting a lot of false positives for unreachable code, and you’d need to solve that problem (figuring out how to generate sensible errors and suppress them). This is not even remotely easy.

If you are concerned about safety, there are ways to achieve that, like using MISRA C, formally verifying your C, or writing in another language like Rust.


Good point, but couldn't unreachable code be required to carry an annotation saying so? The annotated location could even hold a (development-only) assertion.

That would be an immense undertaking. It’s not really just that some statement or expression is unreachable (we have __builtin_unreachable() in GCC for stuff like that) but that certain states are unreachable.
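
As a minimal sketch of the statement-level annotation mentioned above (the `color` enum and `color_name` function are hypothetical names invented for illustration, and `__builtin_unreachable()` is a GCC/Clang extension rather than standard C):

```c
/* Hypothetical example; the enum and function names are illustrative only. */
enum color { RED, GREEN, BLUE };

const char *color_name(enum color c) {
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    /* GCC/Clang builtin: a promise to the compiler that control never
       reaches this point. If c holds any other value, behavior is undefined. */
    __builtin_unreachable();
}
```

This marks one statement as unreachable. It says nothing about which combinations of values, that is, which program states, can occur.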

For example,

    int buffer_len(struct buffer *buf) {
        return buf->end - buf->start;
    }
There are at least three states that trigger undefined behavior: buf is not a valid pointer, buf->end - buf->start doesn’t fit in int, and buf->end and buf->start don't point to the same object.

I’m not sure how you would annotate this. At the function call site, you would somehow need to show that buf is a valid pointer, that start/end point to the same object, and that the difference fits in an int. It would start looking more like Coq or Agda than C.
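
To make the difficulty concrete, here is a rough sketch of how far ordinary runtime checks get. The layout of `struct buffer` is an assumption (the thread never shows it), and `buffer_len_checked` is a hypothetical name:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed layout; the original comment does not define struct buffer. */
struct buffer {
    char *start;
    char *end;
};

/* Stores the length in *len and returns true, or returns false when a
   check fails. Negative lengths are treated as invalid here. */
bool buffer_len_checked(const struct buffer *buf, int *len) {
    if (buf == NULL || len == NULL)
        return false;   /* catches null only, not dangling pointers */
    /* Still undefined if start and end point into different objects;
       no runtime check in standard C can detect that. */
    ptrdiff_t diff = buf->end - buf->start;
    if (diff < 0 || diff > INT_MAX)
        return false;   /* would not fit in a non-negative int */
    *len = (int)diff;
    return true;
}
```

Only the range condition is fully checkable this way. Pointer validity and the same-object requirement remain obligations on the caller, which is where proof-style annotations would come in.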

Honestly, I think if you really want this kind of safety, your options are to use formal methods or switch to a different language.

There’s also this weird assumption here that the compiler detects undefined behavior in your program and then mangles it. It’s really the opposite—the compiler assumes that there is no undefined behavior in your program, and optimizes accordingly. In practice you can turn optimizations off and get something much closer to the “machine model” of C (which doesn’t really exist anyway) but most people hate it because their code is too slow.
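
A small, commonly cited illustration of "assumes there is no undefined behavior" (the function name `always_greater` is made up for this sketch, and what a given compiler actually emits depends on its version and flags):

```c
/* With optimizations enabled, GCC and Clang will typically compile this
   to "return 1": x + 1 can only fail to exceed x if the addition
   overflows, and signed overflow is undefined, so the compiler is
   allowed to assume it never happens. */
int always_greater(int x) {
    return x + 1 > x;
}
```

Nothing here was "detected" as undefined behavior. The compiler simply reasoned as if every call is well-defined and simplified the comparison accordingly.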

Thanks, so it's definitely easier said than done! Good explanation.

> If the compiler aborted compilation when it detected undefined behavior, you’d be getting a lot of false positives for unreachable code

Could you please provide an example of this?

Overflow of signed integers is undefined.

    int add(int a, int b) { return a + b; }
Unless the compiler can prove that `add` is never called with a and b values resulting in an overflow, this code can lead to UB, and, under your rules, the compilation aborts.
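
For comparison, here is a sketch of what an `add` that is well-defined for every input could look like. `checked_add` is a hypothetical name, and this is just one portable way to write the test (GCC and Clang also offer `__builtin_add_overflow`):

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true, or returns false without
   adding when the sum would not fit in an int. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;   /* the addition would overflow */
    *out = a + b;       /* safe: the range was checked above */
    return true;
}
```

Requiring something like this at every signed addition is the kind of cost the "abort on possible UB" rule would impose on ordinary code.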