by ordu 655 days ago
Maybe I'm missing something, but to me it seems pretty easy. What needs to be done:

1. Make a list of all UB

2. Define sensible compiler behavior for each case (for example, let INT_MAX + 1 evaluate to INT_MIN on x86_64, simply because `add` on x86_64 does that)

3. Treat this as part of the standard when compiling the code.

This approach allows different compiler behavior on different architectures, each suited to its architecture. Maybe on some architectures `add` on signed numbers generates a CPU exception on overflow, so define that as the behavior there and go with it.
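As a rough sketch of what two such per-architecture policies for signed overflow could look like if spelled out in today's C: the function names `add_wrapping` and `add_checked` are made up for illustration, and `__builtin_add_overflow` is a GCC/Clang extension rather than standard C, so this assumes one of those compilers.

```c
#include <limits.h>
#include <stdbool.h>

/* Wrapping policy: mimics what the x86_64 `add` instruction does.
 * __builtin_add_overflow (GCC/Clang extension) stores the result
 * truncated to the width of `int`, so no undefined behavior occurs. */
int add_wrapping(int a, int b) {
    int result;
    (void)__builtin_add_overflow(a, b, &result);
    return result;
}

/* Checked policy: stands in for an architecture that raises an
 * exception on overflow. Returns false instead of trapping. */
bool add_checked(int a, int b, int *out) {
    return !__builtin_add_overflow(a, b, out);
}
```

Under the wrapping policy, `add_wrapping(INT_MAX, 1)` yields `INT_MIN`. Under the checked policy, the same addition is reported as a failure that the caller has to handle.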

2 comments

The requirement for “sensible” (i.e. repeatable) behavior breaks many simple, critical optimizations like maintaining the referent of a nominally un-aliased pointer in a register.

What if there’s UB & it is aliased? Some other pointer of a different type in scope also references the same value. The “sensible” thing to do when the value is updated through the alias is…?
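To make the aliasing case concrete, here is a minimal sketch. The function name `store_both` is hypothetical, and what an optimizer actually emits depends on the compiler and flags.

```c
/* Because `int *` and `float *` are not allowed to alias under the C
 * type-based aliasing rules, the compiler may keep the value stored
 * through `p` in a register and return 1 without reloading *p. */
int store_both(int *p, float *q) {
    *p = 1;
    *q = 2.0f;   /* assumed not to modify *p */
    return *p;   /* may be folded to the constant 1 */
}
```

When `p` and `q` point to distinct objects, this returns 1 either way. If a caller casts so that both point to the same storage, the behavior is undefined. A "repeatable" definition would force a reload of `*p` after every store through an unrelated pointer type, which is exactly the optimization being given up.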

That works for a lot of behavior but not everything. For example:

  int f(int x) {
    static int y[] = {42, 43};
    return y[x];
  }
What behavior should `f(-1)` or `f(100)` have? What is sensible?
Desugar to pointer arithmetic, try to do a dereference like

    *(y-1)
and more than likely segfault, or return the value at that address if it's somehow valid.
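For contrast, another candidate for "sensible" is to reject out-of-range indices instead of dereferencing them. This is only a sketch of that alternative. The name `f_checked` and the bool-plus-out-parameter shape are assumptions for illustration, not something proposed in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bounds-checked variant of f: returns false for an out-of-range
 * index instead of reading outside the array. */
bool f_checked(int x, int *out) {
    static const int y[] = {42, 43};
    const size_t n = sizeof y / sizeof y[0];

    if (x < 0 || (size_t)x >= n) {
        return false;   /* covers f(-1) and f(100) */
    }
    *out = y[x];
    return true;
}
```

The cost is a compare and branch on every access, which is the kind of overhead that leaving the out-of-bounds case undefined lets the compiler skip.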