by wrs 849 days ago
Cross-compiling is irrelevant, because if the behavior is processor-specific then the cross-compiler knows the behavior. Constant folding shouldn’t happen in this case because x is not constant. If you choose a saturating add consistently on this processor target, and document it, that’s fine.

Deleting dead code because the code demonstrably wouldn’t do anything (that is, it has defined behavior that is not observable) makes sense and is of course hugely useful. Deleting code that isn’t dead, it just doesn’t have a universally defined behavior, is the issue.

> Deleting code that isn’t dead, it just doesn’t have a universally defined behavior, is the issue.

Can I delete "if ((x & 3) == 16)" without a warning? There is no 'x' which makes that expression true, so I can safely fold it to false without a warning?

Can I delete "if (x + 1 < x)" without a warning? There is no signed 'x' which makes that expression true, so I can safely fold it to false without a warning?
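For what it's worth, the second test can be written so that it means the same thing whether or not the compiler assumes signed overflow never happens. A minimal sketch; the name `increment_would_overflow` is made up for illustration and is not from anyone's code in this thread:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether x + 1 would overflow an int.
   Comparing against INT_MAX never performs the overflowing addition,
   so the answer does not depend on how the compiler treats signed
   overflow. */
bool increment_would_overflow(int x) {
    return x == INT_MAX;
}
```

Written this way there is nothing for the compiler to fold away, because the condition is true for exactly one value of x.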

How about this:

    int x = 7;
    call_function_outside_this_file();
    if (x != 7) { /* dead */ }

Does deleting the code require a warning or no?

Or this:

    void f(int *x, float *y) {
      *x = 1;
      *y = 2;
      if (*x != 1) { /* dead */ }
    }

A float cannot alias an int, so '*x' cannot have changed. Warning or no?

The problem with UB is that you can use it to set up impossible situations, like creating an 'x' for which (x & 3) == 16 is true, or modifying a variable through a pointer even though its address was never taken, and so on. If you account for UB then "code that doesn't have a universally defined behaviour" becomes all code.

Ideally I think the first two examples should have warnings, though not because we delete the code, and the last two shouldn't? The warning should be because it's a tautology so the human likely didn't mean to write that (for instance if the human wrote it indirectly through macros, then we shouldn't warn on it).

You’re on to something with that last example. The idea that those two pointers can’t alias is one place C has diverged from my understanding. Of course they can alias. Which is why I wouldn’t naturally write that code, I’d write:

    void f(int *px, float *py) {
        int x = *px;
        x = 1;
        *py = 2;
        if (x != 1) { /* dead */ }
        *px = x;
    }

If I put in a dereference, I expect a dereference to happen. Not dereferencing the pointer when I wrote a dereference operator seems like going too far. If they aren’t supposed to alias, but they did anyway, the code should do the wrong thing, in a way that makes sense based on the code I wrote.

I’m obviously just a holdover from the 90s, but it does seem we’ve leaned too far into hidden assumptions that the compiler thinks I share, rather than doing what the code says, or a simplification of what the code says.

> If I put in a dereference, I expect a dereference to happen. Not dereferencing the pointer when I wrote a dereference operator seems like going too far.

Surely not? I mean, you probably didn't intend to include unevaluated contexts like "sizeof(*ptr)", where putting in a memory access is forbidden, but I think nearly all programmers fully expect the compiler to delete the dead store in "*ptr = a; *ptr = b;" or "*ptr = x; free(ptr);" and would get annoyed if it didn't. Even more so if the compiler couldn't take a scalar computation in a loop, move the memory access into a register, and store the result back to memory only once when the loop is done.
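To make the loop case concrete, here is a rough sketch of the two shapes. The function names are hypothetical, and the second function is only my hand-written picture of what an optimizer would like to produce, not actual compiler output:

```c
#include <stddef.h>

/* As written: every iteration reads and writes *out. */
void sum_through_pointer(const int *a, size_t n, long *out) {
    *out = 0;
    for (size_t i = 0; i < n; i++) {
        *out += a[i];
    }
}

/* The shape the optimizer is after: accumulate in a local, which can
   live in a register, and store to memory once at the end. */
void sum_in_register(const int *a, size_t n, long *out) {
    long acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += a[i];
    }
    *out = acc;
}
```

The two are interchangeable only if `out` never points into `a`. Because `long` and `int` are different types, the aliasing rule is what lets the compiler assume that without proving it.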

I once did a cleanup of undefined behaviour dereferencing NULL pointers (-fsanitize=null) and I got a lot of pushback from people complaining about "&ptr" where the ptr is NULL, because the compiler doesn't emit any assembly for that, so their code is just fine as is.

The rule for memory is that all memory you've stored to has an effective type (like a static type, but attached to addresses at runtime), and a pointer has to point to an object whose effective type matches the pointer's static type. Further details aside (uninitialized pointers, pointers to data you just freed, freshly malloc'd memory which has no effective type yet, unions), when you think of it in this model, the fact that you can't have an int* and a float* pointing to the same memory feels natural.
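If what you actually want is the bytes of a float viewed as an integer, the usual way to stay inside that model is to copy the bytes rather than alias the pointers. A minimal sketch, assuming float is 32 bits wide; `float_bits` is a made-up name:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of a float
   as a 32-bit integer. memcpy copies bytes, so the float object is
   never read through an integer lvalue. */
uint32_t float_bits(float f) {
    uint32_t bits;
    _Static_assert(sizeof bits == sizeof f, "float must be 32 bits");
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On a platform with IEEE 754 single-precision floats (an assumption, not something C guarantees), float_bits(1.0f) comes out as 0x3F800000. Compilers typically turn the memcpy into a plain register move, so this usually costs nothing over the cast.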