Hacker News new | ask | show | jobs
by chongli 1172 days ago
And UB has always been an excuse for compilers to screw over programmers in hideous ways

Your reply was great up until this. Compiler writers aren’t looking to screw over programmers, they’re looking to make code faster. UB gives them the ability to make assumptions about what is and is not true, at a particular moment in time, in order to skip doing unnecessary work at runtime.

By assuming that code is always on the happy path, you can cut a lot of corners and skip checks that would otherwise greatly slow down the code. Furthermore, these benefits can cascade into more and more optimizations. Sometimes you can have these large, complicated functions and call graphs get optimized down to a handful of inlined instructions. Sometimes the speedup can be so dramatic that the entire application is unusable without it!

Many of these optimizations would be impossible if compilers were forced to assume the opposite: that UB will occur whenever possible.

The tool programmers have available to them is compiler flags. You can use flags to turn off these assumptions, at the cost of losing out on optimizations, if your code needs it and you’re unable to fix it. But it’s better to turn on all possible warnings and treat warnings as errors, rather than ignoring them, to push yourself to fix the code.

1 comments

the thing that makes UB almost malicious is that it propagates inter-procedurally. This makes reasoning about code with UB basically impossible which means that you should always assume that the compiler is going to screw you over if you use it because there is no way to know whether it will.
You should consider a program with undefined behaviour to be the equivalent of a mathematical proof that contains an unstated contradiction. Ex falso quodlibet: from a falsehood anything follows. Also called the principle of explosion.

Undefined behaviour renders your entire program meaningless. It must be avoided at all costs. Using undefined behaviour on purpose is like sticking a fork in an electrical socket.

> Undefined behaviour renders your entire program meaningless

That's exactly the complaint. Consider that the implementations of the standard library sometimes have exposed UB: that renders behaviour of all of the running code on the system undefined.

Many programmers believe that the fallout of the UB could, and therefore should, be limited in scope.

To achieve your goal, compilers would have to disable any sufficiently powerful optimization. If you write bugs (UB), a powerful compiler will eventually catch them and generate code that you didn't intend at the beginning. However, this is not the fault of the compiler or the language.
Compiler writers have already done this. With flags you can disable any optimization you like, with all of the performance loss that entails. But then people complain that their programs are slow.

What people really want is an AI that ignores the code they write and just “does what they really meant.” But of course that’s not foolproof either. Every day people ask each other to do things and miscommunications occur, with the wrong thing being done. I don’t really know what to say other than “people should be more careful and also more forgiving.”

Exactly. All the hoopla about UB is complaining about how compiler optimizations work and the fact that the standard committee makes clear (with each new meeting) what is considered undefined behavior or not. They should instead thank the committee for clarifying this.
It is the fault of the language to the extent that the purpose of the language is to make it easy to write correct programs, and UB makes it really hard (and in some cases impossible) to write correct programs.
It is just the opposite. UB is a clarification to tell programmers what the language considers to be undesired behavior. If they didn't say anything, it would be always a mystery if a certain construct was allowed or not, effectively making it compiler dependent. Compilers would also have less avenue for creating optimizations. In the next iterations of the C standard we may see more constructs classified as UB.
It's funny that your original post was an objection to how undefined behavior gives license to screw developers over, but here you are talking about how undefined behavior is like sticking a fork in an electrical socket.
My original post was an objection to the implied intent on the part of compiler writers. An electrical socket does not have intent, it's just a hazard that also happens to provide enormous benefits to our lifestyles.

I think it's a perfect analogy to undefined behaviour in C: enormous benefits but also a hazard to be wary of. A lot of people don't understand the benefits, they just see the hazard. Throughout this discussion I've been trying to clarify that, with perhaps limited success.

But just to be clear @chongli is logical

Think of UB as a probabilistic error. I.e. it is always stupid to rely on it

1. Write code without errors -- sensible 2. Allow compilers to assume the absence of errors -- occasionally sensible, since it speeds up your program

In defence of UB, for the most part they are things that should break your program anyway: stack overflow is never correct. So your choice is mostly to fail badly quickly, or to fail slowly well

Thanks to google making the UB sanitizers you are free to make that choice even in C

I'd argue that it's stupid to think that it's stupid to rely on UB.

Almost any non-trivial software explicitly relies on undefined behavior, including safety critical libraries such as cryptographic libraries, the Linux operating system has rampant undefined behavior that it makes a conscious decision to use. POSIX makes use of undefined behavior for shared libraries (it treats functions loaded from shared libraries as void*, which is undefined behavior).

That's not an argument to keep live grenades laying around, it's an argument to remove them from the spec.

Like signed int being UB. Define it to have 2 complement semantics. Problem solved. I'm sure the nutters trying to extend C++ with templates will howl but this is C not C++. And seriously C++ is dead man walking at this point.

C23 does make two’s complement standard. It also adds checked arithmetic so you can safely avoid signed overflow.

It does not make signed overflow defined behaviour. This would prevent integer operation reordering as an optimization, leading to slower code.

>This would prevent integer operation reordering as an optimization, leading to slower code.

The sane way to address that is to add explicit opt-in annotations like 'restrict'.

  #push_optimize(assume_no_integer_overflow)
  int x = a + b;
  // more performance orientated code
  #pop_optimize
  // back to sane C

  #push_optimize(assume_no_alias(a, b), assume_stride(a, 16), assume_stride(b, 16))
  void compute(float *a, float *b, int index)
  {
   // here the compiler can assume a and b do not alias
   // and it can assume it can always load 16 bytes at a time
   // the programmer has made sure it's aligned and padded to so with any index
   // there's always 16 bytes to load
   // so go on, use any vectorized simd instruction you want
  }
  #pop_optimize
  // back to sane C
That’s a lot uglier and clunkier than just using the ckd_add, ckd_mul etc. safe checked arithmetic. Plus if an overflow occurs you still get an incorrect result which you probably don’t want.

Or maybe I’m wrong? Do people actually want overflows to occur and incorrect results? If they’re willing to tolerate incorrect results, why would they also want optimizations disabled?

Yeah but it's reversed signed overflow shouldn't be UB by default. You should have to explicitly opt in for that.

The reason of course why they refuse to do that if because if that were that case most shops would up and ban unsafe signed.

C++ 20 did that too.
Until LLVM, GCC, key game engines and GPGPU SDK get rewritten into something else, it is going to be Resident Evil day for a looong time.