Hacker News new | ask | show | jobs
by pcwalton 2540 days ago
This is a common misconception. Compiler authors don't exploit undefined behavior to make themselves seem smart, or because they like breaking code. They exploit undefined behavior because somebody filed a bug saying some code was slow, and exploiting UB was the simplest way--or, in many cases, the only way--to fix the performance problem.

GCC and Clang do give you the option to avoid optimizations based on undefined behavior: compile at -O0. We think of the low-level nature of C as being good for optimization, but in many cases the C language as people expect it to work is at odds with fast code.

It's fascinating to actually dive into the specific instances of undefined behavior exploitation that get the most complaints. In each such case, there is virtually always a good reason for it. For example, treating signed overflow of integers as UB is important to avoid polluting perfectly ordinary loops with movsx instructions everywhere on x86-64. It's easy to see why compiler developers added these optimizations: someone filed a bug saying "hey, why is my loop full of movsx", and the developers fixed the problem.

Edit: Should be movsx instead of movzx, sorry.

2 comments

Could you go into a little bit more detail regarding the movzx? Aren't 32-bit registers always zero-extended on x86-64?
Sure. Here's an in-depth explanation from Fabian Giesen: https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...
Thanks, rygorous is always a great read - although sometimes a little overwhelming. If I got the gist of it, I have a small correction to your comment: the issue is about movsxd (sign extended integer indexes), not movzx (zero extension).
It's easy to see why compiler developers added these optimizations: someone filed a bug saying "hey, why is my loop full of movsx", and the developers fixed the problem.

"fixed" by breaking other expectations. Regardless of what the spec says, that's still a stupid way to do things. There's a child comment below which examines this case in detail; and the real solution is to make the analysis better, not use UB as a catch-all excuse.