| Awesome writeup. Always interesting to read what Daniel has to say. I think the fact that it turned out that he was wrong (and UBSan was right, as usual) is a great testament to the shortcomings of C. Lots of people - both inexperienced and very experienced - celebrate it for being "simple" and "close to the hardware", but the truth of the matter is that it is not close enough to the hardware for people who _know_ what the hardware is doing to get what they expect, and too close to the hardware for anyone to be able to ignore it. Lots of experienced C programmers (and - guilt by association - C++ programmers as well) run into UB because they have clear expectations of the compiler. I.e., they know what the compiler should generate, more or less, and C is just a convenient notation. But compilers don't live up to those expectations, because they don't actually compile your code for the hardware. They compile it for the abstract machine defined by the standard, which very often works differently from any real architecture, and then translate that into machine code. They do this even though there is basically a single set of semantics that every "relevant" (mainstream) architecture implements. This is a holdover from when C had to target architectures that are 100% irrelevant today. Everybody's favorite example is signed integer overflow. On both x86-64 and ARM64, that just works - two's complement is the only relevant implementation, and the add instruction simply wraps. But `int` in C and C++ is not that: signed overflow is undefined behavior, so the compiler is allowed to assume it never happens. Almost every single common UB pitfall has reasonable behavior at the assembly level on every mainstream architecture, and on almost every niche architecture. 
C gives you the illusion of being close to the hardware, but in reality the hardware is several steps removed, so if you want to leverage your knowledge of the hardware, calling conventions, assembly, or other low-level details, you have to go out of your way to work around the C standard. (Aside: We need new languages to tackle this, and I happen to like Rust. Lots of people coming from C or C++ are irritated and frustrated by Rust, but 99% of the time it's because Rust gives you a compile error where C would give you UB. This is one example of that out of thousands.) |
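To make the signed-overflow point concrete, here is a rough sketch of what "working around the C standard" tends to look like. The function names are made up for illustration and are not from the article. The first function is the check someone thinking in assembly would write; the other two get the same results without ever executing an overflowing signed add.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* The "obvious" check if you think in terms of the ADD instruction.
   When a == INT_MAX, a + 1 overflows, which is UB, so an optimizing
   compiler may assume that never happens and fold this to `return false`. */
bool naive_increment_wraps(int a) {
    return a + 1 < a;
}

/* Hypothetical helper: ask whether a + b would overflow without
   performing the overflowing addition. No UB on any input. */
bool add_would_overflow_i32(int32_t a, int32_t b) {
    if (b > 0) {
        return a > INT32_MAX - b;
    }
    return a < INT32_MIN - b;
}

/* Hypothetical helper: the wrapping add the hardware actually does.
   Unsigned arithmetic is defined to wrap modulo 2^32. Converting the
   result back to int32_t is implementation-defined before C23, but
   mainstream compilers wrap here as well. */
int32_t wrapping_add_i32(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}
```

GCC and Clang also accept `-fwrapv`, which makes signed overflow wrap for the whole translation unit, but then you are writing a dialect rather than standard C.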
Many folks will complain about anything that breaks their beloved illusion that C is a fancier macro assembler, except even proper macro assemblers have less UB than regular C.