by tyfighter 840 days ago
I keep finding myself angry about the recent (some number of years) focus on C and C++'s undefined behavior. I have been writing C and C++ for 27 years, 16 years professionally, and despite all the scary implications, I do not understand why ANYONE cares. I do not get it. This is yet another article that goes on and on about nonsensical situations that are just shitty code. Integer overflow? Who cares? Unless you're targeting a specific compiler and architecture, it doesn't matter. C and C++ have footguns. Everyone knows that. Who cares?

I am anger commenting, because I'm just sick of this, but this article still says nothing to convince me that any of this matters.


Right. To be clear, the purpose of the article is not 'zounds, C and C++ have footguns!' but 'C and C++ have footguns – yeah, this is not exactly breaking news – but here is a hopefully helpful summary of where they are, why they exist, and what you can do to avoid them'.

If you are already satisfied you know how to avoid them, and you don't need any more help with that, then you are not the target audience, and should by all means ignore the article.

But tyfighter has a point. People take such articles and use them to beat up anyone writing C++, arguing that they are stupid to use such an undependable tool.

So, yes, people who know what they're doing can ignore such articles, as a first-order effect. But there are second-order effects from such articles, and while they don't change anything, they are rather unpleasant. Hence tyfighter's anger - he gets tired of being on the receiving end of the fallout from such articles.

I do actually sympathize with that! I tried to keep a level tone, and maximize the ratio of useful information to flame-war ammunition, but that ratio unfortunately has an upper bound well short of infinity.
You haven't changed but compilers have changed, unfortunately. Unless you stick to -O0 or flags like -fno-strict-aliasing all the time, chances are your UB-ridden code will break in the future as more powerful compilers exploit more UB. So that's why you have to care now even if you didn't before. (Or you can argue that optimizations should be turned off, which is indeed another valid, though uncommon, answer, preferred by djb for example.)
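
To make the strict-aliasing case concrete, here is a minimal sketch of type punning written so it does not depend on -fno-strict-aliasing. The function name `float_bits` is made up for illustration, and the sketch assumes a 32-bit IEEE 754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "assumes a 32-bit float");

/* The aliasing-violating form, shown only as a comment:
 *     return *(uint32_t *)&f;   // reads a float object through a uint32_t lvalue
 * An optimizer that assumes strict aliasing is free to reorder or drop it.
 */

/* Defined alternative: copy the object representation byte by byte. */
uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers typically turn the `memcpy` into a single register move, so the defined form usually costs nothing.
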
Actually, I have changed, and I've changed corporate C/C++ MANY times to satisfy compiler upgrades. It is always just shitty code. It's never something insidious. Bugs happen. It probably wasn't your intention, but you've made UB sound pretty awesome.
If you are mainly dealing with shitty code, which frequently contains UB, then most of the UB you've encountered will have come from shitty code. That doesn't mean all UB indicates shitty code.
> This is yet another article that goes on and on about nonsensical situations that are just shitty code.

> Who cares?

The short flippant answer is: because everyone writes shitty code at some point. It generally doesn't get committed or released, but during development, shitty buggy code with Undefined Behavior happens.

Here's a concrete example of some code that I actually wrote (simplified greatly so it could be a small illustrative example): https://godbolt.org/z/xzehrWE57

Knowing about UB is a useful way to describe what's going on in this code example, and why the compiler is doing what it's doing. If you see your code behaving in "impossible" ways, knowing about UB can give you some hints about where to look.

As an engineer, my job isn't to write code, it's to deliver systems that do specific things. That means that I need to understand the defined behavior of the code I put into the system. Undefined behavior anywhere means you lack defined behavior everywhere in C/C++.

You can't work around this by writing more code or eliminate undefined behavior with tools like linters and tests. Your one and only option is to write perfect code that only has defined behavior. The number of people that can accomplish this in practice rounds to zero.

So yeah, how can you not care about UB? It's the semantic elephant in the room. Every conversation has to include it, implicitly or not.

What universe are you working in where you think ANY of that is actually true? In the land of reality where I live and work (I work in hardware), I'm not constructing philosophical prose about well-defined systems. This is another bad faith argument where undefined behavior is made out to be some house of cards. I hate to break it to you, but every computer and all its software you've ever used is a monument to the glory of undefined behavior, because people just didn't worry about it.
I'm exceedingly well-aware of how prevalent UB is and how "rarely" it actually turns into an issue in practice. The problem is that you have no way of knowing when or if a particular instance of UB will be dangerous. Even if you somehow know the impact today, that can change without warning in the future.

There's a wealth of studies on this subject, like this one [0] documenting cases where undefined behavior leads to miscompilations or examples like [1] where undefined behavior leads to security vulnerabilities. There's a quote from that second link that's deeply applicable here:

> This blog post provides an exploit technique demonstrating that treating these bugs as universally innocuous often leads to faulty evaluations of their relevance to security.

[0] http://dx.doi.org/10.1145/2517349.2522728

[1] https://googleprojectzero.blogspot.com/2023/01/exploiting-nu...

Aside from the contrived examples in the paper, the rest are bugs. The kernel exploit was just a lack of a NULL check; another bug. Bugs are going to happen, and they're going to have unpredictable consequences. What does that have to do with the language and undefined behavior? These are all just more evidence of needing to know what you're doing if you're going to write code at this level, but really not because the vast majority of people aren't writing code where bugs in the form of crashes or security exploits will have serious consequences or can't be fixed.
I'm not sure what you're going for by trying to call the examples I linked bugs. Yes...?

The issue is that you can't solve these at the code level. The kernel vuln could have been solved by a null check only because the kernel build system explicitly tells the compiler not to omit null checks as a fix for earlier exploits [0] caused by the language allowing the compiler to omit null checks.

I don't think it's reasonable to brush these off as things that only affect "serious" code. For one, someone needs to write that important code and history has repeatedly demonstrated that even the best programmers write UB occasionally. Secondly, "important code" is pretty much the biggest remaining niche for large scale C development, and C++ to a lesser extent. Very few people are using Ada/SPARK for safety critical development, for example. Compilers have also become significantly more aggressive at optimizing against UB and security significantly more important, which means this problem is far worse than it was 30 years ago.

[0] https://lwn.net/Articles/342420/

UB is far from the only source of systems not doing the desired thing - writing code that ends up at UB is as wrong as writing code that was written with an incorrect understanding of the invoked behavior.

Sure, the neat trick of a+1<a not working is perhaps undesirable, but, even if signed addition was defined to wrap, in most contexts an "a+1" subtracting four billion is not gonna be the specific thing you want it to do in your system.
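
For readers who have not seen the a+1<a trick, a minimal sketch of a check that asks the same question without overflowing. The names `increment_overflows` and `increment` are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* The UB-dependent form from the comment above, shown only as a comment:
 *     if (a + 1 < a) { ... }   // a + 1 overflows when a == INT_MAX
 * Since signed overflow is undefined, a compiler may treat the test as always false.
 */

/* Defined for every int: asks the question before doing the arithmetic. */
bool increment_overflows(int a) {
    return a == INT_MAX;
}

/* Hypothetical caller contract: only call this when the increment is known to fit. */
int increment(int a) {
    assert(!increment_overflows(a));
    return a + 1;
}
```

Whether the caller then saturates, wraps, or reports an error is still a design decision, which is the point made above: the defined check does not by itself make the result useful.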

Alternatively, signed overflow could be defined to return exactly 31415, which would be very concrete defined behavior, but barely if at all more useful compared to it being UB.

I hope I didn't imply that UB was the only source of bugs. It obviously isn't. It's just the only source of bugs that has the side effect of undefining the semantics of all your other code.

Just for fun let's take your example and say signed overflow returns integer pi. That now means the compiler has to implement your (hypothetical) next line checking if the result is 31415 rather than omitting it under the assumption that it's unreachable because it would imply UB. All of that code suddenly has defined behavior, even if it's silly.

But what does it get you that it's a "defined but completely unusable value" versus "undefined"? Indexing an array by it, adding it to some previously-meaningful value, or doing anything else with it, is still gonna all do practically arbitrary things.

I suppose in some cases it can lead to bugs being harder to exploit, but it's still a bug and still wrong and still should be fixed. Being defined is not a get out of exploitability free card.

(ok I do have one case where "defined but completely arbitrary" is actually meaningful over "undefined" with no reasonable alternative in C - for a floating-point x, "x==(int)x" for checking if x exactly fits in an int - e.g. gcc on aarch64 or x86+AVX (requiring -fno-trapping-math for whatever reason) optimizes that to "x==floor(x)" as an fp-to-integer cast is undefined on overflowing result)
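
A minimal sketch of a range-checked variant of that test, which never performs the out-of-range cast. The name `fits_in_int` is made up, and the sketch assumes `int` is at most 32 bits and `double` is IEEE 754 binary64, so both limits convert exactly.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption for this sketch: INT_MIN and INT_MAX are exactly representable as double. */
static_assert(sizeof(int) * CHAR_BIT <= 32, "assumes int is at most 32 bits");

/* Returns true when x holds an integer value that an int can represent. */
bool fits_in_int(double x) {
    /* Written as a negated conjunction so that NaN is rejected too. */
    if (!(x >= (double)INT_MIN && x <= (double)INT_MAX)) {
        return false;
    }
    /* The cast is now in range, so it is defined and truncates toward zero. */
    return x == (double)(int)x;
}
```

This is more verbose than "x==(int)x", which is roughly the complaint above: the short form is the natural thing to write, and it is the one that is undefined on overflow.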

It means you could know what the code will do, that's it. Even that's useful though. It means you can write complete formal models of the language and apply them against your code. The current situation is that you can only build partial formal models, and the assumptions those models rely on evaporate in the presence of UB. It's a really shitty way to do proofs.

Not knowing what the code will do also means that most of the safety critical code in your life is verified through a checkbox that essentially says "I promise there's no undefined behavior". For example, here's what MISRA says about undefined behavior:

    Rule 1.3: There shall be no occurrence of undefined or critical unspecified behaviour

    Analysis: Undecidable, System
It'd be nice to have at least the potential to analyze the code both as one of the people writing safety-critical code and a person who uses cars, planes, trains, etc.
You can absolutely write formal models in the presence of UB - encountering UB is just a call to do_anything(), and the set of scenarios in which UB happens is itself well-defined. Determining whether any UB can happen is as "undecidable" as determining whether the program follows a given specification - undecidable in the general case, but likely decidable for most specific cases.

Time travel may feel a little funky as you end up not being able to ensure anything leading up to UB happened, but that might not matter much - even if you have "shut_down_engines(); UB();" and are afraid of engines not ever getting shut down, the UB could equivalently also just run start_engines_back_up(), or even without UB some later code sees your off-by-four-billion number and thinks it really needs to (though yes you could have some truly-supposed-to-be-irreversible actions).

I'm pretty sure engineers expected to follow "there shall be no occurrence of UB" are also expected to follow "there shall be no occurrence of behavior we didn't ask you to write" in general - in a car/plane/train integer overflow is likely gonna result in some pretty undesirable behavior regardless of whether that's because the compiler messed with it or because now all your calculations are off by four billion. (and sometimes the compiler can even optimize based on UB to some more desirable code, e.g. "x-y<0" to "x<y" for signed integers, or expanding the range of lengths a loop works on by promoting the index variable)

And you do have UB sanitizers (and perhaps it'd be neat to have compilers have an option to define as much as is reasonable for absolutely critical software that for whatever reason was written in C).
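
In GCC and Clang the sanitizer in question is enabled with -fsanitize=undefined, which reports signed overflow at run time. Where a check has to stay in the shipped code, a minimal sketch of a checked addition looks like this; the name `checked_add` is made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Stores a + b in *out and returns true, or returns false if the sum would overflow.
 * The range tests themselves never overflow. */
bool checked_add(int a, int b, int *out) {
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

The caller still has to decide what a `false` result means for the system, so this moves the problem to a place where it is visible rather than removing it.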

And you cannot even meaningfully have an equivalent to sanitizers on defined operations - if an operation is explicitly defined, people may rely on it, and therefore it is unacceptable to ever warn on it! (ok Rust does do a funky thing of making integer overflow trap on debug builds, and be defined to wrap on release ones, but to me this does not seem like a reasonable approach to have on many things)

Because behavior does eventually get defined somewhere. Just because it's not defined in the C standard does not mean you can't reason about it.
No, if it was defined somewhere, it'd have a consistent behavior and it wouldn't "time-travel" the way UB can. The word for this in the standards is unspecified behavior. Undefined behavior doesn't need to have any requirements. Different parts of the toolchain and runtime environment (or even different compiler passes) may assume different behaviors for the construct. Even different calls to the same function with the same arguments may produce different behaviors.

Let's walk through a simple example to make this clear. Let's assume you have a macro function foo() that triggers some trivial UB, perhaps integer overflow. Let's also say that this macro function is called the same way in two different translation units. Because there are no requirements on UB by definition, there's no guarantee that those calls will do the same thing, even on the same runtime, using the same compiler, with the same flags. Even the same line of code called with the same arguments may see different things every time, because again there are no required behaviors.

Even code that does not itself trigger UB, but is on an execution path with UB does not have a defined behavior and will commonly be omitted by optimizing compilers like GCC. This has resulted in Linux vulnerabilities where null pointer checks were omitted from the actual binary because other code was "proven" by the compiler to dereference the pointer first.
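
A minimal sketch of that pattern, with the problematic ordering shown only as a comment. The `struct device` type and the `read_flags` function are hypothetical stand-ins, not the actual kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration. */
struct device {
    int flags;
};

/* Problematic ordering, shown only as a comment:
 *     int flags = dev->flags;      // dereference first
 *     if (dev == NULL) return -1;  // the compiler may delete this check, since
 *                                  // dev == NULL here would already imply UB
 *     return flags;
 */

/* Check first, dereference second: the check cannot be reasoned away. */
int read_flags(const struct device *dev) {
    if (dev == NULL) {
        return -1;
    }
    /* Hypothetical invariant: flags are non-negative, so -1 is unambiguous as an error. */
    assert(dev->flags >= 0);
    return dev->flags;
}
```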

>Because there are no requirements on UB by definition, there's no guarantee that those calls will do the same thing, even on the same runtime, using the same compiler, with the same flags.

Reread my comment. You are talking about behavior not defined by the C standard which I addressed in that comment. Compilers are deterministic. Reproducible builds are a thing.

Reproducibility is an entirely unrelated issue. The same compiler can produce different assembly for the same code depending on the surrounding context, or any number of other reasons. A reproducible build just means that you'll get the same binary each time you build it. Furthermore, the same generated assembly can produce different results each time it's run, as data races do. In that case, the only "definition" comes down to the essentially unknowable physical state of the system.
>Reproducibility is an entirely unrelated issue.

No, reproducibility is about having a defined output for a given source code and toolchain.

Yeah, no. Yes, in theory undefined behavior can destroy your entire program. In practice? Not so much.

I do not care about bogeymen that exist in theory. I don't even care about bogeymen that affect your code. I only care about bogeymen that actually affect my code.

As a user, I do care when people who declare that UB is not a problem because "you just have to write good code" still end up repeatedly shipping apps and libraries with vulnerabilities in them. Which with C and C++ specifically happens all the time, and much more often than in languages with significantly less UB. The proof is in the pudding.
>"Undefined behavior anywhere means you lack defined behavior everywhere in C/C++."

Well, stop programming then. Undefined behavior is everywhere. Your hardware, CPU microcode, any software written in any language etc. etc.

>"As an engineer"

Your statements suggest otherwise.

Overall I agree with you. But the people writing the C++ standard library have to care.