|
Yes there is tons of surprising and weird UB in C, but this article doesn't do a great job of showcasing it. It barely scratches the surface. Here's a way weirder example: volatile int x = 5;
printf("%d in hex is 0x%x.\n", x, x);
This is totally fine if x is just an int, but the volatile makes it UB. Why? 5.1.2.4.1 says any volatile access - including just reading it - is a side effect. 6.5.1.2 says that unsequenced side effects on the same scalar object (in this case, x) are UB. 6.5.3.3.8 tells us that the evaluations of function arguments are indeterminately sequenced w.r.t. each other.So in common parlance, a "data race" is any concurrent accesses to the same object from different threads, at least one of which is a write. In C, we can have a data race on a single thread and without any writes! |
> It barely scratches the surface.
I agree. The point of the post is not to enumerate and explain the implications of all 283 uses of the word "undefined" in the standard. Nor enumerate all the things that are undefined by omission.
The point of the post is to say it's not possible to avoid them. Or at least, no human since the invention of C in 1972 has.
And if it's not succeeded for 54 years, "try harder", or "just never make a mistake", is at least not the solution.
The (one!) exploitable flaw found by Mythos in OpenBSD was an impressive endorsement of the OpenBSD developers, and yet as the post says, I pointed it at the simplest of their code and found a heap of UB.
Now, is it exploitable that `find` also reads the uninitialized auto variable `status` (UB) from a `waitpid(&status)` before checking if `waitpid()` returned error? (not reported) I can't imagine an architecture or compiler where it would be, no.
FTA:
> The following is not an attempt at enumerating all the UB in the world. It’s merely making the case that UB is everywhere, and if nobody can do it right, how is it even fair to blame the programmer? My point is that ALL nontrivial C and C++ code has UB.