Hacker News
by flohofwoe 32 days ago
> The examples are unequivocally UB. Full stop.

Tbh, already the first example (unaligned pointer access) is bogus and the C standard should be fixed (in the end the list of UB in the C standard is entirely "made up" and should be adapted to modern hardware, a lot of UB was important 30 years ago to allow optimizations on ancient CPUs, but a lot of those hardware restrictions are long gone).

In the end it's the CPU and not the compiler which decides whether an unaligned access is a problem or not. On most modern CPUs unaligned load/stores are no problem at all (not even a performance penalty unless you straddle a cache line). There's no point in restricting the entire C standard because of the behaviour of a few esoteric CPUs that are stuck in the past.

PS: we also need to stop with the "what if there is a CPU that..." discussions. The C standard should follow the current hardware, and not care about 40 year old CPUs or theoretical future CPU architectures. If esoteric CPUs need to be supported, compilers can do that with non-standard extensions.
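For reference, the usual way to do an unaligned access today without relying on the UB is to go through memcpy. This is only a sketch; the helper names are made up for illustration. On targets where unaligned accesses are cheap, mainstream optimizing compilers typically turn this into a single load or store, but that is an observation about current compilers, not something the standard promises.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address with no
   alignment guarantee. memcpy has no alignment requirement, so this is
   well-defined even when p is an odd address. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the matching store. */
static void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

Calling `store_u32_unaligned(buf + 1, x)` on a byte buffer followed by `load_u32_unaligned(buf + 1)` gives back `x` on any conforming implementation, regardless of what the CPU thinks about alignment.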


Not having unaligned access in the language allows the compiler to assume that, for basic types where the alignment is at least the size, if two addresses are different then they don't alias and writes to one can't change the result of reads from the other. That's a very useful assumption to be able to make for optimization - much more useful than yolocasting pointers in a way that could get you unaligned ones.
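A small sketch of the overlap being described, assuming a 4-byte `uint32_t`; the function name is made up. Everything goes through memcpy so the demo itself stays well-defined. Two different addresses that are both 4-byte aligned can never overlap, while two different addresses one byte apart do.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical demo: store one value at base, store another at
   base + offset, then report whether the first value was changed. */
static int second_store_clobbers_first(unsigned char *base, size_t offset)
{
    uint32_t first = 0x11111111u, second = 0x22222222u, check;

    memcpy(base, &first, sizeof first);
    memcpy(base + offset, &second, sizeof second);
    memcpy(&check, base, sizeof check);

    return check != first;
}
```

With an 8-byte buffer, offset 4 (the next aligned slot) leaves the first value intact, while offset 1 (a different but misaligned address) overwrites three of its bytes. If misaligned `uint32_t *` pointers were allowed, the compiler would have to consider the second case for every pair of pointers it cannot prove equal.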
> if two addresses are different ...

Eh, if the compiler knows that two addresses are different at compile time, it also knows how big the difference is.

Usually this is not the case.
Indeed one of the fun LLVM bugs is that it can arrive at a situation in which it believes pointer A and pointer B are definitely not equal (weird given what's about to happen but OK that's potentially fine...) then we ask for their addresses† as integers X and Y, LLVM insists those integers aren't equal either because the pointers weren't (which as we're about to see is wrong) and then we subtract X - Y or Y - X and the answer either way is zero. Awkward. The integers were definitely equal.

† Although on a real modern CPU the pointer "is" just an address, notionally it has three components, the address, an address space (modern machines typically only have one) and a "provenance".

I agree. I meant to elaborate more on how to think of UB.

For most C software on x86_64, UB is "fine" with very strong bunny ears. But it is preferable for one to, shall we say, write UB intentionally rather than accidentally and unknowingly. Being aware of all the minefields leads to more respect for the dangers of C code; it makes one question literally everything, and that would hopefully result in more correct code, more often.

On that note, on some RISC-V cores unaligned access can turn a single load into hundreds of instructions.

I think the problem is just that C is under specified for what we expect a language to provide in the modern age. It is still a great language, but the edges are sharp.

How is UB fine? It can make your programme rather exploitable, depending on compiler. And the compiler might change her mind tomorrow.
Undefined means that the ISO C doesn't define the behavior. An implementation is free to do so.
If they do, that is no longer an implementation of C. It is a dialect of C, and there are many (GNU C being the most popular), but there are real drawbacks to using dialects.

This is in contrast to the other category that exists, which is "implementation-defined".
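To make the two categories concrete, here is a sketch with one example of each; the function names are made up for illustration. Right-shifting a negative value is implementation-defined, so the implementation has to pick a result and document it. Signed overflow is undefined, so portable code has to avoid it up front.

```c
#include <limits.h>

/* Implementation-defined: the result of right-shifting a negative
   value. Most compilers document an arithmetic shift (giving -4 here),
   but the standard only requires that the choice be documented. */
static int shift_negative(void)
{
    return -8 >> 1;
}

/* Undefined: signed integer overflow. This hypothetical helper checks
   the range first, so the addition itself can never overflow.
   Returns 1 and stores the sum on success, 0 if it would overflow. */
static int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```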

The thing is that the actual compiler behaviour matters more for real-world projects than what the C standard says. E.g. the C standard was always retroactive, it merely tried to rein in wildly different compiler behaviour at the time when the standard was new. It mostly succeeded, but still the most useful C and C++ compiler features are living in non-standard extensions.
Unaligned access being fine in one architecture, but not in others would create separate dialects, regardless of being blessed by ISO C.

Just don't do unaligned access, it's a dialect that doesn't exist currently, and should never exist.

> Unaligned access being fine in one architecture, but not in others would create separate dialects, regardless of being blessed by ISO C.

That doesn't mean unaligned access would need to be UB. It could be implementation defined. Or could just be defined to result in an error on all machines.

Implementation-defined without further constraining the behavior is not much better than undefined.

Defining it to always error would add overhead even for proper aligned access on x86, as the generated code would need to explicitly check in many cases.

> If they do, that is no longer an implementation of C.

This is plain wrong. Undefined behaviour means the C standard places no restriction on the behaviour of the program, so the behaviour is whatever the implementation chooses to emit. An implementation can very well choose to emit any program it pleases, including programs that encrypt your harddisk, but also programs that stick to well defined rules.

Sure, but the point is that code written against such a compiler is not C and is not portable. It is written in a dialect of C, and that comes with drawbacks.

Writing C (or any language) means adhering to the standard, because that's the definition of the language.

Maybe it’s a generation thing. Languages like ML and Lisp have many implementations, while newer languages like Perl and Python are steered by a single organization. It’s way easier for the latter to have a single source of truth.

The C standard reminds me of Posix. You have a rough guideline if you ever wanted to port a program, but you actually have to learn the new compiler and its actual behavior before doing so.

C started off as a single implementation, then a bunch of implementations, and only later the standard.

There are multiple implementations of Python, but you are right that CPython is the big one. Part of that dynamic is that CPython runs on nearly everything well enough (well, as well as Python runs), so that we don't really need multiple implementations. Unlike the first C compilers.

Of course, Lisp is a family of languages these days. Scheme and Common Lisp have many implementations, but Racket or Arc only have one.

You can't make any useful software in "Portable C" - or any portable language for that matter.

Side effects matter, and they are always non-portable/implementation defined/dependent on the hardware.

What printf() actually does is implementation defined - what does "printing" mean, does a console even exist? Maybe a user expects it to show graphical ascii/utf8 glyphs on an LCD display? Well, not every computer has that, so now what?

> You can't make any useful software in "Portable C" - or any portable language for that matter.

Have you heard of Java or even Python or JavaScript?

> Side effects matter, and they are always non-portable/implementation defined/dependent on the hardware.

Granted. But how does the need for implementation defined excuse undefined behaviour?

> What printf() actually does is implementation defined [...]

> Well, not every computer has that, so now what?

The standard can be written conditionally: 'if the computer has display, printf shall show something.'

I agree, that most practical programs will rely on unportable behaviour, but

> What printf() actually does is implementation defined - what does "printing" mean, does a console even exist? Maybe a user expects it to show graphical ascii/utf8 glyphs on an LCD display? Well, not every computer has that, so now what?

You can very well write a program that doesn't make an assumption about any of those things. In fact you should, because the user is the arbiter of what environment your program gets invoked in and what it gets connected to. Writing a program that makes assumptions about the specific behaviour of stdout is going to be highly impractical and annoying, and it also violates the abstraction and interface that stdout is. This consideration isn't just valid for stdout, but also for any other interface your program naturally interacts with.

> Well, not every computer has that, so now what?

If stdout is not available or can't process your data, printf is going to return a negative value (and on POSIX systems set errno), and then you can deal with that.
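A minimal sketch of what "deal with that" can look like; `emit_line` is a made-up helper. It makes no assumption about what the stream is connected to and only looks at the status the standard library reports.

```c
#include <stdio.h>

/* Hypothetical helper: write one line to a stream and report failure
   to the caller. Returns 0 on success, -1 on failure. */
static int emit_line(FILE *out, const char *text)
{
    /* fprintf returns a negative value on an output or encoding error. */
    if (fprintf(out, "%s\n", text) < 0)
        return -1;

    /* Output may be buffered, so a write error can surface only when
       the buffer is flushed. */
    if (fflush(out) == EOF)
        return -1;

    return 0;
}
```

A caller would do something like `if (emit_line(stdout, "hello") != 0) { /* report or bail out */ }`. Whether the bytes end up on a terminal, in a pipe or in a file is up to whoever invoked the program.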

There are still modern CPUs that don't support misaligned access. It would be insane for C to mandate that misaligned accesses are supported.

However I do agree that just saying "the behaviour is undefined" is an unhelpful cop-out. They could easily say something like "non-atomic misaligned accesses either succeed or trap" or something like that.

> In the end it's the CPU and not the compiler which decides whether an unaligned access is a problem or not.

Not just the CPU - memory decides as well. MMIO devices often don't support misaligned accesses.

> They could easily say something like "non-atomic misaligned accesses either succeed or trap" or something like that.

That means that the compiler must emit the read, even if the value is already known or never used, as it might trap. There is a reason for the UB!

No it doesn't. Compilers are only required to emit the read for volatile types. If the type is non-volatile, misaligned, and can be optimised out then it would be perfectly fine to omit it (that would be the "succeed" option).
If a trap is observable behaviour, then the compiler either needs to add code that checks for the condition and then traps explicitly, or it needs to actually perform the read. Currently it can be optimized out, because it is UB.
I think you misunderstood my suggestion. It isn't that misaligned accesses must either all succeed or all fail. That's not possible in general because of MMIO devices.

The suggestion is that each individual access must either succeed or trap. Those are the only possible outcomes, but different accesses can result in different outcomes.

And you misunderstand me. Your proposal means that the compiler must emit an individual access with the right values at the right time. If an access may succeed or fail, then the compiler cannot just convert it into an aligned read, or into no read at all.

If your proposal allows the compiler to treat a possibly trapping access as if it could not trap, and to emit code that performs all kinds of side effects before the access (side effects that would be logically independent of it if it did not trap), then you are back to undefined behaviour under a different name.

You're merely attacking his particular suggestion and using this as an argument to defend UB, when those are completely independent concerns.

What people want is for a compiler that assumes that all pointers are aligned to use an aligned store or load instruction whenever the compiler wants to issue such an instruction. There is no need for UB here.

In other words, they want the compiler to stick with the decision it made and not randomly say "I can't do the thing I've been doing correctly for decades, because that's UB, my hands are tied, I must ruin the code, there's no other way."

Yes, I am "attacking" his particular suggestion with the reasoning for this particular UB. I disagree that these concerns are independent, as the reasoning for the UB is often that any choice per se would limit the possible compiler behaviour. It is not a particular choice that would be limiting, but the act of picking one at all.

> What people want is for a compiler that assumes that all pointers are aligned to use an aligned store or load instruction whenever the compiler wants to issue such an instruction.

That requires a mental model of a compiler that runs through the code linearly and emits instructions in the order defined by the code. That's not what is happening. Current compilers model the value flow through the code, and then emit a program that happens to output the same values for valid input.

> In other words, they want the compiler to stick with the decision it made

What often happens instead is that the compiler doesn't even make a decision at all, because it is completely irrelevant.

On hardware that doesn't support it, misaligned loads could be compiled to multiple loads and shifts. Probably not great for performance, and it doesn't work if you need it to be atomic, but it isn't impossible.
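Roughly what "multiple loads and shifts" means, written out by hand for a little-endian 32-bit value. This is a sketch with a made-up function name, not what any particular compiler emits.

```c
#include <stdint.h>

/* Hypothetical helper: assemble a little-endian 32-bit value one byte
   at a time. Byte accesses have no alignment requirement, so this
   works at any address. It is four separate loads, so it is not
   atomic. */
static uint32_t load_le32_bytewise(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For the bytes `78 56 34 12` starting at an odd address this yields `0x12345678`.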
That is only really possible if you know the pointer is misaligned at compile time (which does happen, e.g. for packed structs). The examples in the article are for runtime misalignment. It would be crazy to generate code so that every function checked if every access was aligned at runtime.

(Note the normal way to handle that if the hardware doesn't actually support it is for the access to trap and then the OS or firmware emulates it.)
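For a sense of what a per-access runtime check would involve, here is a sketch of the test itself; `is_aligned` is a made-up name. It assumes the usual flat address space where converting a pointer to `uintptr_t` yields the address, which is implementation-defined rather than guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment`
   (which must be nonzero), 0 otherwise. Assumes the integer value of
   the pointer is its address. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

A compiler that had to guard every `uint32_t` access would effectively insert `is_aligned(p, _Alignof(uint32_t))` plus a branch in front of each load or store it could not prove aligned.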

That still requires detecting when a misaligned load happens.
For x86 SSE there are aligned instructions that will trap on unaligned access.
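A sketch of that distinction using the SSE intrinsics, assuming a compiler that defines `__SSE__` when SSE is enabled (GCC and Clang do); the function name is made up. `_mm_loadu_ps` accepts any address, while `_mm_load_ps` requires 16-byte alignment and typically compiles to `movaps`, which faults on a misaligned address.

```c
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: sum four consecutive floats starting at p.
   p needs only normal float alignment, not 16-byte alignment. */
static float sum4(const float *p)
{
#if defined(__SSE__)
    float out[4];
    __m128 v = _mm_loadu_ps(p);   /* unaligned load: any address works */
    /* _mm_load_ps(p) here would require p to be 16-byte aligned. */
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
#else
    /* Scalar fallback for targets without SSE. */
    return p[0] + p[1] + p[2] + p[3];
#endif
}
```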