| The arguments in this blogpost are fundamentally flawed. The fact that they opened a bug based on them but got shut down should have raised red flags. When compiling and running a C program, the only thing that matters is what the C abstract machine does. Programs that exhibit UB in the abstract machine are allowed to do anything. Trying to scope that down with arguments of the form "but what the hardware does is X" is fundamentally flawed: anything means anything, and what the hardware does doesn't change that, so it doesn't matter. The blogpost "What The Hardware Does is not What Your Program Does" explains this in more detail and with more examples. https://www.ralfj.de/blog/2019/07/14/uninit.html |
I think it's also worth considering WHY compilers (and the C standard) make these kinds of assumptions. For starters, not all hardware platforms allow unaligned accesses at all. Even on x86, where they are supported, you want to avoid unaligned reads because they can be up to 2x slower than aligned accesses. God forbid you try to use unaligned atomics: while technically supported by x86, an atomic that straddles a cache line (a split lock) is 200x slower than using the LOCK prefix with an aligned read.[^1] The fact that you need to go through escape hatches to get the compiler to generate unaligned loads and stores is a good thing, because it helps prevent people from writing code with mysterious slowdowns.
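To make "escape hatch" concrete, here is a minimal sketch of the usual portable one in C: copying the bytes with `memcpy` instead of dereferencing a misaligned pointer. The helper name `load_u32_unaligned` is made up for illustration, and whether the compiler turns this into a single load instruction depends on the target and optimization level.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (name is illustrative, not from any library):
 * read a 32-bit value from an address that may not be 4-byte aligned.
 * memcpy has no alignment requirement, so this avoids the UB of
 * dereferencing a misaligned uint32_t pointer. */
static uint32_t load_u32_unaligned(const void *src) {
    uint32_t value;
    memcpy(&value, src, sizeof value);  /* byte-wise copy into an aligned local */
    return value;
}
```

Calling it on something like `buf + 1`, where `buf` is an `unsigned char` array, is well defined, whereas `*(uint32_t *)(buf + 1)` is not.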
A function that takes two pointers of the same type already has to pessimize loads and stores on the assumption that the pointers could alias. That is to say, if your function takes `int *p, int *q`, then a store through `p` forces a reload of `*q`, because `p` and `q` could point to the same thing. Thankfully, in some situations the compiler can figure out that `p` and `q` have different addresses in a given context and therefore can't alias, which lets it generate faster code (by avoiding redundant loads). If `p` and `q` were allowed to alias even when they have different addresses (for example, because a misaligned `int` partially overlaps another one), this would all go out the window and you'd basically need to assume that all pointers could alias in any situation. This would be TERRIBLE for performance.
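A small sketch of the aliasing case described above, in C. The function name `store_then_load` and the constant 42 are made up for illustration. Because the function can't know whether `p` and `q` are the same pointer, a compiler generally has to read `*q` again after the store instead of reusing the earlier value.

```c
/* Hypothetical example: p and q may or may not point to the same int. */
int store_then_load(int *p, int *q) {
    int before = *q;     /* first load of *q */
    *p = 42;             /* if p == q, this store also changes *q */
    return before + *q;  /* so *q must be loaded again here */
}
```

With distinct objects `a = 1` and `b = 2`, `store_then_load(&a, &b)` returns 4. With a single object `c = 5`, `store_then_load(&c, &c)` returns 47, because the store is visible through `q`. If the compiler can prove at a call site that the two addresses differ, it can reuse `before` and skip the second load.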
[^1]: https://rigtorp.se/split-locks/