by hedora 1033 days ago
Honestly, I think you are both incorrect.

C has always had a concept of implementation defined behavior, and unaligned memory accesses used to be defined to work correctly on x86.

Intel added instructions that can’t handle unaligned access, so they broke that contract. I’d argue that it is an instruction set architecture bug.

Alternatively, Intel could argue that compilers shouldn’t emit vector instructions unless they can statically prove the pointer is aligned. That’s not feasible in general for languages like C/C++, so that’s a pretty weak defense of having the processor pay the overhead of supporting unaligned access on some, but not all, paths.


> C has always had a concept of implementation defined behavior, and unaligned memory accesses used to be defined to work correctly on x86.

There are a bunch of misconceptions here:

- unaligned loads were never implementation defined, they are undefined;

- even if they were implementation defined, this would give the compiler the choice of how to define them, not the instruction set;

- unaligned memory accesses on x86 for non-vector registers still work fine, so old instructions were not impacted and there's no bug. It's just that the expectations were not fulfilled for the new extension of those instructions.

Note: SIMD on x86 has unaligned instructions that used to be much slower (decoded differently) than their aligned counterparts.

For example, on Pentium 3 and Core 2, the unaligned instructions took twice as many cycles to execute. On modern x86 processors, it’s the same cycle count either way. The only perf penalty one should account for is crossing cache lines, which is generally a much smaller problem.

Here's a link to the final C89 draft spec (the ratified spec is paywalled):

https://www.open-std.org/JTC1/sc22/wg14/www/docs/n1256.pdf

From section 6.7.2.1, semantics #10:

> The alignment of the addressable storage unit is unspecified.

This is for struct field access, but it clearly implies the compiler can choose to use unaligned struct fields. Also, the sizes of the integer types are all implementation-defined.

Then #12:

> Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

Alignment is defined as:

> requirement that objects of a particular type be located on storage boundaries with addresses that are particular multiples of a byte address

It doesn't say which multiple. 1 is a multiple. (So is 0.5, just in case the compiler wants to go nuts with arcane code gen.) The spec even allows chars to be 7 bits. I didn't bother looking up the definition of byte in the spec for those architectures. (7 bits? 8 bits?)

In section 6.2.5, they talk about implementation-defined restrictions on integer types + alignment requirements:

> For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.

So, the alignment of integers has to be the same for signed + unsigned types. That still doesn't say byte-aligned integers are disallowed.
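
To see which multiples a given implementation actually picks, something like the sketch below could be used. It assumes a C11 compiler, since `_Alignof` postdates the draft quoted above, and the function names are made up for illustration. The printed values depend on the compiler and target, so no expected output is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Print the alignment requirement this implementation assigns to a few types.
 * _Alignof is C11; it is an assumption here, not part of the quoted draft. */
void print_alignments(void)
{
    printf("char:   %zu\n", _Alignof(char));
    printf("short:  %zu\n", _Alignof(short));
    printf("int:    %zu\n", _Alignof(int));
    printf("long:   %zu\n", _Alignof(long));
    printf("double: %zu\n", _Alignof(double));
}

/* Returns 1 if each signed type has the same alignment requirement as its
 * unsigned counterpart, which is what the quoted 6.2.5 text requires. */
int signed_unsigned_alignments_match(void)
{
    return _Alignof(short) == _Alignof(unsigned short)
        && _Alignof(int) == _Alignof(unsigned int)
        && _Alignof(long) == _Alignof(unsigned long);
}
```

Calling `print_alignments()` from `main` shows the values for one particular compiler and target; it says nothing about what other implementations choose.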

Later:

> An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned,

Again, the alignment behavior is clearly implementation-defined.
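
A rough sketch of checking an integer address before converting it to a pointer is below. The helper name `is_aligned_addr` is hypothetical, and it assumes the integer value of a `uintptr_t` corresponds directly to a flat byte address, which the standard does not promise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of align. align must be nonzero.
 * Assumes uintptr_t values map directly onto byte addresses, which is
 * itself implementation-defined. */
int is_aligned_addr(uintptr_t addr, size_t align)
{
    return addr % align == 0;
}
```

For example, code could test `is_aligned_addr(addr, _Alignof(int))` before converting `addr` to an `int *` and dereferencing it (`_Alignof` being C11).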

I can't find a definition of implementation in the spec, but it clearly includes the compiler, standard library, and operating system. There is this quote:

> For implementations with a signed zero (including all IEC60559 implementations)

According to the IEC60559 abstract: "An implementation of a floating-point system conforming to this standard may be realized entirely in software, entirely in hardware, or in any combination of software and hardware." I doubt they were trying to constrain floating point to be done in software by compilers, so it's pretty clear they intended to incorporate the physical hardware in the definition of the "implementation".

Later, they say:

> ...is defined if and only if the implementation supports the floating-point exception

which was definitely in the realm of hardware support back in 1989. Some later sections say that some macros (such as for FMA) are defined iff the implementation implements the primitive in hardware, and not just software.

Undefined and implementation defined are different in C. The number of bits in an int is implementation defined. Unaligned access is undefined.

See sibling comment. The alignment requirements are implementation defined, and any multiple is legal. 1 byte is definitely a legal multiple.

Unaligned memory access is UB period, no ifs no buts.

Loads of architectures can't do misaligned memory access. Even x86 has problems when variables span cache lines. The compiler usually deals with this for the programmer, e.g. by rounding the address down then doing multiple operations and splicing the result together.

Most modern architectures that target high performance implementations can do unaligned accesses, even ones crossing page boundaries.

Less common is support for atomic RMW access to unaligned locations. x86 does support it, but crossing a cache line makes the operation very slow.

Unaligned memory accesses are undefined behavior in C. If you're writing C, you should be abiding by C rules. "Used to work correctly" is more guesswork and ignorance than "abiding by C rules". In C, playing fast&loose with definitions hurts, BAD.
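
For what it's worth, the usual way to stay within the rules is to never form a misaligned `uint32_t *` in the first place. A minimal sketch, with hypothetical helper names, assuming `uint32_t` is available on the target:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in native byte order from any byte address.
 * memcpy has no alignment requirement on its arguments, so this is
 * defined behavior even when p is odd. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Same idea done byte by byte with a fixed little-endian layout, so the
 * result does not depend on the host's byte order. */
uint32_t load_u32_le(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

The thing to avoid is the cast-and-dereference form, `*(const uint32_t *)(buf + 1)`, which is the access being called undefined here. With optimization enabled, mainstream compilers typically turn the `memcpy` version into a single load on x86, so there is usually little reason to prefer the cast.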

Frankly, I'd be ashamed to write this blog post since the only thing it accomplishes is exposing its writers as not understanding the very thing they're signaling expertise on.

What makes you think they don't understand it? They acknowledge that it is UB. I read them as realistic, since they know that people rely on C compilers working a certain way. They even wrote an interpreter that detects UB: https://github.com/TrustInSoft/tis-interpreter

I understand why people like the compiler being able to leverage UB. I suspect this philosophy actually makes Trust-In-Soft more money: You could argue that if there was no UB, there would be no need for the tis-interpreter.

So isn't it in fact quite selfless that they encourage the world to optimize a bit less (spending more money on 'compute'), while standing to profit from the unintended behaviour they'd otherwise be contracted to help debug?

I made a comment a few levels up to a sibling where I point out the parts of the C89 spec that are relevant.

Alignment requirements for integers are implementation defined, not undefined behavior. On x86, the implementation used to define the alignment requirement to be one byte.

In fact, if you've done enough hardware register and bus-level (e.g., PCIe) programming, you'll quickly realize that there are all sorts of other exotic implementation-defined alignment constraints on modern systems.

Pretty much everything you wrote in that comment is wrong since you're interpreting the spec in a way that's clearly not what the spec describes (e.g. the spec is talking about alignment requirements for conversions, but you generalize it to "alignment requirements" which is dead wrong).

> C has always had a concept of implementation defined behavior,

Surely only after standardization tho?