Hacker News new | ask | show | jobs
by ithkuil 2956 days ago
Yep the correct way is:

union { int i; float f;} u; u.i=*i;

Now u.f contains the same bit pattern as the integer pointed to by "i" but interpreted as a float.

Compilers know how to generate code efficiently out of this.

Iirc this is implementation defined behavior, e.g. things like endianness alignment or padding, but it's not undefined behavior, whose evil effects can bubble up in unexpected parts of the program that apparently don't even touch the bad cast.

4 comments

No, that's also incorrect. The correct way is:

    float f;
    memcpy(&f, i, sizeof f);
Don't worry about the implied overhead of the call to memcpy; the compiler will replace it with an intrinsic and emit the expected movss instruction.
Accessing through a union is legal in C, no? I thought that this was only disallowed in C++.
Maybe? There is wording in the C standard (since C03 TC3 I think) that allow reading accessing from non active members of the union, but the wording is unclear and implementations have taken a very narrow interpretation of it (i.e. the access path need to be explicitly through an union type and the actual object needs to be an union). See the GCC documentation on union and aliasing for example.
It's explicitly legal in both C99 and C11, but C99 forgot to remove it from its master (non-normative) list of undefined behavior. It's up for debate whether or not it's legal in C++ (ISTR a C++0x draft that copied the C99 changes to unions, but then they added unrestricted unions and completely changed the verbiage to have almost no relation to the C99 text).
It mostly isn't legal, though gcc explicitly promises to support it and Microsoft's compiler doesn't even do the optimizations that would cause trouble.
Wow, never realized the pointer cast version is undefined behaviour in C. Is C++ equivalent via reinterpret_cast undefined behaviour too?

Back in my C++ days, I would much prefer pointer-cast to union-struct, because the latter was subject to alignment/padding issues you mention, while for the former, I know of no desktop architecture that could mess up a int* -> float* cast.

I'm also doubly surprised because being able to do that stuff is what I'd consider the selling point of C/C++ over higher-level languages.

Pretty much all pointer casts where you would use reinterpret_cast in C++ are undefined behavior, in both C and C++. The underlying rule for strict aliasing is essentially the same in both languages (you can only access an lvalue via a pointer of its type, with possibly differing signedness, const, volatile, or via a char pointer, or via a field of a struct that has the same initial layout if you're in that same initial layout portion).

Yeah, C and C++ aren't the language most people think they are, and the strict aliasing rules are the most clear manifestation of that.

Trouble is, C was that language. Lots of people came to depend on this. The world needs a language with this ability.

The replacement is often "gcc with inline assembly" or "gcc with lots of __attribute__ extensions such as __may_alias__" or "gcc with optimizations disabled".

The need doesn't go away, despite the wishes of a language committee now dominated by compiler suppliers. From day 1, long before the language was standardized and even before the original (not ANSI) book was published, C was being used with aliasing. It was, after all, developed to support the creation of Unics (before that became UNIX).

reinterpret_cast in C++ has essentially the same rules, with exceptions made for const qualifiers and such, as well as void , char , and std::byte *. Use a memcpy instead.
Nope, it's undefined:

> Accessing an object by means of any other lvalue expression (other than unsigned char) is undefined behavior

https://wiki.sei.cmu.edu/confluence/display/c/EXP39-C.+Do+no...

I don't understand your comment. The link you pasted explicitly mentions the union example as a "compliant solution".

That said, memcpy and a compiler that understand the memcpy intrinsic is probably safer; that said I have a vague memory of hitting an embedded compiler where memcpy wasn't an intrinsic

This is technically undefined in C, but most compilers define it to work as expected as an extension to the language.