Hacker News
by dahart 537 days ago
This code is technically UB in C++, right? [1] Has anyone run into a case where it actually didn’t work? Just curious. I’ve often assumed that if C++ compilers didn’t compile this code, all hell would break loose.

It might be nice to start sharing modern/safe versions of this snippet & the Quake thing. Is using memcpy the only option that is safe in both C and C++? That always felt really awkward to me.

[1] https://tttapa.github.io/Pages/Programming/Cpp/Practices/typ...

6 comments

Yes, most casts via pointers are formally UB in both C and C++, and you can induce weird behavior by compilers if you transmit casted pointers through a function boundary (see [0] for a standard example: notice how the write to *pf vanishes into thin air). Since people like doing it in practice, GCC and Clang have an -fno-strict-aliasing flag to disable this optimization, and the MSVC compiler doesn't use it in the first place (except for restrict pointers in C). They don't go too far with it regardless, since lots of code using the POSIX socket API casts around struct pointers as a matter of course.

Apart from memcpy(), the 'allowed' methods include unions in C (writing to one member and reading from another), and std::bit_cast<T>() (C++20) and std::start_lifetime_as<T>() (C++23) in C++.

[0] https://godbolt.org/z/dxMMfazoq
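
For reference, the memcpy() route mentioned above might look roughly like this. It is a minimal sketch, not anyone's production code: the names float_bits and bits_float are made up for illustration, and it assumes float is a 32-bit IEEE 754 binary32 type. The same function bodies also work as C once the headers are swapped.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float is 32 bits wide on the target.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Copy the object representation of a float into an integer (hypothetical name).
inline std::uint32_t float_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// The inverse direction (hypothetical name).
inline float bits_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Usage: assuming IEEE 754 binary32, 1.0f is 0x3F800000 and 0x40000000 is 2.0f.
inline void float_bits_example() {
    assert(float_bits(1.0f) == 0x3F800000u);
    assert(bits_float(0x40000000u) == 2.0f);
}
```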

There are two additional ways of making this work.

The first is to allocate the memory using char a[sizeof(float)]. In C, char pointers may alias anything, so then you can do pointer conversions that would normally be undefined behavior and it should work. The other option is to use the non-standard __attribute__((__may_alias__)) on the pointer.
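
A rough sketch of the __may_alias__ option, for GCC and Clang only. The attribute is a compiler extension rather than standard C or C++, and it is attached to a type that the pointer then points to. The names aliasing_u32 and float_bits_may_alias are invented for this example, and a 32-bit float is assumed.

```cpp
#include <cassert>
#include <cstdint>

// GCC/Clang extension: accesses through this type are exempt from
// type-based alias analysis, much like accesses through char.
typedef std::uint32_t __attribute__((__may_alias__)) aliasing_u32;

// Assumption: float is 32 bits wide on the target.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Read the bits of a float through a may_alias pointer (hypothetical name).
inline std::uint32_t float_bits_may_alias(const float& f) {
    return *reinterpret_cast<const aliasing_u32*>(&f);
}

// Usage: assuming IEEE 754 binary32, 1.0f is 0x3F800000.
inline void may_alias_example() {
    assert(float_bits_may_alias(1.0f) == 0x3F800000u);
}
```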

By the way, using union types for this is technically undefined behavior in the C and C++ standards, but GCC and Clang decided to make it defined as an implementation choice. Other compilers might not.

The union trick is actually defined in C.

And note that while char can alias anything, the reverse is not true: i.e. you can't generally cast a char array to anything else and expect sensible behaviour. There are ways to make this work (placement new in C++ for example), but it is not a way to escape TBAA: if you store a float in char array you can't then cast it to int with impunity.
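
To make that concrete, here is a small C++ sketch of the placement-new route. The function name is hypothetical and a 32-bit float is assumed. The buffer only provides storage; reading the float's bytes as an integer still goes through memcpy rather than a cast pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>

// Assumption: float is 32 bits wide on the target.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Hypothetical helper: create a float inside raw storage, then read its bits.
inline std::uint32_t store_float_then_read_bits(float value) {
    // Suitably aligned raw storage for one float.
    alignas(float) unsigned char storage[sizeof(float)];

    // Placement new starts the lifetime of a float object inside the buffer.
    float* pf = new (storage) float(value);

    // Accessing *pf is fine because a float really lives there. Casting
    // storage to uint32_t* and dereferencing would still violate the
    // aliasing rules, so the representation is copied out instead.
    std::uint32_t bits;
    std::memcpy(&bits, pf, sizeof bits);
    return bits;
}

// Usage: assuming IEEE 754 binary32, 1.0f is 0x3F800000.
inline void placement_new_example() {
    assert(store_float_then_read_bits(1.0f) == 0x3F800000u);
}
```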

To be more precise, it has been defined since C99[0]. In C89 it was undefined, but type punning is the most used/sensible behaviour, so they changed it in C99.

[0]: https://en.cppreference.com/w/c/language/union

That is a common misconception. DR 283 is a suggestion for an amendment that was filed 3 years after C99 was published:

https://open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm

It is not part of C99. It also is not part of the C standard since no subsequent C standard adopted it according to the GCC developers:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118141#c13

A read of the C11 standard draft, which would have this amendment if it were accepted by the C standards committee, shows that this has not been added:

https://open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

Type punning via union types is therefore undefined behavior unless your compiler implements an extension to define it like GCC and Clang do.

Hum, from your wg14 link: 6.5.2.3 paragraph 3 and footnote 95. I thought that was the note that was added in TC3.

Also the note is non-normative, so it is only clarifying preexisting behaviour.

But I'm far from an expert on the C standard. Also that was the C11 draft, maybe the note was removed before the final standard.

Edit: I believe the alias rules are in 6.5 paragraph 7; specifically:

> An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

[...]

>an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),

Edit2: neither the paragraphs nor the note have changed in the 202y Draft.

Oh that's interesting. I guess I should actually look at the standard instead of taking cppreference's word for it next time
Yes, it's 2025; I thought that we could at least assume C99 when talking about plain C :).

I'm probably an optimist.

It does not matter. The C99 standard does not define this behavior:

https://news.ycombinator.com/item?id=42568271m

For C++, `bit_cast<uint32_t>(0.f)` should be Well Defined, right? I'm curious, in C, is union-casting float->uint32_t also Perfectly Legal And Well Defined?

(I am not a C or C++ expert.)

It should be well-defined with reinterpret_cast<uint32_t&> though.
No, reinterpret_cast doesn't change the type of the underlying object. The rule in [basic.lval]/11 (against accessing casted values) applies to all glvalues, whether they come from pointers or references.
It would be better if compilers were fixed. The current treatment of UB is insane. The default should be no UB, with a pragma that lets you turn it on for specific parts. That used to be the case, and some of the world's most performant software was written for compilers that assumed neither the absence of pointer aliasing nor the absence of integer overflow (case in point: Quake).

The big problem, apart from "memcpy is a bit tedious to write", is that it is impossible to guarantee absence of UB in a large program. You will get inlined functions from random headers and suddenly someone somewhere is "dereferencing" a pointer by storing it as a reference, and then your null pointer checks are deleted. Or an integer overflows and your "i>=0" array bounds assert is deleted. I have seen that happen several times and each time the various bits of code all look innocent enough until you sit down and thoroughly read each function and the functions it calls and the functions they call. So we compile with the worst UB turned off (null is a valid address, integers can overflow, pointers can alias, enums can store any integer) and honestly, we cannot see a measurable difference in performance.

UB as it is currently implemented is simply an enormous footgun. It would be a lot more useful if there were a profiler that could find the parts of the code that would benefit from UB optimisations. Then we could focus on making those parts UB-safe, add asserts for the debug/testing builds, and turn on more optimisations for them. I am quite certain nobody can write a fully UB-free program. For example, did you know that multiplying two "unsigned short" variables can be UB? Can you guarantee that across all the template instantiations for custom vector types you've written you never multiply unsigned shorts? I'll leave it as an exercise to the reader as to why that is.
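
One reading of the unsigned short case, sketched under the assumption that int is 32 bits and unsigned short is 16 bits: both operands are promoted to signed int before the multiplication, so 65535 * 65535 overflows a signed int. The helper name mul_u16 is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// With 16-bit operands and a 32-bit int, `a * b` is a signed int
// multiplication after integer promotion, and large inputs overflow it.
// Converting one operand to an unsigned 32-bit type first makes the
// multiplication unsigned, and the product of two 16-bit values always fits.
inline std::uint32_t mul_u16(std::uint16_t a, std::uint16_t b) {
    return static_cast<std::uint32_t>(a) * b;
}

// Usage: 65535 * 65535 = 4294836225, which does not fit in a 32-bit int.
inline void mul_u16_example() {
    assert(mul_u16(65535, 65535) == 4294836225u);
}
```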

Strict aliasing in particular is very dependent on coding style; I too just turn it off with essentially no harm, but certain styles of code that don't involve carefully manually CSEing everything might benefit a good bit (though with it being purely type-based imo it's not that useful).

Outside of freestanding/kernel scenarios, null treatment shouldn't affect anything, and again can similarly benefit probably a decent amount of code by removing what is unquestionably just dead code. There's the question of pointer addition resulting in/having an argument of null being UB, which is rather bad for offsetof shenanigans or misusing pointers as arbitrary data, but that's a question of provenance, not anything about null.

Global integer overflow UB is probably the most questionable thing in that list, and I'd quite prefer having separate wrap-on-overflow and UB-on-overflow types (allowing having unsigned UB-on-overflow too). That said, even having had multiple cases of code that could hit integer overflow and thus UB, I don't recall any of them actually resulting in the compiler breaking things (granted, I know not to write a+1<a & similar for signed ints), whereas I have had a case where the compiler "fixed" the code, turning a `a-b < 0` into the more correct (at least in the scenario in question) `a<b`!
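
As a small illustration of the two patterns named above, here is a sketch of overflow checks that do not rely on signed wraparound. The function names are hypothetical.

```cpp
#include <cassert>
#include <limits>

// Instead of `a + 1 < a`, which only "works" if signed overflow wraps:
inline bool increment_overflows(int a) {
    return a == std::numeric_limits<int>::max();
}

// Instead of `a - b < 0`, where the subtraction itself can overflow:
inline bool less_than(int a, int b) {
    return a < b;
}

// Usage: both checks stay well-defined at the extremes of int.
inline void overflow_check_example() {
    assert(increment_overflows(std::numeric_limits<int>::max()));
    assert(less_than(std::numeric_limits<int>::min(), 1));
}
```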

I do think that it would make sense to have an easy uniform way to choose some specific behavior for most kinds of UB.

There is no point in talking with the people that worship UB for the sake of UB.

They don't understand that one of the biggest barriers to developers writing and adopting more C in their projects is the random jankiness that you get from the compilers. Instead they make C this elite thing for the few people who have read every single line of C code they themselves compiled and ran on their Gentoo installation. Stuff like having no bounds checks is almost entirely irrelevant outside of compute kernels. It doesn't get you much performance, because the branches are perfectly predictable. It merely reduces code size.

There is also the problem that the C development culture is utterly backwards compared even to the semiconductor industry. If you want to have these ultra optimized release builds, then your development builds must scream when they encounter UB and it also means that no C program or library without an extensive test suite should be allowed onto any Linux distribution's package repository. Suddenly the cost of C programming appears to be unaffordably high.

> Stuff like having no bounds checks is almost entirely irrelevant outside of compute kernels. It doesn't get you much performance, because the branches are perfectly predictable. It merely reduces code size.

This is just not true, or languages with bounds checks wouldn't invest so much in bounds checks elimination (both automatically, in the compiler, and by programmers, leaving hints to the compiler so it can remove bounds checks).

We used memcpy everywhere in our runtime and after the 10th or so time doing it, it becomes less awkward.
And it’s always reliably optimized out in release builds, I assume?
On platforms that require aligned loads and stores (not x86 nor ARM), a direct pointer cast sometimes uses an aligned load/store where a memcpy uses multiple byte loads/stores, even on a good compiler, since memcpy() doesn't require that the pointers are aligned. This can be mitigated by going through a local variable, but it gets pretty verbose.
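A sketch of the memcpy-into-a-local pattern, with a hypothetical function name. It is correct whether or not the source pointer is aligned; whether it compiles down to a single load depends on the compiler and target.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Load a 32-bit value from an address that may not be 4-byte aligned.
inline std::uint32_t load_u32_unaligned(const unsigned char* p) {
    assert(p != nullptr);
    std::uint32_t v;               // the local itself is properly aligned
    std::memcpy(&v, p, sizeof v);  // memcpy makes no alignment assumption about p
    return v;
}
```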
Some ARM CPUs do require aligned loads and stores, such as the Cortex-M0+ in a Raspberry Pi Pico.
Sounds like a good place for a macro?
We have memcpy behind a C++ template function that mimics the interface of std::bit_cast.
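Such a wrapper could look roughly like the sketch below. This is an assumption about its general shape, not the commenter's actual code, and the name bit_cast_compat is invented. Unlike std::bit_cast it is not constexpr, and it additionally requires the destination type to be trivially default-constructible.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// memcpy-based stand-in with the same call shape as std::bit_cast<To>(from).
template <typename To, typename From>
To bit_cast_compat(const From& from) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_default_constructible<To>::value,
                  "To must be trivially default-constructible");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Usage: assuming IEEE 754 binary32, 1.0f is 0x3F800000.
inline void bit_cast_compat_example() {
    assert(bit_cast_compat<std::uint32_t>(1.0f) == 0x3F800000u);
}
```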
For MSVC you have to add "/Oi", otherwise it is always a function call at lower optimisation levels. Clang and GCC treat it as an intrinsic always, even in debug builds.
I haven't manually checked every case but it's normally folded into the load or shift or whatever and completely erased
A few years ago with GCC 10, I had some strange UD2 instructions (I think) inserted around some inline assembly; that was fixed by replacing my `reinterpret_cast`s with `bit_cast`.
The proper solution is to use std::bit_cast in modern C++ or otherwise use memcpy, and of course know what you're doing.

Some things that could mess with you:

* Floating-point endianness is not the same as integer endianness.

* Floating-point alignment requirements differ from integer alignment requirements.

* The compiler is configured to use something other than 32-bit binary32 IEEE 754 for the type "float".

* The computer does not use two's complement arithmetic for integers.

In practice, these are not real problems.
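
If you would rather have the build fail than trust that, the assumptions in the list above can be checked. This is a minimal sketch: the first four checks run at compile time, and the byte-order check runs at run time through a helper whose name is made up here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// "float" is IEEE 754 (IEC 559) and 32 bits wide.
static_assert(std::numeric_limits<float>::is_iec559, "float is not IEEE 754");
static_assert(sizeof(float) == sizeof(std::uint32_t), "float is not 32 bits");

// Float and 32-bit integer alignment requirements agree.
static_assert(alignof(float) == alignof(std::uint32_t), "alignment differs");

// Integers use two's complement (-1 has all low bits set).
static_assert((-1 & 3) == 3, "integers are not two's complement");

// Float and integer byte order agree if 1.0f reads back as 0x3F800000.
inline bool float_and_int_byte_order_match() {
    const float one = 1.0f;
    std::uint32_t bits;
    std::memcpy(&bits, &one, sizeof bits);
    return bits == 0x3F800000u;
}

// Usage: call once at startup or from a unit test.
inline void check_float_assumptions() {
    assert(float_and_int_byte_order_match());
}
```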

Yes, if implemented as shown.

You could use intrinsics for the bit casting & so on and it would be well-defined to the extent that those intrinsics are.

(I understand some people consider SIMD intrinsics in general to be UB at language level, in part because they let you switch between floats & ints in a hardware-specific way.)