by burfog 2804 days ago
No, (void*) is not special in this way. It just makes things look nicer.

The code also isn't undefined behavior... but you are really asking to hit compiler bugs! This is an easy way to confuse gcc into wrongly determining that the code has undefined behavior, and if gcc gets confused then it may determine that a code path can't be taken. Code paths that can't be taken may be deleted.

The main rule here is that memory has a type, determined by what was last written into it, and you may only read or examine the memory using that type. (For the type, we ignore attributes like the distinction between signed and unsigned.) There is a minor exception that is just enough to implement something like memcpy: you can use a (char*) to read and then write as a char. You still aren't supposed to look at that char. These rules apply to memory accessed via pointers, no matter how you cast them, and to memory accessed via union members.
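To make the rule concrete, here is a rough sketch of the usual way to reinterpret bytes without reading memory through the wrong type: copy them with memcpy. The helper name float_bits is made up for illustration, and the code assumes float is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of a float into a 32-bit integer.
 * Assumes sizeof(float) == sizeof(uint32_t); the assert checks it. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    /* The rule-breaking version would be: bits = *(uint32_t *)&f;
     * memcpy moves the bytes without reading f through the wrong type. */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers typically turn a fixed-size memcpy like this into a plain register move, so the safe form usually costs nothing.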

Real compilers differ from that:

Every compiler I'm aware of will not enforce the rules for unions. The gcc compiler promises not to enforce the rules in this case.
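For reference, the union pattern in question looks roughly like this. The names float_or_bits and union_bits are invented for the example, and it assumes float and uint32_t are the same size. gcc documents this kind of access through the union type itself as supported.

```c
#include <stdint.h>

/* Hypothetical union for illustration; assumes float is 32 bits wide. */
union float_or_bits {
    float f;
    uint32_t bits;
};

/* Write one member, then read the other through the union itself. */
static uint32_t union_bits(float f)
{
    union float_or_bits u;
    u.f = f;
    return u.bits;
}
```

Note that the access goes through the union object. Taking the address of a member and dereferencing that pointer elsewhere is a different situation.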

Every compiler I'm aware of will let you look at any data that has been read as a char, so the memcpy trick works and you can do things like determine endianness at runtime.
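The runtime endianness check mentioned above is usually written something like this. It is a minimal sketch; is_little_endian is a name picked for the example.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the least significant byte of a
 * multi-byte integer is stored at the lowest address, 0 otherwise. */
static int is_little_endian(void)
{
    uint32_t probe = 1;
    /* Look at the first byte of the object through a character pointer. */
    const unsigned char *bytes = (const unsigned char *)&probe;
    return bytes[0] == 1;
}
```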

It is legit to initialize a type X variable, take the address of it, cast it from (X*) to (Y*), pass it through arbitrary data structures and functions to hide the origin from the compiler, cast the (Y*) back to (X*), and then access the type X variable. If you do this, gcc may generate bad code.
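A stripped-down sketch of that round trip, with X as int and Y as unsigned char (chosen here so the intermediate pointer has no alignment concerns). The function names are made up, and real code would have far more layers between the two casts.

```c
/* Hypothetical helpers showing the X* -> Y* -> X* round trip. */

/* Hand out the address under a different pointer type. */
static unsigned char *hide(int *p)
{
    return (unsigned char *)p;
}

/* Cast back to the original type before touching the object. */
static int read_back(unsigned char *hidden)
{
    int *p = (int *)hidden;
    return *p;
}

static int round_trip(int value)
{
    int x = value;              /* the object really is an int */
    unsigned char *h = hide(&x);
    return read_back(h);        /* accessed as int again */
}
```

The object is only ever accessed as an int, so the intermediate pointer type never matters for the access itself.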


Is there a way then to write a compliant/non-UB/non-buggy memory allocator/GC in C/C++?
The moment you call sbrk or mmap, you're outside of standard C, so no. Treating a pointer as an integer in order to mess with the bits is also a violation.

Aside from that, the style used here is probably OK. It is hard to say what exactly would trigger the gcc bugs, but I'm pretty sure that a recent gcc would be OK for this code.

> Treating a pointer as an integer in order to mess with the bits is also a violation.

Violation of what exactly? Converting a pointer to an integer, and vice versa, is implementation defined. As long as you're not trying to write implementation-independent code, it's perfectly fine.

Sure, you can convert, but the whole point of converting is to do things that violate the C standard. In theory, a standard-conforming C implementation could have the bits of a pointer be encrypted by the CPU. There is nothing meaningful that you could do to those bits. Rounding pointers up or down for alignment is impossible in standards-conforming C code.
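For context, this is the sort of rounding code being argued about. It is a sketch that leans on implementation-defined behavior: it assumes uintptr_t exists, that the integer value of a pointer is a flat address, and that alignment is a power of two. The name align_up is invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round p up to the next multiple of alignment.
 * Assumes a flat address space where the integer value of a pointer
 * is its address, and that alignment is a power of two. Neither is
 * promised by the C standard; mainstream implementations provide both. */
static void *align_up(void *p, size_t alignment)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)alignment - 1;
    return (void *)((addr + mask) & ~mask);
}
```

The standard only promises that a valid pointer to void converted to uintptr_t and back compares equal to the original. What the bits mean in between is left to the implementation.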