Hacker News
by quietbritishjim 1702 days ago
Aliasing with unsigned char* is also allowed. (Oddly, in C++ signed char is not, even if char is signed; C permits all three character types.)

uint8_t is not guaranteed to be unsigned char, but in practice it almost always is. GCC did originally have separate 8-bit types when stdint.h was introduced, but quickly changed to a typedef for char-based types precisely to allow using it for aliasing.

Yes, technically char may not be 8 bits but in practice that is very rare (and you can statically assert it).
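The static assertion mentioned above might look something like the following. This is a minimal sketch assuming a C11 compiler; `bits_in` is a hypothetical helper added only to show `CHAR_BIT` in use.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

/* Compilation stops here on any platform where char is not 8 bits wide. */
static_assert(CHAR_BIT == 8, "this code assumes 8-bit chars");

/* Hypothetical helper: number of bits occupied by an object of nbytes bytes. */
static size_t bits_in(size_t nbytes)
{
    return nbytes * CHAR_BIT;
}
```

With the assertion in place, code further down can treat "byte" and "octet" as the same thing without checking again.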

Overall, IMO the best solution is to always use uint8_t and turn off optimisations on those rare weird platforms where it's not an alias for unsigned char for whatever reason.
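One way to detect those "rare weird platforms" at compile time is sketched below, assuming a C11 compiler. The macro `UINT8_IS_UCHAR` and the function `byte_sum` are hypothetical names for illustration; here the build simply fails rather than switching optimisations off.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint8_t */

/* Evaluates to 1 when uint8_t is the same type as unsigned char, else 0. */
#define UINT8_IS_UCHAR _Generic((uint8_t)0, unsigned char: 1, default: 0)

static_assert(UINT8_IS_UCHAR,
              "uint8_t is not unsigned char; aliasing through uint8_t* is not safe");

/* Hypothetical helper: sums the bytes of any object by reading it through
   uint8_t*, which is valid here because uint8_t is a character type. */
static unsigned byte_sum(const void *obj, size_t size)
{
    const uint8_t *p = obj;
    unsigned sum = 0;
    for (size_t i = 0; i < size; i++) {
        sum += p[i];
    }
    return sum;
}
```

Summing the bytes keeps the result independent of endianness, so the same call gives the same answer on any platform that passes the assertion.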


I don't understand, if you're trying to work around aliasing restrictions why would you use `uint8_t*` in the first place?

By definition `sizeof(char) == 1`, so that's almost always what you want when messing with types in C anyway. What you want is bytes, not octets.

chars can be two octets on some DSP platforms that lack byte addressability.
sizeof(char) must always be 1, regardless of how many bits or octets that represents. On such a platform, uint8_t does not exist.
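A rough sketch of how code could cope with that: `<stdint.h>` defines `UINT8_MAX` only when `uint8_t` exists, so its presence can be tested in the preprocessor. The names `byte_t` and `HAVE_EXACT_OCTET` are assumptions made up for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* uint8_t and UINT8_MAX, when available */

/* True on every conforming implementation, whatever the width of char. */
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

#ifdef UINT8_MAX
/* An exact 8-bit type exists, so a byte is an octet on this platform. */
typedef uint8_t byte_t;
#define HAVE_EXACT_OCTET 1
#else
/* No 8-bit type (e.g. CHAR_BIT == 16): fall back to the smallest unit. */
typedef unsigned char byte_t;
#define HAVE_EXACT_OCTET 0
#endif
```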
> uint8_t is not guaranteed to be unsigned char but in practice almost always is.

Does this imply any unsigned char typedef is able to alias anything? Or is uint8_t a compiler special case?

`char` is not guaranteed to be 8 bits and in some more exotic environments (DSPs for instance) may not be.

IIRC POSIX guarantees that char is 8 bits though (but I still think that the sign is implementation-dependent).

But as I said in a parent comment, I don't understand why it's even relevant. If you want to alias any type then use `char *` and not anything else. I don't understand why one would prefer using stdint for that.

Because sometimes people need to reinterpret data as an array of 8/16/32/64 bit elements. Sometimes people also need to reinterpret things as a structure.

This is independent of how many bytes the underlying platform can address. If we have 8 bit processing code but the platform can only address 16 bits at a time, it should be up to the compiler to generate code that works. Compilers already do stuff like that in other circumstances.

Those people need to copy the data; otherwise the reinterpretation might not work. The char data might not be correctly aligned, for example. I recently went through a big nightmare where a C++ codebase that had accreted on x86 was to be ported to another platform where alignment actually matters, and there were all manner of rare low-level malfunctions stemming from the idea that you can just wantonly cast a char* to structured data.
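The copying approach can be sketched with `memcpy`, which has no alignment requirement on its source. `load_u32` is a hypothetical helper name; it reads the value in the host's byte order and does no endianness conversion.

```c
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* Hypothetical helper: reads a uint32_t from any position in a byte buffer.
   Copying into a properly aligned local avoids both the misaligned access
   and the aliasing violation that *(const uint32_t *)buf would risk. */
static uint32_t load_u32(const unsigned char *buf)
{
    uint32_t value;
    memcpy(&value, buf, sizeof value);
    return value;
}
```

Compilers commonly turn a fixed-size `memcpy` like this into a single load on platforms that allow unaligned access, so the copy is usually not a performance cost.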
The strict aliasing rules look through typedefs (and const/volatile-qualifiers). It's possible to use char, signed char, and unsigned char, or any typedef thereof, or any typedef of any typedef, etc., to access any memory whatsoever.
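As a small illustration of that point, the sketch below writes to an arbitrary object through a typedef of a typedef of unsigned char. The names `byte`, `raw_byte`, and `zero_fill` are made up for the example.

```c
#include <stddef.h>   /* size_t */

typedef unsigned char byte;   /* hypothetical typedef of a character type */
typedef byte raw_byte;        /* typedef of a typedef: still unsigned char */

/* Hypothetical helper: clears any object byte by byte. The access is valid
   under strict aliasing because raw_byte is unsigned char underneath. */
static void zero_fill(void *obj, size_t size)
{
    raw_byte *p = obj;
    for (size_t i = 0; i < size; i++) {
        p[i] = 0;
    }
}
```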