by BeeOnRope 1567 days ago
Yes, this will be 16 bytes on Arm, and everywhere else pretty much. That doesn't mean the whole outer struct can't also be 16 bytes (this struct is one member of a union).

Yea but here is my problem.

    uint8_t  type : 2;  // tag = {SSO|OWN|REF|VUE}
I had thought the tag was part of the SSO, but it's not. Well, it's not accessible, but I think that's what some of the unused bits end up being?
I also find that questionable. The compiler is allowed to pull a bit field from low or high bits (i.e. not specified in the standard) and it might be further affected by endianness.

This code is assuming that on little-endian, gcc pulls 62 bits from the low bits of the word, and on big-endian that gcc pulls from the high 62 bits of the word. It might actually be what gcc does, but I don't think any of that is guaranteed.
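One way to check what a given compiler actually does is a small probe, sketched below. The struct is a hypothetical stand-in for the real one (only the 62/2 split comes from the thread), and a `uint64_t` bit field is itself implementation-defined, although gcc, clang, and MSVC all accept it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the struct under discussion: 62 data bits + 2 tag bits. */
struct probe {
    uint64_t data : 62;
    uint64_t type : 2;
};

/* Returns the bit index (0..63) where the compiler placed the lowest bit of
 * `type`, as seen when the storage is read back as one uint64_t.
 * Returns -1 if the struct is not exactly 8 bytes or the bit cannot be found. */
static int type_field_shift(void)
{
    struct probe p;
    uint64_t w;

    if (sizeof p != sizeof w)
        return -1;

    memset(&p, 0, sizeof p);   /* clear everything, including any padding */
    p.type = 1;                /* set only the lowest bit of the tag */
    memcpy(&w, &p, sizeof w);  /* read the raw storage without aliasing problems */

    for (int i = 0; i < 64; i++)
        if (w == (UINT64_C(1) << i))
            return i;
    return -1;
}
```

If the description above is right, gcc should report 62 on little-endian targets and 0 on big-endian ones. The probe only tells you about the compiler and target you ran it on, which is the whole problem.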

(author here)

That's concerning. Should I use a full 64-bit data field plus masking instead? I'm a bit naive about portability.
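A minimal sketch of the "full 64-bit field + masking" idea, assuming the tag goes in the low 2 bits. The names (`word_pack`, `word_tag`, `word_payload`, the `TAG_*` constants) are made up for illustration; only the four tag values come from the snippet above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the thread only gives the tag set {SSO|OWN|REF|VUE}. */
enum tag { TAG_SSO = 0, TAG_OWN = 1, TAG_REF = 2, TAG_VUE = 3 };

#define TAG_BITS 2u
#define TAG_MASK ((UINT64_C(1) << TAG_BITS) - 1)

/* Pack a 62-bit payload and a 2-bit tag into one word, tag in the low bits. */
static inline uint64_t word_pack(uint64_t payload, enum tag t)
{
    assert(payload <= (UINT64_MAX >> TAG_BITS)); /* payload must fit in 62 bits */
    return (payload << TAG_BITS) | ((uint64_t)t & TAG_MASK);
}

/* Extract the tag from the low bits. */
static inline enum tag word_tag(uint64_t w)
{
    return (enum tag)(w & TAG_MASK);
}

/* Extract the payload from the upper 62 bits. */
static inline uint64_t word_payload(uint64_t w)
{
    return w >> TAG_BITS;
}
```

Shifts and masks operate on values, so which bits hold the tag is fixed by the code rather than by the compiler. Which byte in memory holds the tag still depends on byte order, and that matters if the word is overlaid with a character buffer in a union.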

So, a disclaimer: I have never read the C standard in full and I don't consider myself an expert, but this one about bit fields is one of many anecdotes I've accumulated over the years about portability. The rule is to never use a bit field in a serialized file format, because different compilers could lay it out differently. As long as a bit field never leaves the memory of the host that wrote it, it's fine. But in this case, you are expecting the bit field to overlay in a specific manner with a union, so I think you run into the same problem.

Making a library more portable in C usually means making the code an ugly mess of macros. So, first decide how much time and code clarity you want to sacrifice vs. the systems you want to be able to run it on. I have a large personal project, a data encoding library, that I never finished in part because I tangled myself up in trying to make it 100% defined-behavior C code. The world only has 3 major compilers: gcc, clang, and MSVC (Visual Studio). If your code works on all 3, and it's a hobby project, that's probably good enough. There are also vanishingly few big-endian systems left, aside from some embedded ARM processors. Also, declaring :62 will be a compile error on a 32-bit system if the field's declared type is only 32 bits wide there (a bit field can't be wider than its type), so that's maybe your biggest portability concern? But again, 32-bit is either old or embedded, so maybe not worth worrying about.

I believe that to make this portable to big-endian or 32 bit, you would need to make a macro to access the pointer and flags. Then you need to detect the endianness in preprocessor directives and set the macro to either a shift or a mask.
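A rough sketch of what that could look like, assuming the goal is to keep the tag in the last byte of the word as it sits in memory. The macro names are invented, and `__BYTE_ORDER__` is a gcc/clang predefined macro; when it is missing (e.g. MSVC) the sketch simply assumes little-endian.

```c
#include <stdint.h>
#include <string.h>

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
/* Big-endian: the last byte in memory is the low-order byte of the word,
 * so the tag is a mask and the data is a shift. */
#  define WORD_TAG(w)     ((unsigned)((w) & 3u))
#  define WORD_DATA(w)    ((w) >> 2)
#  define WORD_MAKE(d, t) (((uint64_t)(d) << 2) | ((uint64_t)(t) & 3u))
#else
/* Little-endian (also assumed when the macro is absent): the last byte in
 * memory is the high-order byte, so the tag is a shift and the data is a mask. */
#  define WORD_TAG(w)     ((unsigned)((w) >> 62))
#  define WORD_DATA(w)    ((w) & (UINT64_MAX >> 2))
#  define WORD_MAKE(d, t) (((uint64_t)(d) & (UINT64_MAX >> 2)) | ((uint64_t)(t) << 62))
#endif

/* Returns the last byte of the word as stored in memory, to check where the tag lands. */
static unsigned char word_last_byte(uint64_t w)
{
    unsigned char bytes[sizeof w];
    memcpy(bytes, &w, sizeof w);
    return bytes[sizeof w - 1];
}
```

Either branch puts the tag bits in the final byte of the 8-byte word, in its top two bits on little-endian and its bottom two bits on big-endian. Callers only ever touch `WORD_TAG`, `WORD_DATA`, and `WORD_MAKE`, so the preprocessor mess stays in one place.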

That was very helpful. I'll skip the endianness and try to resolve the hardcoded :62 part.
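For the hardcoded :62, one possible approach is to derive the width from the pointer size, sketched below. The struct and helper names are hypothetical, and using `uintptr_t` as a bit-field type is implementation-defined (gcc, clang, and MSVC accept it).

```c
#include <limits.h>
#include <stdint.h>

#define TAG_BITS  2
/* 62 on a 64-bit target, 30 on a 32-bit one. */
#define DATA_BITS (sizeof(uintptr_t) * CHAR_BIT - TAG_BITS)

/* Hypothetical stand-in for the real struct. */
struct tagged {
    uintptr_t data : DATA_BITS;
    uintptr_t type : TAG_BITS;
};

/* Fail the build if the compiler did not pack both fields into one word. */
_Static_assert(sizeof(struct tagged) == sizeof(uintptr_t),
               "tagged word must be exactly one pointer wide");

/* Build a tagged word; bits of `data` above DATA_BITS are dropped by the bit field. */
static inline struct tagged tagged_make(uintptr_t data, unsigned type)
{
    struct tagged t;
    t.data = data;
    t.type = type & 3u;
    return t;
}
```

This removes the compile error on 32-bit targets but does nothing about where the bits end up, so the layout question above still applies.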
(author)

Should I add __attribute__((packed)) to be on the safe side?

I probably would. Or at least a comment about packed. I can’t remember when __attribute__((packed)) works and when it doesn’t.

You probably don’t want preprocessor hell (see xxHash if you don’t know what I mean), so maybe just pick one way or the other and leave a comment somewhere.