Hacker News new | ask | show | jobs
by jstimpfle 10 days ago
struct layout is well specified, it should be possible to avoid any padding issues by just aligning and by padding (with dummy members) correctly. The problem in practice is mostly integer representation (big-endian vs little-endian).
3 comments

Specified by whom? Not the C standard for sure. It is indeed soecified by individual ABIs, and ABIs don't tend to do anything too weird, but that's another question.
looks like I was wrong, but here is the de-facto standard I was relying on over the years ;-). Not that I've memcpied many structs to file directly btw. http://www.catb.org/esr/structure-packing/
The general struct layout algorithm is that you lay out the first member at the address of the struct (this is guaranteed by C), and then subsequent fields in order (also guaranteed by C). What isn't guaranteed is how fields get their alignment, in particular shenanigans you can do with allocating fields in the padding of their prior field, and bitfields in general are horribly underspecified.

In practice, C doesn't do any padding shenanigans, but C++ does (but only for non-POD structs, and then you discover there's several slightly different definitions that mean basically "POD", so have fun predicting which one is the one that actually matters for your use case).

If you sort your fields by size or manually pad them with natural alignment, and use #pragma pack or equivalent non-standard directives that gets you most of the way there. But yes, avoid bitfields.

C++ "standard layout type" is the modern equivalent of "POD" I think.

GP probably meant "POD for the purpose of layout" in the Itanium ABI. It's not the same as standard layout, and POD is not a term in recent C++ standards.
> struct layout is well specified

Technically that's not true at least for booleans and enums, the C standard doesn't define specific sizes for those (bools are commonly 1 byte though, but for enums at least MSVC likes to disagree with Clang and GCC).

Using a direct struct memory layout for persistency and then expecting it to work across compilers, CPUs and ABIs is almost guaranteed to cause problems.

If you modify or even just move fields around the struct that also changes the way they are serialized...

You really need a serializer for this sort of thing because it can also include forwards compatibility of your data structures.

sure, if you change the struct, it will now be different.
It's typical to only append fields when you do this.