by jchw 2723 days ago
As far as I understand, it's also one of the reasons modules are a thing... or at least people want them to be.

Precompiled headers are a pretty ugly solution and the way they've been implemented in the past could be really nasty. (IIRC in old GCC versions it would copy some internal state to disk, then later load it from disk and manually adjust pointers!)


There must still be some dark pointer magic going on, because I noticed that unless I disabled ASLR on Debian Stretch, each build of a precompiled header came out different, screwing up ccache. I can only conclude that the memory layout of an individual run influences the precompiled header (".gch") output. We now run our build process under `setarch x86_64 --addr-no-randomize`.

    $ for i in `seq 3`; do gcc-6 -x c-header /dev/null -o x.h.gch; sha256sum x.h.gch; done
    98d8093503565836ba6f35b7adf90330d63d9d1c76dfb8e3ad1aeb2d933d1a45  x.h.gch
    17e5de099860d94aaa468c5ad103b3f0dd5e663f6cdbd01b4f12cf210023e71c  x.h.gch
    3cc2f1c0a517b5fedbbd49bb3a34084d9aa1428f33f3c30278a8c61f9ed9ba88  x.h.gch
This isn't uncommon, especially for file formats which are meant for internal consumption. Of course, they end up being huge cans of worms in terms of security, stability and maintainability going forward.

Basically, instead of defining a real serialization format (and thus having to write serializer/deserializer code), it's way easier to just `fwrite` out your internal structs to disk, one after another, and write some much simpler walker code to walk through any pointed fields appropriately. At some point though this becomes technical debt which needs to be repaid in the form of a total serialization rewrite.
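To make the "dump the structs, then walk the pointers" idea concrete, here is a minimal sketch. It is not GCC's or Blender's actual format; `Node`, `Header`, `dump` and `load` are hypothetical names made up for illustration, and it assumes every node lives in one contiguous arena. Note that the raw pointer values and the base address land in the file verbatim, which is exactly the kind of thing that makes the output depend on where the process happened to be mapped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical in-memory structure: a linked list whose nodes all live in
// one contiguous arena (a std::vector<Node>).
struct Node {
    int value;
    Node* next;  // raw pointer into the same arena, or nullptr
};

// Hypothetical file header: just enough to relocate pointers on load.
struct Header {
    std::uintptr_t base;  // address of the arena at dump time
    std::size_t count;    // number of nodes that follow
};

// Dump the arena verbatim, stale pointer values and padding included.
bool dump(const std::vector<Node>& arena, const char* path) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    Header h{reinterpret_cast<std::uintptr_t>(arena.data()), arena.size()};
    bool ok = std::fwrite(&h, sizeof h, 1, f) == 1 &&
              std::fwrite(arena.data(), sizeof(Node), arena.size(), f) == arena.size();
    return std::fclose(f) == 0 && ok;
}

// Load the bytes back, then walk every pointer field and rebase it.
// No validation at all: a corrupt or hostile file can do anything here.
bool load(const char* path, std::vector<Node>& arena) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    Header h{};
    bool ok = std::fread(&h, sizeof h, 1, f) == 1;
    if (ok) {
        arena.resize(h.count);
        ok = std::fread(arena.data(), sizeof(Node), h.count, f) == h.count;
    }
    std::fclose(f);
    if (!ok) return false;

    // The "walker": the stored pointers refer to the old arena, so shift
    // each one by the distance between the old and new base addresses.
    const std::uintptr_t new_base = reinterpret_cast<std::uintptr_t>(arena.data());
    for (Node& n : arena) {
        if (n.next) {
            const std::uintptr_t old = reinterpret_cast<std::uintptr_t>(n.next);
            n.next = reinterpret_cast<Node*>(old - h.base + new_base);
        }
    }
    return true;
}
```

Build a small list in a `std::vector<Node>`, call `dump(list, "nodes.bin")`, then `load("nodes.bin", copy)`, and the `next` pointers in `copy` point into `copy` again. The file is only readable by a build with the same struct layout, pointer size and endianness, which is where the technical debt comes from.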

Blender, the popular open-source 3D modelling tool, uses a format like this for their .blend files, and it is really gross. IIRC a few releases back they started working to improve the format to be a little less dependent on low-level internal details, but now they have the nightmare of backwards compatibility to deal with.

The basic problem is that C/C++ have no mechanism for native serialization, unlike e.g. Java, Python, or any number of other languages, so you're either stuck `fwrite`ing structs or reinventing the wheel.

Which is mostly due to the lack of run-time reflection. OTOH, with a little creativity it's possible to write code generators that attach a commonly named method (say, `.serialize`) to classes to dump their POD fields and call `serialize` on directly encapsulated classes.
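For illustration, this is roughly what the output of such a generator might look like. It is only a sketch: `Vec2`, `Sprite` and the `put_pod` helper are hypothetical, and the `serialize` bodies are hand-written here to stand in for generated code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helper: append the raw bytes of one trivially copyable field.
template <typename T>
void put_pod(std::string& out, const T& field) {
    static_assert(std::is_trivially_copyable<T>::value, "POD-like fields only");
    out.append(reinterpret_cast<const char*>(&field), sizeof field);
}

struct Vec2 {
    float x, y;

    // --- would be emitted by the generator ---
    void serialize(std::string& out) const {
        put_pod(out, x);
        put_pod(out, y);
    }
};

struct Sprite {
    std::uint32_t id;
    Vec2 pos;            // directly encapsulated class
    std::uint8_t layer;

    // --- would be emitted by the generator ---
    void serialize(std::string& out) const {
        put_pod(out, id);     // POD field: dump its bytes
        pos.serialize(out);   // nested class: delegate to its serialize
        put_pod(out, layer);
    }
};
```

Calling `sprite.serialize(buffer)` on a `std::string buffer` appends the fields one by one, so struct padding never reaches the output, unlike an `fwrite` of the whole struct. The bytes are still in host byte order, so this alone does not make the format portable.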

But you're basically right, everyone ends up doing it their own way, which just ends up being a PITA.

What do you mean by "manually adjust pointers"?
Relocate them.
Why couldn't that be deferred to the linker?
I believe they're talking about gcc's internal state, not the code getting compiled.