Hacker News
by trumped 2669 days ago
Might be good progress but it still sounds very low to me as I didn't know anything below 100% was possible... it sounds crazy to me (almost like something that was introduced to be able to inject backdoors undetected).
4 comments

Lots of problems come from things like timestamps, or race conditions in concurrent build systems giving slightly different bytes on disk. These generally aren't "trusting trust" level problems, since they do not and cannot affect program behaviour; but they do screw up things like digital signing, cryptographic hashes, etc. which are useful for automatically verifying that self-built artefacts are the same as distro-provided ones.
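To make the timestamp point concrete, here is a minimal sketch. The `build` function is a made-up stand-in for a real compiler, and the `built-at:` header is an invented format; the only claim is that embedding the build time changes the hash even though the payload is byte-for-byte identical.

```python
import hashlib


def build(source: bytes, build_time: int) -> bytes:
    """Toy 'compiler' (hypothetical): prepend a header recording the build time."""
    header = f"built-at:{build_time}\n".encode()
    return header + source


source = b"print('hello')\n"
first = build(source, build_time=1_500_000_000)
second = build(source, build_time=1_500_000_060)  # same source, one minute later

# The payload (the part that determines behaviour) is identical...
same_payload = first.split(b"\n", 1)[1] == second.split(b"\n", 1)[1]
# ...but the hashes of the two artefacts are not.
same_hash = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()

print(same_payload, same_hash)  # True False
```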

These problems can also cascade, if component A embeds the hash of another component B, e.g. to verify that it's been given a correct version. If that hash comes from an unreproducible upstream, and building it ourselves gives a different hash, then we'll need to alter component A to use that new hash. That, in turn, changes the hash of component A, which might be referenced in some other component C, and so on.
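The cascade can be sketched the same way. The component contents and the `build_chain` helper below are invented for illustration; real packages pin hashes in lock files or build recipes rather than in a byte string, but the propagation works the same.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def build_chain(b_bytes: bytes) -> dict[str, str]:
    """Build B, then A (which pins B's hash), then C (which pins A's hash)."""
    hash_b = sha256_hex(b_bytes)
    a_bytes = b"component A, expects B=" + hash_b.encode()
    hash_a = sha256_hex(a_bytes)
    c_bytes = b"component C, expects A=" + hash_a.encode()
    return {"B": hash_b, "A": hash_a, "C": sha256_hex(c_bytes)}


# Two builds of B that differ only in an embedded build time.
upstream = build_chain(b"B built upstream at 12:00")
local = build_chain(b"B built locally at 12:05")

changed = [name for name in ("B", "A", "C") if upstream[name] != local[name]]
print(changed)  # ['B', 'A', 'C']
```

One unreproducible byte in B is enough to change every hash downstream of it.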

Nope, loads of build tools were never built with reproducibility in mind.

Look at Windows. Even if you fix the compiler and linker, you still have non-reproducibility by design: the PE header contains a timestamp.

People also like to stick non-reproducible stuff into builds directly, like timestamps.

Compilers don't have any reason to lay down data in a specific order, so if they are threaded in the backend they just don't.

IDL tools might stick in the timestamp of when a file was generated, for convenience.

and on and on and on.
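For the PE case, here is a rough sketch of where that timestamp lives and how a comparison tool could normalise it. The offsets follow the PE/COFF layout (the 4-byte `e_lfanew` at 0x3C points to the `PE\0\0` signature, and `TimeDateStamp` sits 8 bytes after it). The helper names and the `fake_pe` stand-in are made up for the example, and a real image can carry other timestamps that this does not touch.

```python
import struct


def pe_timestamp_offset(image: bytes) -> int:
    """Return the file offset of the COFF header's TimeDateStamp field."""
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[:2] != b"MZ" or image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    # After the signature: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4).
    return pe_offset + 4 + 2 + 2


def read_timestamp(image: bytes) -> int:
    (stamp,) = struct.unpack_from("<I", image, pe_timestamp_offset(image))
    return stamp


def with_timestamp(image: bytes, stamp: int) -> bytes:
    """Return a copy of the image with TimeDateStamp overwritten."""
    patched = bytearray(image)
    struct.pack_into("<I", patched, pe_timestamp_offset(image), stamp)
    return bytes(patched)


def fake_pe(stamp: int) -> bytes:
    """Hypothetical minimal stand-in for a PE file; not a loadable executable."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature follows the DOS header
    coff = b"PE\0\0" + struct.pack("<HHI", 0x8664, 0, stamp)
    return bytes(dos) + coff


monday = fake_pe(1_500_000_000)
tuesday = fake_pe(1_500_086_400)
print(monday == tuesday)                                        # False
print(with_timestamp(monday, 0) == with_timestamp(tuesday, 0))  # True
```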

Every significant project I've worked on embedded the build host and build time in the resulting executable or firmware image. This was along with other static build information, like version number, compiler version and build flags.

Once you make the sensible choice to include build time in the result you've broken reproducibility. Fixing this means tracking down every package that does this and removing the timestamp.

Why is including build time sensible?

If one has reproducible builds, wouldn't a commit/tag from the version control system also do the job of traceability and reproducibility?

What I've moved to is splatting that info into the binaries during the release process. Far as I can tell there aren't standard tools to do that though. At least last time I looked.

Would be nice if there was. I think this is the root of issues such as firmware with the same password/crypto keys across a whole product family instead of unique ones.

That's what coreboot moved to (incl. the timestamp of the commit in its timestamp field) for reproducibility.

Thing is just that host + build time is what was traditionally used. There's no single commit you could use in CVS.
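A sketch of the "use the commit's timestamp instead of the wall clock" approach. It assumes the build is driven with the `SOURCE_DATE_EPOCH` environment variable from the reproducible-builds.org convention, which a release script could set from something like `git log -1 --format=%ct`. The function names are invented for the example, and this is not claimed to be how coreboot implements it.

```python
import os
import time


def build_timestamp(environ=None) -> int:
    """Return the pinned SOURCE_DATE_EPOCH if set, else the current time."""
    env = os.environ if environ is None else environ
    pinned = env.get("SOURCE_DATE_EPOCH")
    if pinned is not None:
        return int(pinned)       # reproducible: same commit, same stamp
    return int(time.time())      # traditional: differs on every build


def version_banner(version: str, environ=None) -> str:
    """Format the string that would be embedded in the binary."""
    stamp = build_timestamp(environ)
    built = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(stamp))
    return f"{version} (built {built} UTC)"


print(version_banner("1.0", {"SOURCE_DATE_EPOCH": "0"}))
# 1.0 (built 1970-01-01 00:00:00 UTC)
```

The banner still carries a human-readable date, but two builds of the same commit now embed the same bytes.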

A timestamp is sensible if reproducibility isn't your goal, and exact reproducibility of build artifacts was never a goal on any of my projects. It was simply never a priority.
My guess is that making code reproducible involves some kind of change that hasn’t been applied to all of the code or build files.