by fomojola 2190 days ago
I can answer this one. Sometimes you have lines like this:

    FROM ubuntu:focal
    RUN apt-get update && apt-get -y install libssl-dev
    <your app details>
Since libssl-dev gets periodically updated (security updates and whatnot), if you build this now and build it again in 1 year you're very probably not going to get the same OpenSSL version. So it MIGHT be reproducible, but it can easily give you different results depending on updates to the packages and the way your Dockerfile imports external dependencies. And that's before we even mention updates to the base container image.

Of course, you can refer to a specific container image id and pin all your packages, which would go a long way to improving reproducibility.
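
A minimal sketch of what that pinning could look like. `BASE_DIGEST` and `LIBSSL_VERSION` are hypothetical build arguments, not values from the example above; you would fill them in from an image and a package version you have actually checked.

```dockerfile
# Hypothetical build args: pass real values with --build-arg.
# BASE_DIGEST is the full "sha256:..." digest string, as listed by
# `docker images --digests`.
ARG BASE_DIGEST
FROM ubuntu@${BASE_DIGEST}

# An ARG declared before FROM is not visible after it, so this one is
# declared inside the build stage.
ARG LIBSSL_VERSION
RUN apt-get update && \
    apt-get -y install "libssl-dev=${LIBSSL_VERSION}"
```

Even with both pins, `apt-get update` still talks to a live mirror, and a distribution archive may stop serving an old package version, so a build like this can fail later rather than silently change.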


So it's wrong (or rather uninformed) usage of Docker that leads to this; the tech itself is sound and does guarantee reproducibility.
Yes, if you have constant inputs, it will produce mostly constant outputs (modulo timestamps, etc.).

But once time passes between builds and caching gets involved, you run into, if not bugs, then faults: your cached image came from a month ago, and its apt package index is now stale. Sure, I should've used a private dpkg repository, but I want the side-effects from all of those.

A Docker image I build today is very likely to be ostensibly the same as the one I build in five minutes, but that isn't guaranteed, because its cache is at the image level and the externalities (see: every possible step that isn't an ADD or COPY) vary.
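
To make the caching point concrete, here is a sketch of the usual failure, assuming the same `ubuntu:focal` base and `libssl-dev` package as the example at the top of the thread. The split into two RUN lines is deliberate, to show the problem; it is not a recommendation.

```dockerfile
FROM ubuntu:focal

# This layer is cached by its instruction text, so a later rebuild can
# reuse a package index that was downloaded weeks ago.
RUN apt-get update

# If this line changes (or its cache entry is dropped), it runs against
# that stale index and may ask the mirror for package versions that are
# no longer there.
RUN apt-get -y install libssl-dev
```

Putting both commands in a single `RUN apt-get update && apt-get -y install ...` keeps the index and the install in the same layer, and `docker build --no-cache` skips the cache entirely. Neither makes the result identical across time; they only stop the stale-index failure.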

It makes a very valiant attempt at consistency, to be clear, and I have my own gripes about it, but it's not like it creates a new planet on which technology can be built; it's more like creating a moon on which you can build a base, but you're still subject to the orbit of the earth.

> the tech itself is sound and does guarantee reproducibility

Does the tech guarantee reproducibility if you can use it to create un-reproducible artifacts? I don't think Docker claims anywhere to guarantee reproducible builds...

Docker guarantees run reproducibility. It does not guarantee build reproducibility although it does make it easier to achieve.
It does make an implicit guarantee though. You can match the image digests, and that verifies the image is exactly the one you intended it to be. Docker also has trust signing now to make additional guarantees. Are you trying to claim that it's possible to have the same image digests but different image contents?
You could say it guarantees initial reproducibility (the Docker image itself stays constant), not that it guarantees complete reproducibility. I would imagine Nix is the solution for that goal, though I don't know Nix well enough to be confident.

But an app built in a Docker container is not guaranteed to have reproducible builds, because Docker doesn't say anything about what happens beyond loading the initial image.

NixOS does the same thing. If you tell it to load python3 it will load whatever is latest (currently 3.7.7). You need to explicitly tell it which version of Python 3 you want, just like in a Dockerfile.
That's if you're installing from a channel, because channels get updated.

If you use a local clone/submodule of nixpkgs (the git repository with the definition of all nixos packages) at a specific commit, then you will always install the exact same software, because definitions in nixpkgs all specify the exact version and the hash of all the inputs.

The problem is the implicit guarantee:

- Step 3: pip update && install X

- Step 4: run step-tool

- Step 5: pip update && install Y <build failure>

To be clear, I'm sure all these things can be solved by a complex enough stream of shell commands, but I'm also forced to shove in updates at every step of my build, which is an artifact of the build-system

Docker doesn’t guarantee reproducibility any more than a tar file does in that context.