by catern 2073 days ago
Note that the source code is referenced by a hash, so it can't change without changing the package. Also, the source code of all packages built by Hydra is on cache.nixos.org alongside the resulting binaries.
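To illustrate why a pinned hash means the source can't silently change, here is a minimal sketch in Python. It is not how Nix implements its fetchers; `verify_source` and the sample bytes are hypothetical names made up for this example. The idea is only that whatever a mirror returns is rejected unless its hash matches the one recorded in the package.

```python
import hashlib


def verify_source(content: bytes, expected_sha256: str) -> bytes:
    """Return content only if its SHA-256 matches the pinned hash.

    Hypothetical helper for illustration; not part of Nix.
    """
    actual = hashlib.sha256(content).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"hash mismatch: expected {expected_sha256}, got {actual}")
    return content


# The hash below stands in for the one written into a package definition.
original = b"original tarball bytes"
pinned = hashlib.sha256(original).hexdigest()

# An unmodified download is accepted, no matter which host served it.
verify_source(original, pinned)

# A modified download is rejected.
try:
    verify_source(b"tampered tarball bytes", pinned)
    tampered_accepted = True
except ValueError:
    tampered_accepted = False
```

Because the check depends only on the bytes, the same source can come from cache.nixos.org, the upstream host, or your own mirror without weakening the guarantee.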

True, which is absolutely sufficient for most use cases.

I'm currently doing some work for ML and data science companies where full reproducibility and introspection are very much desired.

So you need to run your own source cache to provide that guarantee, because you can't count on cache.nixos.org still providing the source code for a package built 4 years ago.

But that's why I love the IPFS cache efforts. [1] Running your own node to pin all required sources should then be relatively easy.

[1] https://blog.ipfs.io/2020-09-08-nix-ipfs-milestone-1/

Software Heritage is also helpful here, and Guix is integrating with it - see e.g. https://guix.gnu.org/blog/2019/connecting-reproducible-deplo...
Nix is also integrating with Software Heritage:

https://www.tweag.io/blog/2020-06-18-software-heritage/

Debian's archive reaches back 15 years now: https://snapshot.debian.org/

It also contains source code.

Are those sources target independent or specific for each new build of the package? That is, is there a new source code package on hydra when one of the dependencies changes? Or does it only change if the package itself changes?

Also, they are only available when hydra builds the package anyways, right? So if some package is not built by hydra (like how it used to be for the texlive packages), it'll still download the sources from the various places they are hosted.

As for the hash, it's good that the source code is hashed, but my main concern was that it was downloading from external sources in the first place. This is bad for privacy, as those hosts know I'm downloading from them, as well as for reliability, because the hosts might not have as good uptime as a debian package mirror.

A sibling comment replied to your first paragraph, so just about the second two:

>Also, they are only available when hydra builds the package anyways, right? So if some package is not built by hydra (like how it used to be for the texlive packages), it'll still download the sources from the various places they are hosted.

Yes.

>As for the hash, it's good that the source code is hashed, but my main concern was that it was downloading from external sources in the first place. This is bad for privacy, as those hosts know I'm downloading from them, as well as for reliability, because the hosts might not have as good uptime as a debian package mirror.

That's a true and valid concern, but note that it's the same situation as with Debian: If the package is built upstream by Debian/the NixOS Hydra instance, then you have reliable, private access to its source code so you can rebuild it. If it's not built/packaged upstream, then you need to get the source from somewhere else.

The discrepancy is just that there are packages in Nixpkgs which are not built upstream, and which get built only locally on your machine or your own Hydra instance. There are not many of these, but yeah, it would be nice to fully get rid of them.

Or, an interesting option would be to build the source for more packages on Hydra, without actually building the binary for the package. That wouldn't be too hard, if someone adds an expression for doing it.

> That's a true and valid concern, but note that it's the same situation as with Debian

Good point!

> an interesting option would be to build the source for more packages on Hydra, without actually building the binary for the package. That wouldn't be too hard, if someone adds an expression for doing it.

Yes, that would be awesome!

> Are those sources target independent or specific for each new build of the package? That is, is there a new source code package on hydra when one of the dependencies changes? Or does it only change if the package itself changes?

Fixed-output derivations are used for sources (they are content-addressed in the store), so the latter.
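A rough sketch of the difference, in Python. This is not Nix's actual store path computation; `source_key` and `build_key` are hypothetical functions that only model the idea that a content-addressed source is keyed by its bytes alone, while a built package is keyed by all of its inputs.

```python
import hashlib


def source_key(name: str, content: bytes) -> str:
    """Hypothetical content-addressed key: depends only on the source bytes."""
    digest = hashlib.sha256(content).hexdigest()
    return f"{digest[:32]}-{name}"


def build_key(name: str, source: str, deps: list[str]) -> str:
    """Hypothetical input-addressed key: depends on the source and every dependency."""
    h = hashlib.sha256()
    h.update(source.encode())
    for dep in sorted(deps):
        h.update(dep.encode())
    return f"{h.hexdigest()[:32]}-{name}"


tarball = b"contents of hello-1.0.tar.gz"

# The source key is computed without looking at any dependency.
src_before = source_key("hello-1.0.tar.gz", tarball)
src_after = source_key("hello-1.0.tar.gz", tarball)

# Bumping a dependency changes the build key but leaves the source key alone.
build_before = build_key("hello-1.0", src_before, ["libfoo-1.0"])
build_after = build_key("hello-1.0", src_after, ["libfoo-1.1"])
```

Here `src_before == src_after` while `build_before != build_after`, which matches the answer above: the stored source only changes when the package's own source changes.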