by vishvananda 3034 days ago
I have been using nix for a while to build binary packages for crashcart[1] and I really love the premise of isolated source-based builds.

Unfortunately, over time I've become quite frustrated with the pull-everything-from-the-internet model. If you are building packages from scratch each time instead of pulling down cached versions using the nix command, the build breaks quite often.

Mostly it is silly stuff like the source package disappearing from the net. A particularly egregious offender is the named.root[2] file being updated, which will cause everything to fail to build until the sha is updated in the build def.
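To make the named.root failure mode concrete, here is a minimal Python sketch of a pinned-hash check. It is not Nix's implementation; `check_pinned`, `HashMismatch`, and the file contents are hypothetical names and data invented for illustration.

```python
import hashlib


class HashMismatch(Exception):
    """Raised when downloaded bytes do not match the pinned hash."""


def check_pinned(data: bytes, pinned_sha256: str) -> bytes:
    # Compare the hex SHA-256 of the downloaded bytes with the pinned value.
    actual = hashlib.sha256(data).hexdigest()
    if actual != pinned_sha256:
        raise HashMismatch(f"expected {pinned_sha256}, got {actual}")
    return data


# The hash is pinned at the time the build definition is written...
old_file = b"root hints, version 1\n"  # made-up content
pinned = hashlib.sha256(old_file).hexdigest()
check_pinned(old_file, pinned)  # passes

# ...and later the same URL serves updated content.
new_file = b"root hints, version 2\n"  # made-up content
try:
    check_pinned(new_file, pinned)
except HashMismatch as err:
    print("build fails:", err)
```

The check is doing its job here: the build stays broken until someone updates the pinned value, which is why a file that changes in place at a fixed URL is so disruptive.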

I don't know that there is a great solution for this problem. Maybe there needs to be a CI system that does a from-scratch build of all of the packages every day and automatically files bugs. Alternatively, a cache of sources and not just built packages could ease the pain. This issue probably affects very few nix users, but it has demoted my love of nix down to "still likes but is somewhat annoyed by".

[1]: https://github.com/oracle/crashcart
[2]: https://www.internic.net/domain/named.root

3 comments

Regarding disappearing sources: Nix offers a content-addressed mirror for sources downloaded by the Hydra CI system. As a random example, here is the latest Chromium source tarball:

http://tarballs.nixos.org/sha256/3dfa02e873ff51a11ee02b9ca39...

So disappearing sources are not a huge problem in my experience. Obviously, if you have package declarations outside of Nixpkgs proper, things are different.
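As a rough illustration of how a content-addressed mirror can stand in for a vanished upstream, here is a Python sketch. Only the `http://tarballs.nixos.org/sha256/` prefix comes from the URL above; the hex encoding of the hash, the upstream-first order, the `fetch_with_fallback` helper, and the example.org URL are assumptions, and the sketch does not describe what fetchurl actually does.

```python
import hashlib
from typing import Callable, Optional

MIRROR = "http://tarballs.nixos.org/sha256/"  # prefix taken from the URL above


def fetch_with_fallback(
    upstream_url: str,
    sha256_hex: str,
    fetch: Callable[[str], Optional[bytes]],
) -> bytes:
    # Try upstream first, then the mirror, which is keyed by the hash alone.
    for url in (upstream_url, MIRROR + sha256_hex):
        data = fetch(url)
        # Accept bytes only if they match the expected hash.
        if data is not None and hashlib.sha256(data).hexdigest() == sha256_hex:
            return data
    raise LookupError(f"no source found for sha256 {sha256_hex}")


# Simulated network: upstream is gone, the mirror still has the bytes.
tarball = b"pretend this is a source tarball"
digest = hashlib.sha256(tarball).hexdigest()
served = {MIRROR + digest: tarball}


def fake_fetch(url: str) -> Optional[bytes]:
    return served.get(url)


result = fetch_with_fallback("https://example.org/foo-1.0.tar.gz", digest, fake_fetch)
```

Because the mirror lookup depends only on the hash, it keeps working no matter where the file originally lived.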

This problem is also something the Software Heritage project[0] aims to solve, but I don't think they have a good API yet.

[0] https://www.softwareheritage.org/

Yeah I'm aware of the build mirror, but crashcart builds things under a different prefix so it must build things from source each time. I realize this doesn't affect most users (which is why it takes a while for disappearing sources to be found and fixed). A content addressed mirror for sources as well would solve the problem nicely.
I'm sorry, the mirror is actually for sources, not build artifacts. I've updated my comment to clarify.
When was that added? Will fetchurl automatically poke it if it can't find the sources otherwise?
Interesting, I wasn't aware of that. I wonder why my builds are not using it.
It might be related to using a different Nix prefix for your builds, which is a little poorly supported. Just curious: Why are you using a different prefix?
The point of crashcart is to be able to side load a filesystem with utilities into a running container. It is very important that the location we pick doesn't conflict with any path already in the container. If we used /nix as the mount path it would conflict with any container that uses nix. In order to prevent this (probably rare) conflict, we build our utilities in /dev/crashcart/ instead.
What about running your own caching HTTP proxy for your build's external dependencies or else pulling static copies of these dependencies into your own repo? It seems like the problem isn't building everything from source but rather that the sources of truth for the inputs are unreliable. You'd have the same problem trying to build e.g. Debian from scratch if you couldn't reliably pull down all the sources for things.
A caching HTTP proxy would help me build the same things reliably, but it wouldn't help anyone else who cloned my repository and attempted to do their own build unless I also gave them access to the proxy. And it hides the fact that the standard from-scratch build doesn't actually work. I think the cached nix packages are why it takes so long for some of these issues to be discovered.

The difference with Debian (and other linuxes) is that the source code for the build is also provided. The upstream source might be from a random place on the net, but Debian provides a source package that you can use to rebuild the binary package. Maybe the solution is for nix/nixos to provide something similar.

I've not tried Nix but this seems like a huge oversight if there is no local cache?

Not everyone is blessed with unlimited gigabit fiber.

Edit: Seems like I was wrong, and I'm happy about it. :)

Generally everything is cached that you build/download. But only on the machine you do it on. That's why you usually want a collective cache inside your org additionally, because not all machines will have everything yet.
If you use a content-addressable scheme (fetchurl with sha256, for example), it will retain the source archives until you run garbage collection.

If you use builtins.fetchTarball, I don't think this is the case.

Since this all uses CAS, you can use the nix prefetch scripts to import an arbitrary file:// or other URI into the nix store.
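The idea behind importing a file into a content-addressed store can be sketched in a few lines of Python. This is a toy model, not the Nix store's real path scheme or the prefetch scripts themselves; `add_to_store` and the `<hash>-<name>` layout are made up for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def add_to_store(store: Path, name: str, data: bytes) -> Path:
    # The path is derived from the content hash, so identical bytes
    # always land at the same location, whatever URI they came from.
    digest = hashlib.sha256(data).hexdigest()
    path = store / f"{digest}-{name}"
    if not path.exists():
        path.write_bytes(data)
    return path


store = Path(tempfile.mkdtemp())  # throwaway directory standing in for a store
first = add_to_store(store, "named.root", b"root hints\n")
second = add_to_store(store, "named.root", b"root hints\n")
print(first == second)  # same content, same path
```

Once the bytes are in the store under their hash, a later build asking for that hash does not need the network at all.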

fetchTarball now also supports an optional sha256 argument. It'll then be used indefinitely without checking for changes after the TTL expires.
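The difference between the pinned and unpinned cases can be modelled as a small cache policy. The Python below only mirrors the behaviour described in this comment; `needs_refetch` and the TTL values are hypothetical, and it is not how Nix implements it.

```python
import time
from typing import Optional


def needs_refetch(
    fetched_at: float,
    ttl_seconds: float,
    pinned_sha256: Optional[str],
    now: Optional[float] = None,
) -> bool:
    # A pinned hash identifies the content exactly, so the cached copy
    # is reused indefinitely.
    if pinned_sha256 is not None:
        return False
    # Without a hash, the cached copy is only trusted until the TTL expires.
    now = time.time() if now is None else now
    return now - fetched_at > ttl_seconds


# Two hours after a fetch, with a one-hour TTL (illustrative numbers):
unpinned = needs_refetch(0.0, 3600.0, None, now=7200.0)
pinned = needs_refetch(0.0, 3600.0, "abc123", now=7200.0)
```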