Hacker News new | ask | show | jobs
by zeeboo 2883 days ago
I think there's empirical evidence that the SAT-solver approach is not necessary. I have done an analysis on as many Gopkg.{lock,toml} files as I could find, and in no instances did it ever do any non-trivial version selection: the maximal available version at the time was always selected. Additionally, Russ has stated that ~93% of the top 1000 Go packages in the wild build successfully with no changes. I appreciate that they may be convinced, but I think they need to ask themselves what evidence would change their mind. They have had months coming up on a year to figure this out.

https://github.com/zeebo/dep-analysis

2 comments

The more interesting question for me is what evidence would change Russ's mind?

The package managers for many successful languages and distributions use lockfiles and constraint solvers. Not only is that empirical evidence that it works technically, it is evidence that it works socially — users are able to understand and work with it, and the package ecosystems for those languages have evolved with those rules in place.

Empirical data from Go's own package ecosystem is useful too, but you can only learn so much about package management from a corpus that does not have sophisticated package management. The ecosystem has already learned to work within the restrictions so you'll mostly see packages that confirm the system's own biases.

It's like countering passers-by on a bike trail and concluding that the only vehicles users need are bikes.

I'm not saying vgo isn't better. But it's an unproven approach where lockfiles and constraint solving are proven, multiple times over. The burden of proof lies on vgo.

It's clear that a SAT solver is strictly stronger than the MVS approach. In other words, any MVS selection can be encoded in a SAT solver. The argument is for a reduction in power. Thus, in order to convince someone that a SAT solver is preferred over MVS, you must show examples where it succeeds when MVS fails, and the extra power is necessary. It's trivial to contrive these situations, but finding them in practice seems harder. Evidence of that happening would help change my mind, and I'd hope would help change Russ's mind.

The empirical data from Go's package ecosystem is drawn from a corpus with sophisticated package management: dep. The argument is that dep is unnecessarily powerful and that a simpler approach will suffice. The evidence supports that argument. Note that this is not an argument about Cargo, or Bundler, or any thing else. Right now, the ecosystem is using dep, and there is evidence that it can be done simpler.

To stick with your analogy, I think it's fair to conclude that the only vehicles users need on bike trails are bikes.

Additionally, I have done an analysis of two Rust projects that have been brought up in my discussions on this issue. Specifically, LALRPOP and exa. In both cases, throughout the entire history of the project (hundreds of changes over 4-5 years), Cargo only had to select the largest semver compatible version [1]. Again, I would love to find examples of projects where this strategy was not sufficient.

[1] There is one complication: in Cargo, a ^ constraint (the default kind) on a v0 dependency is only good up to the minor version. In other words, ^0.1.0 means >=0.1.0 and <0.2.0, where ^1.0.0 means >=1.0.0 and <2.0.0. Selecting the largest semver compatible version is meant in this way because of the community norms around breakage in v0. In an MVS world, any breaking change is a major version bump, and would have the same properties, but with different version strings.

I feel like this argument is focusing on the wrong thing. It doesn't matter if the entire corpus of go packages have trivial version requirements that don't require to resolve. What seems like a much more important issue is the fact that MVS literally picks different versions of dependencies. Specifically, it picks the oldest satisfiable dependency rather than the newest. And while this simplifies the algorithm, it also has the consequence that you don't get any bugfixes to packages if you haven't explicitly requested the version that includes the bugfixes.

One of the main benefits of semantic versioning is that you can upgrade packages to new minor and patchlevel versions without breaking backwards compatibility, thus allowing you to easily pick up bugfixes by simply updating your dependencies. Of course, you should still test after updating, as packages could introduce new bugs, but on the whole updating minor and patchlevel versions is far more likely to fix bugs than it is to introduce them. But MVS discards this benefit and says you cannot get bugfixes unless you're willing to manually edit your dependency list to declare that you want the newer package. The net result is that packages that use vgo are likely to end up stuck on old versions of dependencies. This is especially true for indirect dependencies. If I publish a library, my incentive is to declare the oldest version of my own dependencies that I'm compatible with, in order to give my upstream user the most control over dependency versions. But if my library's client doesn't know about my own dependencies, then this means my library's dependencies will almost certainly resolve to a really old version, and my library's client won't even know about it and so won't be in a position to request the newer, less-buggy package version. Which then means I have an incentive to instead constantly update my library to list the newest versions of my dependencies, which then forces my library's client to upgrade those dependencies even if my library's client would prefer to be conservative and not upgrade those dependencies simply because they need to upgrade my library.

The first package installer I was aware of that used a SAT solver was SUSE's, back when Yum's solver was extremely primitive and would regularly fail to find a solution.

https://en.wikipedia.org/wiki/ZYpp

I think using a SAT solver for package installation arose out of dealing with much more complex requirements than are likely to arise in a Go project. The Smart package installer used heuristics to find a solution depending upon the operation:

https://bazaar.launchpad.net/~smartpm/smart/trunk/view/head:...

Poetry, Cocoapods, and Dart are all apparently using this SAT solver:

https://github.com/dart-lang/pub/blob/master/doc/solver.md

FWIW, I wrote a package installer that worked with RPMs, Solaris packages and AIX packages about 15 years ago and ended up with a minimal version selection similar to vgo. I wasn't a genius or anything... I just wasn't aware of SAT solvers at the time and it was the simplest thing that worked.