by guitarbill 3743 days ago
Python dependency management is a hard problem, but Python handles it better than most languages [citation needed]. And `pip` and `setup.py` have emerged over several years, with several influences merged in (remember distutils?).

Honestly, I wish you'd picked a different tag-line though (riffing on `requests` no doubt). Unlike `requests`, your solution only works for a subset of deployment situations, because - as already pointed out - `setup.py` and `requirements.txt` are for different things.

One of the best examples of this I've seen is using both to deploy to a server with no internet connectivity. For development, the dependencies are installed from `setup.py`. Then, before deploying, all dependencies are downloaded via `pip download`. Finally, the downloaded packages are copied to the server and installed using `requirements.txt` with `--no-index` and `--find-links`. Definitely an interesting setup, but needs must. Unfortunately, your solution doesn't support `--no-index`, `--find-links` and a few others.
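
To make that workflow concrete, here is a rough sketch of the two pip invocations involved. The `wheelhouse` directory name and both helper function names are made up for illustration; `pip download -d`, `--no-index` and `--find-links` are the real pip options. The commands are only assembled and printed here, not executed.

```python
import shlex


def download_command(requirements="requirements.txt", dest="wheelhouse"):
    """Command for the connected build machine: fetch every dependency into dest.

    The default file and directory names are assumptions for this sketch.
    """
    return ["pip", "download", "-r", requirements, "-d", dest]


def offline_install_command(requirements="requirements.txt", links="wheelhouse"):
    """Command for the offline server: install only from the copied directory.

    --no-index stops pip from contacting PyPI; --find-links points it at the
    local directory of downloaded packages instead.
    """
    return ["pip", "install", "--no-index", "--find-links", links, "-r", requirements]


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch has no side effects.
    for command in (download_command(), offline_install_command()):
        print(shlex.join(command))
```

The first command runs wherever there is network access; the `wheelhouse` directory is then copied to the server, where the second command runs.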

You may want to have a look at tools like `pbr` (Python Build Reasonableness) [1], which has an interesting way of dealing with some hard problems. It also shows how to use `setup_requires` so you don't have to have `requirements.py` hanging around in your repo.

[1] http://docs.openstack.org/developer/pbr/


I split my time pretty evenly between the Python and Node ecosystems, and I've found that Python dependency management is extremely poor compared to Node dependency management.

For example, Python's lack of support for nested dependencies means that it can be extremely difficult to use small "utility" libraries like six as subdependencies without running into potential problems from conflicting version requirements from other dependencies.

Additionally, Python's management of different types of dependencies is very weak, specifically with regard to setup.py and requirements.txt. npm has (among other things) very nice explicit concepts of development dependencies and regular dependencies, in addition to application shrink-wraps, which gives library maintainers very easy ways to split out different kinds of dependencies, or lock down all dependencies and subdependencies, using the same set of tools.

While I can say that the current version of pip, especially in conjunction with pip-tools, is significantly better than earlier iterations of Python packaging, I strongly hold that Python packaging is substantially worse than at least one other prominent example.

Interesting, I've never hit this, and I thought I'd deployed some fairly complex apps, but you're right of course. The site-packages approach isn't great, and means you pretty much need virtualenv, especially on e.g. Debian where the system Python is really important (I do love virtualenvs, and for a workaround they're pretty solid). Nice to see npm has this pretty much baked in (is my understanding?).

And don't get me started on eggs. I guess the nice thing about Python is that the packaging genuinely seems to be improving year-to-year, although still limited by design decisions and backwards compatibility.

Anyway, thanks for this; I've had a pretty informative morning searching for "npm vs pip". I'm not a Javascript dev, and when I've tried to use it I've struggled with the sheer bloat, even e.g. npm vs bower. Even though I'm a bit jealous of npm now, I'll take pip over package management in the usual enterprise juggernauts (C, C#, Perl or Java) any day!

npm has a feature that's broadly equivalent to virtualenvs, in that each directory is implicitly a virtualenv, and you use the packages installed in a given directory (or its ancestors). This is a bit less powerful than virtualenvs since it's tied to the directory structure, but in practice it ends up being really convenient for most use cases. Along those lines, there are also some cool affordances for managing dependencies – you can do e.g. "npm install --save <package>" to install a package and update the equivalent of setup.py with the new package, and it will automatically set that up with a semver range that matches the current stable release of that package.
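
For anyone unfamiliar with that semver range: assuming npm's default caret prefix (e.g. `^1.2.3` written to `package.json`), the range can be sketched roughly as below. The function names are hypothetical, and the sketch only handles plain `major.minor.patch` versions; pre-release tags and other range syntax are ignored.

```python
def caret_range(version):
    """Return (lower, upper) bounds for ^version, meaning lower <= v < upper.

    The upper bound bumps the left-most non-zero component, so ^1.2.3 allows
    anything below 2.0.0, while ^0.2.3 only allows versions below 0.3.0.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return (major, minor, patch), upper


def satisfies(candidate, version):
    """Check whether candidate falls inside the caret range of version."""
    lower, upper = caret_range(version)
    parsed = tuple(int(part) for part in candidate.split("."))
    return lower <= parsed < upper
```

So a package saved at `1.2.3` would accept a later `1.4.0` but not `2.0.0`.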

That's actually not what I'm talking about, though, since ultimately those are just DX conveniences. The big difference is that, by default, npm installs dependencies in a nested rather than a flat manner. This means that you can install "library A" and "library B", which both depend on "library C", without worrying about potential incompatibilities between the required versions of "library C". This isn't possible in Python; one consequence is that you see a bunch of libraries that vendor their own small subsets of six to avoid having potential dependency issues – you don't get this in the Node ecosystem, because you can just pull in that dependency and not worry about potential version conflicts.
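
A toy model of the difference, in case it helps. This is not how pip or npm actually resolve anything; the data layout (versions as tuples, each dependency mapped to a set of acceptable versions) and the function names are assumptions made purely for illustration.

```python
def flat_conflicts(requirements):
    """Return dependencies that no single shared copy can satisfy.

    requirements maps each top-level library to
    {dependency name: set of acceptable versions}.
    """
    acceptable = {}
    for deps in requirements.values():
        for name, versions in deps.items():
            if name in acceptable:
                # A flat install has one copy, so every requirer must agree.
                acceptable[name] &= versions
            else:
                acceptable[name] = set(versions)
    return sorted(name for name, versions in acceptable.items() if not versions)


def nested_install(requirements):
    """Give each library its own copy of every dependency (newest acceptable)."""
    return {
        library: {name: max(versions) for name, versions in deps.items()}
        for library, deps in requirements.items()
    }


# "library A" and "library B" both depend on "library C", with disjoint versions.
requirements = {
    "library_a": {"library_c": {(1, 0), (1, 1)}},
    "library_b": {"library_c": {(2, 0)}},
}

if __name__ == "__main__":
    print(flat_conflicts(requirements))
    print(nested_install(requirements))
```

In the flat model `library_c` ends up with no acceptable version, whereas the nested model simply installs two copies.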

All of this adds up to a much nicer experience when developing libraries.

Node is quite bad - it encourages a proliferation of small libraries and removes the need to move versions forward. This, in turn, leads to a wasteland of old, insecure dependencies in the long term.

Thank you! I definitely need to have a look at `pbr`, it looks very interesting.

`requirements.py` isn't meant to be "the" universal solution to Python's dependency problem. It's more like syntactic sugar for "simple" setups.

That's fair. If you could get installation via `setup_requires` to work, so people don't need a (soon to be out-of-date) copy of `requirements.py` in their repo, that'd be a win imo.