by angrygoat 2411 days ago
The big gap is management of the full dependency tree. With yarn I can get a yarn.lock, which pretty well ensures I'll have the exact same version of everything, with no unexpected changes, every time I run yarn install. I get the same thing in the Rust world with Cargo.

In Python it's a mess. Some packages specify their deps in setup.py; some in a requirements file, which may or may not be read in by their setup.py. It's not rare to need multiple 'pip install' commands to get a working environment, especially when installing fairly boutique extensions to frameworks like Django.

There just isn't a holistic, opinionated approach to specifying dependencies, especially at a project level. Which leaves unexpected upgrades of dependencies (occasionally leading to regressions) as a reality for Python devs.
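To make the "unexpected upgrades" point concrete, here is a minimal sketch of a check that flags requirement entries without an exact pin. The function name and the sample entries are made up for illustration, and the parsing is deliberately naive (it only looks for "=="), so treat it as an assumption-laden example rather than how pip reads requirements files.

```python
def unpinned_requirements(lines):
    """Return requirement entries that are not pinned to an exact version.

    Naive on purpose: an entry counts as pinned only if it contains '=='.
    """
    unpinned = []
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt" or "-e .".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Hypothetical requirements.txt contents.
example = ["django==2.2.7", "requests>=2.0", "celery", "# tooling", ""]
print(unpinned_requirements(example))
```

Anything this returns can resolve to a different version on the next install, which is exactly the gap a lock file closes.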

4 comments

There are two new tools in the Python ecosystem, which try to fill the gap left by cargo, npm, yarn & co.:

One is pipenv [0], which works similarly to yarn & co. It uses Pipfile/Pipfile.lock to define and lock dependencies. Pipenv has a major flaw: it can't be used to publish packages on pypi.org (you still need twine & setup.py for that). It's also known for being slow and somewhat buggy. Despite all that, pipenv is an "official" tool maintained by the "Python Packaging Authority".

The other one is poetry [1], which works exactly like yarn & co. It uses "pyproject.toml" to specify dependencies and "poetry.lock" to lock them. Poetry does most of the things "right", but it's still an underdog compared to pipenv.

Neither tool has fully matured yet, hence the many complaints.

[0]: https://github.com/pypa/pipenv

[1]: https://github.com/sdispater/poetry

There's good news on that front: the Python Packaging Working Group has secured >$400K in grants to improve pip's dependency resolver: https://twitter.com/di_codes/status/1193980331004743680

I've worked on tons of small to medium-small Python projects over the years where I didn't fix dependency versions at all, not even major versions, just a requirements.txt with a list of package names (usually it's a list of maybe at most ten well-known libraries, resulting in at most twenty actual packages pulled from PyPI). Come back three years later, pull the latest versions of everything, code still works fine.

Now try that with JavaScript or Rust. If you don't fix versions, come back three months later and compatibility is usually fucked up beyond all recognition.

Some languages embraced better dependency locking because they simply couldn't afford to leave the problem unsolved.

I’ve only recently started working with Python, and I’ve already been bitten by TensorFlow v1 and v2 packages having different APIs, so the viability of that approach will depend heavily on which packages you use.

However, in SemVer a major version number change is how breaking changes are signalled, so a v1 to v2 change that requires some work to fix breakage is just expected, and may well be necessary for a project to make progress.
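As a rough illustration of that rule, the sketch below treats an upgrade as potentially breaking when the major component increases. The helper names are invented here, and it assumes plain MAJOR.MINOR.PATCH strings with no pre-release or build suffixes.

```python
def parse_version(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking_upgrade(old, new):
    """True if going from old to new bumps the major version.

    Note: under SemVer, 0.x releases may break at any time, which this
    simple check does not account for.
    """
    return parse_version(new)[0] > parse_version(old)[0]


print(is_breaking_upgrade("1.15.0", "2.0.0"))
print(is_breaking_upgrade("2.0.0", "2.1.0"))
```

This only works to the extent that a package actually follows SemVer; it says nothing about projects that break APIs in minor releases.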

ML, pydata etc. are really worlds apart from more traditional Python ecosystems; Guido himself admitted a couple of years back that he had no idea about those silos, and sat down with some leaders of those communities to hear their needs. Those communities tend to have their own recommendations and best practices.

My very brief exposure to TF seems to suggest that dev environments surrounding TF are way harder to set up than my “list of bare packages in a requirements.txt file” scenario which is sufficient for a lot of more traditional endeavors.

> With yarn I can get a yarn.lock

pipenv generates a Pipfile.lock, if it can. I've used it primarily for Airflow, and some packages within Airflow have incompatible version ranges for the same dependency, which means it can't generate the lock file.
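For anyone wondering why incompatible ranges make locking impossible, here is a toy sketch. It is not pipenv's resolver; the range representation (inclusive lower bound, exclusive upper bound, as tuples of ints) and the example constraints are assumptions made up for illustration.

```python
def intersect_ranges(ranges):
    """Intersect version ranges given as (lower_inclusive, upper_exclusive).

    Returns the overlapping range, or None if no version satisfies all of them.
    """
    lower = max(lo for lo, _ in ranges)
    upper = min(hi for _, hi in ranges)
    return (lower, upper) if lower < upper else None


# Hypothetical: package A wants dep >=1.0,<2.0 and package B wants dep >=2.1,<3.0.
conflicting = [((1, 0), (2, 0)), ((2, 1), (3, 0))]
print(intersect_ranges(conflicting))

# Hypothetical: package A wants dep >=1.0,<2.0 and package C wants dep >=1.4,<3.0.
compatible = [((1, 0), (2, 0)), ((1, 4), (3, 0))]
print(intersect_ranges(compatible))
```

When the intersection is empty there is no single version to write into the lock file, so a tool that insists on one consistent set of versions has to give up.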

Oh, thanks - I'll give it a whirl!