by drcongo 2214 days ago
Every attempt to solve this problem in Python seems to eventually end up in a pretty terrible place. Pipenv got off to a great start but got slower and slower, to the point that it was more painful to use than not. Poetry (which is still my preferred option) started off with something seemingly beautifully thought through, and very fast too. But after only a few version updates, it seems to be hitting the same problems Pipenv did. On one project I was working on recently I managed to screw up the `poetry.lock` file, so I ran `poetry lock` and it took 18 minutes. I still have high hopes for Poetry, but I spend way more time trying to work around its shortcomings now (v1.0.5) than I did when it was at version 0.10.0 two years ago.

Poetry is better than Pipenv by a mile. It solves almost all of the problems, and the remaining ones are already on the Poetry roadmap.
Oh, totally agree. But it also seems to have a lot more problems than it did two years ago.
I'd chalk this up to dependency management and resolution being a hard problem.

Ruby's Bundler had these exact same issues 5 or so years ago. I remember attending a talk on Bundler given by its core devs and asking how they make dep resolution faster. Turns out that it was never really a solved problem there either; Bundler just uses a bunch of heuristics to avoid cases like the 18-minute `poetry lock` described above.

I get that resolving dependencies is a SAT problem and inherently intensive; however, I don't understand why it's so much slower in Python. Is it just that all of these resolvers are implemented in Python (and Python is really that much slower than other languages?), or does Python require you to download an entire package just to determine its dependencies? In the latter case, that seems pretty dumb, right? Like as bad as exposing the entire interpreter as the extension interface, rendering optimizations and competing interpreters virtually impossible.
> does Python require you to download an entire package just to determine its dependencies?

yes - the standard way of defining dependencies in Python is in setup.py, which has to be invoked as a Python script in order to work. this script may also need to read files from the rest of the project, so you do indeed need to download the whole package to determine its dependencies.

even if the Python community were to agree on a new configuration format tomorrow, there would still be a ton of packages out there that wouldn't migrate for years.

It seems that this information should be cacheable after an invocation of setup.py, at least for an installation without any extras. And perhaps even with extras requested.

Or is there any even greater hidden challenge from using setup.py?
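
To make the caching idea concrete, here is a minimal sketch. It assumes the result of running setup.py depends only on the package, its version, and the platform it ran on, which is not guaranteed in general. `run_setup_py`, `examplepkg`, `dep-a`, and `dep-b` are made-up names for illustration; the function is a stand-in for the expensive download-and-execute step, not a real tool's API.

```python
import functools

calls = []  # records every simulated "download and run setup.py"

def run_setup_py(name, version, platform):
    """Hypothetical stand-in for the expensive step of fetching and executing setup.py."""
    calls.append((name, version, platform))
    # Made-up result: one extra dependency on Windows.
    return ("dep-a",) if platform != "win32" else ("dep-a", "dep-b")

@functools.lru_cache(maxsize=None)
def cached_requirements(name, version, platform):
    # The platform is part of the cache key, so a result computed
    # on one OS is never reused for another.
    return run_setup_py(name, version, platform)

cached_requirements("examplepkg", "1.0", "linux")
cached_requirements("examplepkg", "1.0", "linux")  # served from the cache
cached_requirements("examplepkg", "1.0", "win32")  # different key, runs again
print(len(calls))
```

The catch is the cache key: anything setup.py can observe (OS, Python version, environment variables) would have to be in it for the cache to be safe.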

setup.py can check the OS and pick the requirements it needs, so the same package can have different dependencies on different OSes.

I've used it like this - https://github.com/JaDogg/pydoro/blob/b1b3de38ac15b9254ef1be...
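
For anyone who hasn't seen the pattern, a rough sketch of what that kind of logic looks like (not the code from the link above). `install_requires` is written here as a plain function so it can be run on its own, and `dep-a` and `dep-b` are hypothetical package names.

```python
import sys

def install_requires(platform=sys.platform):
    """The kind of decision a setup.py can make when it is executed."""
    requirements = ["dep-a"]  # hypothetical dependency needed everywhere
    if platform.startswith("win"):
        requirements.append("dep-b")  # hypothetical Windows-only dependency
    return requirements

print(install_requires("linux"))
print(install_requires("win32"))
```

Because the list is computed by running code, a resolver can't know it without executing the script in the target environment.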

Does this imply that poetry will "solve" dependencies quicker the more of the dependencies use `pyproject.toml`? Or is that "hidden" once something is sent to PyPI anyways?
With several asterisks attached, but yes, the move would make dependency resolution faster.
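
As a sketch of why static metadata helps: assuming the dependencies are available as a plain text file, like the `METADATA` file inside a wheel with its `Requires-Dist` lines, they can be read with a parser instead of by executing project code. The metadata text below is invented for an imaginary `examplepkg`, and whether a tool can get at this file without downloading the package is one of the asterisks.

```python
from email.parser import Parser

# Invented metadata for an imaginary package, in the "Key: value"
# format used by the METADATA file inside a wheel.
METADATA_TEXT = """\
Metadata-Version: 2.1
Name: examplepkg
Version: 1.0
Requires-Dist: dep-a>=1.0
Requires-Dist: dep-b; sys_platform == "win32"
"""

def declared_dependencies(metadata_text):
    """Read dependency declarations without running any project code."""
    message = Parser().parsestr(metadata_text)
    return message.get_all("Requires-Dist") or []

deps = declared_dependencies(METADATA_TEXT)
print(deps)
```

Note that the OS-specific case is still expressible here, as a marker on the requirement rather than as an `if` in a script.
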
That's why conda stores the metadata in a repodata file. Solving dependencies happens first; then binary packages are downloaded.
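
A toy sketch of that "solve first, download second" order, not conda's actual solver or repodata format. `INDEX` is a made-up in-memory metadata index with hypothetical packages, and the search is plain backtracking, which also shows where the time goes when early choices turn out to conflict.

```python
# Made-up index: name -> version -> {dependency name: allowed versions}
INDEX = {
    "app": {"1.0": {"web": {"2.0", "1.0"}, "db": {"1.0"}}},
    "web": {"2.0": {"json": {"2.0"}}, "1.0": {"json": {"1.0"}}},
    "db": {"1.0": {"json": {"1.0"}}},
    "json": {"2.0": {}, "1.0": {}},
}

def resolve(pending, chosen):
    """Backtracking search over metadata only; nothing is downloaded here."""
    if not pending:
        return chosen
    (name, allowed), rest = pending[0], pending[1:]
    if name in chosen:
        # Already pinned: the pin has to satisfy this constraint as well.
        return resolve(rest, chosen) if chosen[name] in allowed else None
    # Try newest first (plain string ordering is good enough for this toy data).
    for version in sorted(allowed & set(INDEX[name]), reverse=True):
        deps = list(INDEX[name][version].items())
        result = resolve(rest + deps, {**chosen, name: version})
        if result is not None:
            return result
    return None  # no candidate worked, so the caller backtracks

plan = resolve([("app", {"1.0"})], {})
to_download = [f"{name}-{version}" for name, version in plan.items()]
print(to_download)
```

Here the resolver first tries `web` 2.0, hits the conflict on `json` (since `db` needs `json` 1.0), and falls back to `web` 1.0, all without fetching a single package. Only the final `to_download` list would need network access.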