Hacker News
by duped 540 days ago
My experience with conda is that it's fine if you're the original author of whatever you're using it for and never share it with anyone else. But as a professional I usually have to pull in someone else's work and make it function on a completely different machine/environment. I've only had negative experiences with conda for that reason. IME the hard job of package management is not getting software to work in one location, but allowing that software to be moved somewhere else and used in the same way. Poetry solves that problem, conda doesn't.

Poetry isn't perfect, but it's working in an imperfect universe and at least gets the basics (lockfiles) correct to where packages can be semi-reproducible.

There's another rant to be had at the very existence of venvs as part of the solution, but that's neither poetry's nor anaconda's fault.


Poetry is pretty slow. I think `uv` will ultimately displace it on that basis alone.
For what it’s worth – A small technical fact:

It is entirely possible to use poetry to determine the precise set of packages to install and write a requirements.txt, and then shotgun install those packages in parallel. I used a stupidly simple fish shell for loop that ran every requirements line as a pip install with an “&” to background the job and a “wait” after the loop. (iirc) Could use xargs or parallel too.

This is possible at least. Maybe it breaks in some circumstances but I haven’t hit it.
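
A rough Python equivalent of that fish loop, as a sketch only. It assumes the exported requirements.txt has one pinned requirement per line (no hash continuation lines). The `--no-deps` flag is my addition, on the assumption that the lock step already listed every package. The helper names (`parse_requirements`, `pip_command`, `install_all`) are made up for illustration. Concurrent installs into one environment may well race, which is probably the "breaks in some circumstances" case.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def parse_requirements(text: str) -> list[str]:
    """Return one requirement per non-empty line, skipping comments and options."""
    reqs = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith(("#", "-")):
            reqs.append(line)
    return reqs


def pip_command(requirement: str) -> list[str]:
    # Assumption: --no-deps, because the lock step already resolved everything
    # and pip should not try to resolve again.
    return [sys.executable, "-m", "pip", "install", "--no-deps", requirement]


def install_all(requirements, max_workers=8, run=subprocess.run):
    """Run one pip process per requirement, up to max_workers at a time."""
    commands = [pip_command(r) for r in requirements]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Leaving the "with" block waits for every job, like "wait" in the shell.
        return list(pool.map(run, commands))
```

Usage would be something like `install_all(parse_requirements(open("requirements.txt").read()))`. Capping `max_workers` also keeps the number of simultaneous pulls from the package server bounded.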

That poor package server getting 39 simultaneous pulls from one user.
This is indeed something to consider!

Not as an excuse for bad behavior but rather to consider infrastructure and expectations:

The packages might be cached locally.

There might be many servers – a CDN and/or mirrors.

Each server might have connection limits.

(The machine downloading the packages miiiiiight be able to serve as a mirror for others.)

If these are true, then it’s altruistically self-interested for everyone that the downloader gets all the packages as quickly as possible to be able to get stuff done.

I don’t know if they are true. I’d hope that local caching, CDNs and mirrors as well as reasonable connection limits were a self-evident and obviously minimal requirement for package distribution in something as arguably nation-sized as Python.

And… just… everywhere, really.

I actually can't believe how fast `uv` is.
Ditto. It’s wild.
Poetry is a pain. uv is much better IME/IMO.
Can you recommend any good article / comparison of uv vs poetry vs conda?

We've used different combinations of pipx+lockfiles or poetry, which has been so far OK'ish. But we recently discovered uv and are wondering about existing experience so far across the industry.

From my experience, uv is way better and it's also PEP compliant in terms of pyproject.toml, which means that in case uv isn't a big player in the future, migrating away isn't too difficult.

At the same time, poetry still uses a custom format and is pretty slow.

I wrote an overview, but didn't post benchmarks https://dublog.net/blog/so-many-python-package-managers/
How is uv so much faster? My understanding is Poetry is slow sometimes because PyPI doesn't have all the metadata required to solve things, so it needs to download packages and then figure it out.
If I recall correctly, uv is doing some ninja stuff like guessing the part of the relevant file that is likely to contain the metadata it needs and then doing a range request to avoid downloading the whole file.
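
To make the range-request idea concrete, here is a minimal sketch. It is not uv's actual implementation. It relies on a wheel being a zip archive with its index at the end of the file, so a reader can look at the tail, locate the METADATA entry and fetch only that slice. `RangeReader` is a hypothetical stand-in for an HTTP client issuing Range requests: it serves bytes from memory and counts how many were asked for. The fake wheel and its file names are made up.

```python
import io
import zipfile


class RangeReader(io.RawIOBase):
    """Seekable view over a blob; each read() stands in for one Range request."""

    def __init__(self, blob: bytes):
        super().__init__()
        self._blob = blob
        self._pos = 0
        self.bytes_fetched = 0  # total bytes a real client would have downloaded

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        elif whence == io.SEEK_END:
            self._pos = len(self._blob) + offset
        return self._pos

    def read(self, size=-1):
        if size is None or size < 0:
            size = len(self._blob) - self._pos
        chunk = self._blob[self._pos:self._pos + size]
        self._pos += len(chunk)
        self.bytes_fetched += len(chunk)
        return chunk


def build_fake_wheel() -> bytes:
    """Build an in-memory zip: a large stored module plus a small METADATA file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("demo/big_module.py", b"x" * 1_000_000)
        zf.writestr(
            "demo-1.0.dist-info/METADATA",
            "Name: demo\nVersion: 1.0\nRequires-Dist: requests>=2\n",
        )
    return buf.getvalue()


def read_metadata(reader) -> str:
    """Read only the zip index and the METADATA member through the reader."""
    with zipfile.ZipFile(reader) as zf:
        name = next(n for n in zf.namelist() if n.endswith(".dist-info/METADATA"))
        return zf.read(name).decode("utf-8")
```

Calling `read_metadata(RangeReader(build_fake_wheel()))` and then checking `bytes_fetched` on the reader shows that only a small fraction of the archive is touched. A real client would also need a server that honours Range requests and a fallback for when it does not.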
Thanks, that makes sense. I guess Poetry could add that if they liked.
+1. On top of that, even with the new resolver it still takes ages to resolve a dependency for me, so sometimes I end up just using pip directly. Not sure if I am doing something wrong (maybe you have to manually tweak something in the configs?) but it's pretty common for me to experience this.
Like sibling comments, after using poetry for years (and pipx for tools), I tried uv a few months ago

I was so amazed by the speed, I moved all my projects to uv and have not yet looked back.

uv replaces all of pip, pipx and poetry for me. It does not do more than these tools, but it does it right and fast.

If you're at liberty to try uv, you should try it someday, you might like it. (nothing wrong with staying with poetry or pyenv though, they get the job done)

I believe the problem is the lack of proper dependency indexing at PyPI. The SAT solvers used by poetry or pdm or uv often have to download multiple versions of the same dependencies to find a solution.
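
A toy illustration of why a resolver ends up fetching metadata for several versions of the same package. This is a naive backtracking search, not the algorithm poetry, pdm or uv actually use, and the index below is invented. Each call to the hypothetical `fetch_metadata` stands in for a download that would be unnecessary if the index could answer "what does version X depend on?" up front.

```python
# Hypothetical index: package -> version -> {dependency: allowed versions}
INDEX = {
    "app": {1: {"a": {1, 2}, "b": {1, 2}}},
    "a": {2: {"c": {2}}, 1: {"c": {1}}},
    "b": {2: {"c": {1}}, 1: {"c": {1}}},
    "c": {1: {}, 2: {}},
}

fetched = []  # every (package, version) whose metadata had to be "downloaded"


def fetch_metadata(name, version):
    """Stand-in for downloading a package just to learn its dependencies."""
    fetched.append((name, version))
    return INDEX[name][version]


def resolve(requirements, chosen):
    """Backtracking search over a list of (name, allowed_versions) pairs.

    Returns a {name: version} dict, or None if no assignment works.
    """
    if not requirements:
        return chosen
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in chosen:
        # Already pinned: the pin must satisfy this requirement as well.
        return resolve(rest, chosen) if chosen[name] in allowed else None
    for version in sorted(INDEX[name], reverse=True):  # prefer newest first
        if version not in allowed:
            continue
        deps = fetch_metadata(name, version)
        result = resolve(rest + list(deps.items()), {**chosen, name: version})
        if result is not None:
            return result
    return None  # nothing fits here, so the caller has to backtrack


solution = resolve([("app", {1})], {})
print(solution)
print(len(fetched), "metadata fetches")
```

Here the newest `a` needs a `c` that no version of `b` accepts, so the search has to pull metadata for both versions of `a` before it finds a consistent set.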
imagine being a beginner to programming and being told "use venvs"

or worse, imagine being a longtime user of shells but not python and then being presented a venv as a solution to the problem that for some reason python doesn't stash deps in a subdirectory of your project

You don't need to stash deps in a subdirectory, IMHO that's a node.js design flaw that leads to tons of duplication. I don't think there's any other package manager for a popular language that works like this by default (Bundler does allow you to version dependencies which can be useful for deployment, but you still only ever get one version of any dependency unlike node).

You just need to have some sort of wrapper/program that knows how to figure out which dependencies to use for a project. With bundler, you just wrap everything in "bundle exec" (or use binstubs).

What was unique to node.js was the decision to not only store the dependencies in a sub-folder, but also to apply that rule, recursively, for every one of the projects you add as a dependency.

There are many dependency managers that use a project-local flat storage, and a global storage was really frowned upon until immutable versions and reliable identifiers became popular some 10 years ago.

Wasn't node the only programming language that used a subdirectory for deps by default?

Ruby and Perl certainly didn't have it - although Ruby did subsequently add Bundler to gems and gems supported multiversioning.

It’s fairly common for Perl apps to use Carton (more or less a Perl clone of Bundler) to install vendored dependencies.
Oh that's nice. When I last looked (quite a long time ago), local::lib seemed to be the recommended way, and that seemed a bit more fiddly than python's virtualenv.
Carton uses local::lib under the covers. I found local::lib far less fiddly than virtualenv myself, but it just doesn't try to do as much as virtualenv. These days I do PHP for a living, and for all the awfulness in php, they did nail it with composer.
Rust, julia, elixir
julia just stores the analogue of a requirements.txt (Project.toml) and the lock file (Manifest.toml). And it has its own package issues, including packages regularly breaking for every minor release (although i enjoy the language and will keep using it)
yep, i was wrong about julia.
All those came after Python/C/C++ etc which were all from the wild-west of the "what is package management?" dark ages. The designers of those languages almost certainly thought the exact thought of "how can we do package management better than existing technology like pip?"
Rust doesn't store dependencies under your project dir, but it does build them under your target.
I have imagined this, because I've worked on products where our first-time users had never used a CLI tool or REPL before. It's a nightmare. That said, it's no less a nightmare than every other CLI tool, because even our most basic conventions are tribal knowledge that is not taught outside of our communities, and it's always an uphill battle teaching ones that may be unfamiliar to someone from a different tribe.
It is true that every field (honestly every corner of most fields) has certain specific knowledge that is both incredibly necessary to get anything done, and completely arbitrary. These are usually historical reactions to problems solved in a very particular way usually without a lot of thought, simply because it was an expedient option at the time.

I feel like venv is one such solution. A workaround that doesn’t solve the problem at its root, so much as make the symptoms manageable. But there is (at least for me) a big difference between things like that and the cool ideas that underlie shell tooling like Unix pipes. Things like jq or fzf are awesome examples of new tooling that fit beautifully in the existing paradigm but make it more powerful and useful.

Beginners in Python typically don't need venvs. They can just install a few libraries (or no libraries even) to get started. If you truly need venvs then you're either past the initial learning phase or you're learning how to run Python apps instead of learning Python itself.

For some libraries, it is not acceptable to stash the dependencies for every single toy app you use. I don't know how much space TensorFlow or PyQt use but I'm guessing most people don't want to install those in many venvs.

Intelligent systems simply cache and re-use versions, and do stash deps for every toy project without consuming extra space.

Also installing everything with pip is a great way to enjoy unexplainable breakage when A doesn't work with v1 and B doesn't work with v2.

It also leads to breaking Linux systems where a large part of the system is python code. Especially when a user upgrades the system python for no reason.

If you install a package in a fresh environment then it does actually get installed. It can be inherited from the global environment but I don't think disparate venvs that separately install a package actually share the package files. If they did, then a command executed in one tree could destroy the files in another tree. I have not done an investigation to look into this today but I think I'm right about this.
In better designed systems than python they do. To share them with python you need something with dedup, e.g. Btrfs or ZFS.
Python's venv design is not obviously unintelligent. It must work on all sorts of filesystems, which limits how many copies can be stored and how they can be associated. More advanced filesystems can support saving space explicitly for software that exploits them, and implicitly for everyone, but there is a cost to everything.
i remember reading somewhere (on twitter iirc) an amateur sex survey statistician who decided she needed to use python to analyze her dataset, being guided toward setting up venvs pretty early on by her programmer friends and getting extremely frustrated.
Was it aella? I don't know of any other sex survey statisticians so I'm assuming you mean aella. She has a pretty funny thread here but no mention of venvs: (non-musk-link https://xcancel.com/Aella_Girl/status/1522633160483385345)

  Every google for help I do is useless. Each page is full of terms I don't understand at *all*. They're like "Oh solving that error is simple, just take the library and shove it into the jenga package loader so you can execute the lab function with a pasta variation".
She probably would have been better off being pointed towards jupyter, but that's neither here nor there
Good grief there seems to be no getting away from that woman. One of my ex girlfriends was fascinated by her but to me she is quite boring. If she wasn't fairly attractive, nobody would care about her banal ramblings.