For those interested in uv (instead of pip), uv massively sped up the release process for Home Assistant. The time needed to make a release went down from ~2.5 hours to ~20 minutes. See https://developers.home-assistant.io/blog/2024/04/03/build-i... for details. I'm just a HA user btw.
Speaking as someone who works on pip's performance: no one else was able to reproduce OP's severe performance issue. I'm not saying it didn't happen, just that it was an edge case on specific hardware (I am assuming it was this issue: https://github.com/pypa/pip/issues/12314).
Since it was posted, a lot of work has been done on the areas that likely caused the performance problems, and I would expect the latest version of pip to show at least a doubling in performance. For example, I created a scenario similar to OP's that dropped from 266 seconds to 48 seconds on my machine, and more improvements have been made since then. However, OP has never followed up to let us know if it improved.
Now, that's not to say you shouldn't use uv; its performance is great. But a lot of volunteer work has been put in over the last year (well before uv was announced) to improve the default Python package install performance. And one last thing:
> for a non-compiler language?
Installing packages from PyPI can involve compiling C, C++, Rust, etc. Python's packaging is very very flexible, and in lots of cases it can take a lot of time.
Python is slow compared to Rust, obviously. Beyond that, pip is at this point carrying a bunch of legacy decisions because of the ludicrously large number of hard left turns the Python packaging ecosystem has taken over the last 20 years.
Home Assistant is an absolute behemoth of a project, especially with regard to dependencies. Dependency resolution across a project of that size is nuts. There are probably few projects right now that'd see as big an improvement as HA.
I'm not even sure how their packaging works. Last I checked they had some kind of extra installer that works at runtime?
I have no idea how they keep version conflicts from breaking everything. Do integrations have isolation or something?
I wish Python had a native way to let different libraries in the same project depend on different versions of a transitive dependency. It seems like that would make a lot of stuff simpler in big projects.
If no hard versions are given, but only e.g. <= 2.1, pip will download EVERY SINGLE VERSION up to 2.1 to look for metadata. That can easily take hours if it happens multiple times.
When testing previous versions of uv, I saw it do that too. But uv uses other tricks to speed things up: it downloads in parallel, it takes advantage of PEP-658 metadata (which doesn't need to download the package) and if that metadata is missing it will next try byte range requests to grab just the metadata of the wheel, and so on. pip was learning some of these tricks in recent releases too.
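To make the byte-range trick concrete, here is a rough sketch of why it works: a wheel is a zip file, and a zip's directory lives at the end, so a reader only has to fetch a few small ranges to pull out METADATA. This is a simulation, not uv's or pip's actual code. The `RangeReader` class, the fake wheel, and all names are made up for illustration, and slicing a bytes object stands in for what would really be HTTP requests with a `Range` header.

```python
import io
import zipfile


def build_fake_wheel() -> bytes:
    """Build an in-memory zip laid out like a wheel (all names are made up)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        # A large payload standing in for compiled extension modules.
        zf.writestr("demo/big_blob.bin", b"\0" * 1_000_000)
        zf.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n"
            "Requires-Dist: requests>=2\n",
        )
    return buf.getvalue()


class RangeReader(io.RawIOBase):
    """Hypothetical file-like object that 'fetches' byte ranges on demand.

    A real client would turn each read() into an HTTP request with a
    `Range: bytes=start-end` header. Here we slice a bytes object and
    count how many bytes would have been transferred.
    """

    def __init__(self, remote: bytes) -> None:
        super().__init__()
        self._remote = remote
        self._pos = 0
        self.bytes_fetched = 0

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def tell(self) -> int:
        return self._pos

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        else:  # io.SEEK_END
            self._pos = len(self._remote) + offset
        return self._pos

    def read(self, size: int = -1) -> bytes:
        end = len(self._remote) if size is None or size < 0 else self._pos + size
        chunk = self._remote[self._pos:end]
        self._pos += len(chunk)
        self.bytes_fetched += len(chunk)
        return chunk


def read_wheel_metadata(reader: RangeReader) -> str:
    """Read only the METADATA member; zipfile seeks straight to what it needs."""
    with zipfile.ZipFile(reader) as zf:
        name = next(n for n in zf.namelist() if n.endswith(".dist-info/METADATA"))
        return zf.read(name).decode("utf-8")


wheel = build_fake_wheel()
reader = RangeReader(wheel)
metadata = read_wheel_metadata(reader)
print(metadata)
print(f"fetched {reader.bytes_fetched} of {len(wheel)} bytes")
```

The point of the counter is that only the zip directory and the METADATA member get read, not the megabyte of payload. It also shows why a server without accept-ranges support forces the client back to downloading the whole file.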
One problem we have is that support for any repository features beyond PEP-503 (the 'simple' HTML index) is limited or entirely missing in every repo implementation except warehouse - the software that powers PyPI. So if you use Artifactory, AWS CodeArtifact, Sonatype Nexus, etc, because you are running an internal repository, PEP-658 & PEP-691 support will be missing, and uv runs slower; you may not even have accept-ranges support. (and if you want Dependabot, you need to have your repository implement parts of the 'warehouse json api' - https://warehouse.pypa.io/api-reference/json.html - for it to understand your internal packages)
I've been playing with https://github.com/simple-repository/simple-repository-serve... as a proxy to try to make our internal servers suck less; it's a very small codebase and easy to change. Its internal caching implementation isn't great, so I wrapped nginx round it too, caching aggressively and using stale-while-revalidate to reduce round trips. It made our Artifactory less painful to use, even with pip.
pip will not do that. It will attempt to use the latest version allowed by the user's requirements; only if there is a conflict between two packages will it backtrack to older versions of a package. uv does exactly the same.
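As a toy illustration of "newest first, backtrack only on conflict", here is a minimal sketch. It is not pip's or uv's real resolver, which are far more sophisticated. The `INDEX` data, the package names, and the `resolve` function are all hypothetical, and version constraints are simplified to explicit lists of allowed versions.

```python
from typing import Dict, List, Optional, Tuple

Constraint = Tuple[str, List[int]]

# Hypothetical index: package -> {version: {dependency: allowed versions}}
INDEX: Dict[str, Dict[int, Dict[str, List[int]]]] = {
    "app": {1: {"a": [1, 2, 3], "b": [1, 2]}},
    "a": {3: {"c": [2]}, 2: {"c": [1]}, 1: {}},
    "b": {2: {"c": [1]}, 1: {"c": [1]}},
    "c": {2: {}, 1: {}},
}


def resolve(
    pending: List[Constraint],
    pinned: Dict[str, int],
    attempts: List[Tuple[str, int]],
) -> Optional[Dict[str, int]]:
    """Return a consistent set of pins, or None if this branch conflicts."""
    if not pending:
        return pinned
    (name, allowed), rest = pending[0], pending[1:]
    if name in pinned:
        # Already chosen: the pin must satisfy this constraint too.
        if pinned[name] in allowed:
            return resolve(rest, pinned, attempts)
        return None  # conflict, so the caller backtracks
    # Try the newest allowed version first; older ones only after a conflict.
    for version in sorted(allowed, reverse=True):
        if version not in INDEX[name]:
            continue
        attempts.append((name, version))  # record which versions got inspected
        deps = list(INDEX[name][version].items())
        result = resolve(rest + deps, {**pinned, name: version}, attempts)
        if result is not None:
            return result
    return None


attempts: List[Tuple[str, int]] = []
solution = resolve([("app", [1])], {}, attempts)
print(solution)
print(attempts)
```

In this made-up index, `a` 3 needs `c` 2 while every `b` needs `c` 1, so the resolver has to step back to `a` 2. It never looks at `a` 1, because nothing forced it that far back.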
Further, if a package index supports PEP 658 metadata, pip will use that to resolve and not download the entire wheel.
uv does the same but adds extra optimizations: both clever ones that pip should probably adopt, and ones that involve assumptions which don't strictly comply with the standards, which pip should probably not adopt.
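For anyone wondering what PEP 658 support looks like on the wire, here is a small sketch. Under PEP 658, an index page marks a file with a `data-dist-info-metadata` attribute (renamed to `data-core-metadata` by PEP 714) and serves the metadata at the file's URL with `.metadata` appended. The sample page, the `files.example.invalid` host, and the `MetadataLinkParser` class are made up for illustration; this is not how pip or uv parse index pages.

```python
from html.parser import HTMLParser
from typing import Dict, Optional
from urllib.parse import urldefrag

# Made-up simple-index page: one wheel with metadata advertised, one sdist without.
SAMPLE_PAGE = """
<a href="https://files.example.invalid/demo-1.0-py3-none-any.whl#sha256=abc"
   data-core-metadata="sha256=def">demo-1.0-py3-none-any.whl</a>
<a href="https://files.example.invalid/demo-0.9.tar.gz#sha256=123">demo-0.9.tar.gz</a>
"""


class MetadataLinkParser(HTMLParser):
    """Hypothetical helper: map each file URL to its metadata URL, if advertised."""

    def __init__(self) -> None:
        super().__init__()
        self.metadata_urls: Dict[str, Optional[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # Presence of either attribute name signals a separate metadata file.
        advertised = (
            "data-core-metadata" in attr or "data-dist-info-metadata" in attr
        )
        file_url, _fragment = urldefrag(href)  # drop the "#sha256=..." part
        self.metadata_urls[file_url] = file_url + ".metadata" if advertised else None


parser = MetadataLinkParser()
parser.feed(SAMPLE_PAGE)
for file_url, metadata_url in parser.metadata_urls.items():
    print(file_url, "->", metadata_url)
```

When the attribute is missing, as with the sdist here or with an index that never implemented PEP 658, the client has to fall back to range requests or to downloading the file.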
I’m only a casual Python user. But wtf was it doing and why did it take so long? That’s bonkers.