Hacker News
by strokirk 629 days ago
CI providers should definitely start proxying PyPI with their own cache.
3 comments

I wanted to spin up a mirror locally to do simple caching for Docker builds, but the tooling was lacking. There was a way to do a direct mirror of PyPI locally, but no way of adding custom indices on top of it.
I think Sonatype Nexus [1] can do that relatively easily. I don't know if the OSS version is enough, but I think most people and projects should be fine.

[1]: https://www.sonatype.com/products/sonatype-nexus-oss-downloa...

We've used Nexus OSS just the way you describe and it worked great.

We simply set it up as a kind of "passthrough cache": if it didn't have the package, it fetched it from PyPI and stored it to be used the next time someone wanted to install the same package.

Apart from being nice to PyPI, we also got a bit of a decrease in CI runtime, because it fetched packages from the local cache 99% of the time.
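
For anyone unfamiliar with the idea, here is a minimal Python sketch of the "passthrough cache" behaviour described above. It is not how Nexus implements it; `PassthroughCache`, `fetch_upstream`, and the demo filename are hypothetical names made up for illustration, and the upstream fetch is a stand-in function rather than a real request to PyPI.

```python
import tempfile
from pathlib import Path
from typing import Callable


class PassthroughCache:
    """Serve files from a local directory, fetching from upstream on a miss."""

    def __init__(self, cache_dir, fetch_upstream: Callable[[str], bytes]):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch_upstream = fetch_upstream  # hypothetical upstream fetcher
        self.hits = 0
        self.misses = 0

    def get(self, filename: str) -> bytes:
        # Use only the final path component so a request cannot escape cache_dir.
        path = self.cache_dir / Path(filename).name
        if path.exists():
            # Cache hit: upstream is never contacted.
            self.hits += 1
            return path.read_bytes()
        # Cache miss: fetch once from upstream, then store for next time.
        self.misses += 1
        data = self.fetch_upstream(filename)
        path.write_bytes(data)
        return data


if __name__ == "__main__":
    def fake_upstream(name: str) -> bytes:
        # Stand-in for a download from the real index.
        return b"contents of " + name.encode()

    with tempfile.TemporaryDirectory() as tmp:
        cache = PassthroughCache(tmp, fake_upstream)
        cache.get("demo-1.0-py3-none-any.whl")  # miss, fetched and stored
        cache.get("demo-1.0-py3-none-any.whl")  # hit, served locally
        print(cache.hits, cache.misses)
```

A real proxy also has to deal with index pages, expiry, and concurrent requests, which is the part that tools like Nexus take care of.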

DevPi might be your answer, I think. A couple of years ago I set it up as a proxy, plus hosting my own packages locally.
I'll take a look. I think it is something I looked at and had some issues with, but it has been a couple of years and the only thing I can remember is bandersnatch.
Artifactory?
Pip already does its own caching, but it's maddeningly difficult to even locate and extract files, let alone set up anything usable. It's also needlessly difficult to make Pip just use such a cache directly - for example, if you haven't pinned a version, it will automatically check PyPI to figure out the latest version even if you have cached wheels already.
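
As a rough illustration of the two points above, here is a small Python sketch. It assumes pip 20.1 or newer (for the `pip cache dir` subcommand) and a hypothetical `wheel_dir` that was filled beforehand, for example with `pip download -d`. Note that this is a separate directory of wheels, not pip's internal cache, which is not laid out so that `--find-links` can use it directly. The function names are made up for illustration.

```python
import subprocess
import sys


def pip_cache_dir() -> str:
    """Ask pip where its cache lives (needs pip 20.1+ for `pip cache`)."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "cache", "dir"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def offline_install_command(requirement: str, wheel_dir: str) -> list[str]:
    """Build a pip command that resolves only from wheel_dir, never PyPI."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",               # do not contact any package index
        "--find-links", wheel_dir,  # look for distributions in this directory
        requirement,
    ]


if __name__ == "__main__":
    print(pip_cache_dir())
    # "/tmp/wheels" is a placeholder path for a pre-populated wheel directory.
    print(" ".join(offline_install_command("requests", "/tmp/wheels")))
```

With `--no-index`, an unpinned requirement is resolved against whatever versions are in the directory, so pip does not go to PyPI to look for a newer release.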

I don't know (I lack the experience), but I assume that container systems get in the way of Pip finding its normal cache, too. (If they're emulating a filesystem or something, then the cache is in the wrong filesystem unless you reuse the container.)

It might honestly be easier and cheaper for those CI providers to just pay PyPI's bandwidth costs.