The Python Package Cache (blog.replit.com)
39 points by dstowell 1913 days ago
5 comments

This may be a question about the existing UPM rather than the new thing in this post (the cache), but it's not clear to me how this system handles versioned dependencies or reproducibility. Import statements in Python are not versioned. Does anyone know?
I also wonder how they solved the stupid but annoying problem of mapping import names to package names, e.g. "import sklearn" -> scikit-learn, "import dateutil" -> python-dateutil, "import bs4" -> beautifulsoup4, ...
pipreqs' mapping was my first guess:

https://github.com/bndr/pipreqs/blob/master/pipreqs/mapping

Although it looks like they may have expanded beyond pipreqs:

https://github.com/replit/upm/pull/4
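
To make the import-name problem above concrete, here is a minimal sketch of how such a guess could work. It is not UPM's or pipreqs' actual code: the `IMPORT_TO_PACKAGE` table and the `guess_packages` helper are hypothetical, and the table only holds the three examples from the comment above. Anything not in the table is assumed to be published under its import name, which is exactly the assumption that fails in the annoying cases.

```python
import ast
import sys

# Hypothetical, hand-written subset of an import-name -> PyPI-name table.
# Real tools ship a much larger mapping file.
IMPORT_TO_PACKAGE = {
    "sklearn": "scikit-learn",
    "dateutil": "python-dateutil",
    "bs4": "beautifulsoup4",
}

# Standard-library module names (available on Python 3.10+; empty otherwise).
STDLIB = getattr(sys, "stdlib_module_names", frozenset())


def guess_packages(source):
    """Return the set of PyPI package names guessed from import statements."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" only needs the top-level package "a".
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as "from . import x".
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    # Fall back to the import name itself when the table has no entry.
    return {IMPORT_TO_PACKAGE.get(m, m) for m in modules if m not in STDLIB}
```

For example, `guess_packages("import bs4\nimport requests")` yields `beautifulsoup4` and `requests`. The fallback to the import name is also where the typosquatting risk comes from.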

UPM's philosophy is to use a lockfile to specify dependency constraints. The first time you press run and UPM guesses which packages satisfy which import statements, those versions are put into the lockfile.
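
A rough sketch of that "guess once, then pin" behaviour, under stated assumptions: the lockfile is modelled as a plain dict, and `latest_version` is a hypothetical callable standing in for whatever the real resolver does. This is an illustration of the idea, not how UPM is implemented.

```python
def resolve(packages, lockfile, latest_version):
    """Pin each package the first time it is seen; reuse the pin afterwards.

    packages:       iterable of package names guessed from imports
    lockfile:       dict mapping package name -> pinned version (mutated in place)
    latest_version: hypothetical callable, name -> version string
    """
    for name in packages:
        if name not in lockfile:
            # First run for this package: record whatever version is chosen now.
            lockfile[name] = latest_version(name)
        # Later runs fall through here and keep the recorded version.
    return lockfile
```

Once a version is in the lockfile, a newer upstream release does not change what gets installed, which is how unversioned import statements can still lead to a reproducible environment.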
Can the user read/export the lockfile in a portable format (e.g. requirements.txt)? I love the idea of magic like this, but I'm less keen if it comes at the price of lock-in. (And feel free to point me to the docs!)
No magic, it's automating and hooking into existing open-source tools. For Python it's Poetry (https://python-poetry.org/), not requirements.txt, because UPM needs to provide strong guarantees on reproducibility -- otherwise things like content-addressable caching wouldn't be possible. Poetry is open-source and UPM is too: https://github.com/replit/upm

Every Replit Python project can be downloaded and you'll have the spec file and the lock file so you can install the same dependencies locally.
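
As an illustration of why reproducibility matters for content-addressable caching, here is a minimal sketch. It assumes the cache key is a SHA-256 digest of the lock file's text; the `cache_key` helper is hypothetical and is not taken from UPM or the blog post.

```python
import hashlib


def cache_key(lockfile_text):
    """Derive a cache key from the exact contents of a lock file (assumed scheme)."""
    return hashlib.sha256(lockfile_text.encode("utf-8")).hexdigest()
```

Two projects with byte-identical lock files get the same key and could share one cached install. That is only safe if the lock file fully determines what gets installed, which a loosely pinned requirements.txt does not guarantee.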

This is awesome. Installing packages was already fast, so this is just icing on the cake!
Is this similar to conda, then?
Sounds like this is Linux-only?
Oh wow! They talked about OverlayFS so I wondered what they'd do on other platforms.
BRB, typosquatting everything.