by marcinzm 2511 days ago
That'd make sense. You'd need to pin all transitive dependencies as well and cache the venv. Not sure if you'd need to find older versions of new direct dependencies to avoid conflicts. For example, you run the sheet in 2015 and then again in 2019 but with a new import. The latest version of that new import has a transitive dependency that you already pinned, but it requires a different version than the one you pinned. I wonder if you can embed the requirements.txt inside the sheet itself.
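
On embedding the requirements in the sheet: an .ipynb file is JSON with a top-level `metadata` object, so the pins could ride along there. A minimal sketch, assuming a made-up `pinned_requirements` key that Jupyter itself does not read; the helper names and the sample pins are hypothetical too.

```python
import json
from typing import List

# Hypothetical metadata key; nothing in Jupyter looks at it.
PIN_KEY = "pinned_requirements"

def embed_pins(notebook_json: str, pins: List[str]) -> str:
    """Return the notebook JSON with the pinned requirement lines stored in its metadata."""
    nb = json.loads(notebook_json)
    nb.setdefault("metadata", {})[PIN_KEY] = sorted(pins)
    return json.dumps(nb, indent=1)

def read_pins(notebook_json: str) -> List[str]:
    """Return the stored pins, or an empty list if the notebook has none."""
    nb = json.loads(notebook_json)
    return nb.get("metadata", {}).get(PIN_KEY, [])

# An empty notebook skeleton, just enough to round-trip.
empty_nb = json.dumps({"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5})
pinned_nb = embed_pins(empty_nb, ["pandas==0.24.2", "numpy==1.16.4"])
```

Something would still have to read the pins back with `read_pins` and install them before the kernel starts.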

Only issue is that it'd be really space inefficient: my DS venvs clock in at 400+ MB each, so having one per sheet will probably quickly become unusable. Which is why I thought of some sort of smart system-wide caching akin to Maven/Ivy. But I'd forgotten how complicated Python dependencies (binaries, C code, etc.) were and how little API support pip had.

1 comment

> You'd need to pin all transitive dependencies as well and cache the venv.

I think to do this stuff, you need hooks in Jupyter to set up the venv before it runs the kernel and tear it down afterwards. (And generally Jupyter would want to clean up unused venvs to mitigate the teardown hook not firing.)
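
Roughly what such hooks might look like, as a sketch only: this is plain standard-library Python, not an actual Jupyter API, and `kernel_venv`, `sweep_stale`, and the `nbvenv-` prefix are names I made up. The context manager stands in for the setup/teardown pair, and the sweep covers the case where teardown never ran.

```python
import shutil
import tempfile
import venv
from contextlib import contextmanager
from pathlib import Path

@contextmanager
def kernel_venv(prefix="nbvenv-"):
    """Create a throwaway venv, yield its path, and delete it afterwards."""
    env_dir = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        # with_pip=False keeps creation fast; a real hook would
        # install the pinned packages at this point.
        venv.create(env_dir, with_pip=False)
        yield env_dir
    finally:
        # Teardown runs even if the kernel session raises.
        shutil.rmtree(env_dir, ignore_errors=True)

def sweep_stale(root, prefix="nbvenv-"):
    """Remove leftover venv directories under root, e.g. after a crash skipped teardown."""
    removed = []
    for path in Path(root).glob(prefix + "*"):
        if path.is_dir():
            shutil.rmtree(path, ignore_errors=True)
            removed.append(path.name)
    return removed
```

The kernel launch would go inside `with kernel_venv() as env_dir:`, and `sweep_stale(tempfile.gettempdir())` could run at server startup.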

> Only issue is that it'd be really space inefficient

A venv is overkill. You can just run the kernel in a regular directory, and `pip install --target kernel_dir foo-bar-pkg` to put packages directly in it. As long as that directory is on Python's import path (`sys.path`), third-party libraries will work. This technique is used in the serverless project[2] to bundle dependencies for use on AWS Lambda.
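
A small sketch of that, assuming the kernel process is the one that needs to import the packages. `pip_target_command` and `make_importable` are hypothetical helper names; `--target` is the real pip flag from the command above.

```python
import sys
from pathlib import Path

def pip_target_command(kernel_dir, packages):
    """Build the pip invocation that installs packages straight into kernel_dir."""
    return [sys.executable, "-m", "pip", "install",
            "--target", str(kernel_dir), *packages]

def make_importable(kernel_dir):
    """Put kernel_dir at the front of sys.path so its packages are found on import."""
    path = str(Path(kernel_dir).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    return path

# The command is only built here, not run; foo-bar-pkg is the placeholder name from above.
cmd = pip_target_command("kernel_dir", ["foo-bar-pkg"])
```

Running it would be `subprocess.run(cmd, check=True)`, followed by `make_importable("kernel_dir")` inside the kernel. Starting the kernel with `kernel_dir` as its working directory has much the same effect, since the current directory is usually on the import path.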

> Not sure if you'd need to find older versions of new direct dependencies to avoid conflicts.

Curation is a solution to this. Stackage[1] is popular in the Haskell community; they build a consistent version set of everything every night and curate stable releases periodically.

With curation, a date and a set of top-level packages is enough to pin your dependencies.
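
To make that concrete, a toy sketch: the snapshot table below is invented for illustration (it is not Stackage data or any real index), and `pin` is a hypothetical helper. The idea is just "take the newest curated set on or before the date, then look up each top-level package in it".

```python
import bisect
from datetime import date

# Made-up snapshots: each maps package names to versions curated to work together.
SNAPSHOTS = {
    date(2015, 6, 1): {"numpy": "1.9.2", "pandas": "0.16.1"},
    date(2019, 6, 1): {"numpy": "1.16.4", "pandas": "0.24.2"},
}

def pin(as_of, packages):
    """Resolve top-level package names against the latest snapshot on or before as_of."""
    days = sorted(SNAPSHOTS)
    idx = bisect.bisect_right(days, as_of) - 1
    if idx < 0:
        raise LookupError(f"no snapshot on or before {as_of}")
    snapshot = SNAPSHOTS[days[idx]]
    return [f"{name}=={snapshot[name]}" for name in packages]

pins_2016 = pin(date(2016, 1, 1), ["pandas", "numpy"])
```

In a real curated set the snapshot would also cover every transitive dependency, which is why the date alone is enough to fix their versions.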

[1]: https://github.com/commercialhaskell/stackage#frequently-ask...
[2]: https://github.com/UnitedIncome/serverless-python-requiremen...