|
|
|
|
|
by BiteCode_dev
533 days ago
|
|
At this stage, notebooks should be a GUI powered docker-like image format you download and then click to run. Non programmers using notebooks are usually the least qualified to make them reproducible, so better just ship the whole thing. |
|
Docker Desktop and Podman Desktop are GUIs for running containers on Windows, Mac, and Linux.
containers become out of date quickly.
If programmer or non-programmer notebook authors do not keep versions specified in a requirements.txt upgraded, what will notify other users that they are installing old versions of software?
Are there CVEs in any of the software listed in the SBOM for a container?
There should be tests to run after upgrading notebook and notebook server dependencies.
Notes re: notebooks, reproducibility, and something better than MHTML/ZIP; https://news.ycombinator.com/item?id=35896192 , https://news.ycombinator.com/item?id=35810320
From a JEP proposing "Markdown based notebooks" https://github.com/jupyter/enhancement-proposals/pull/103#is... :
> Any new package format must support cryptographic signatures and ideally WoT identity
Any new package format for jupyter must support multiple languages, because polyglot notebooks may require multiple jupyter kernels.
Existing methods for packaging notebooks as containers and/or as WASM: jupyter-docker-stacks, repo2docker / repo2podman, jupyterlite, container2wasm
You can sign and upload a container image built with repo2docker to any OCI image registry like Docker, Quay, GitHub, GitLab, Gitea; but because Jupyter runs a command execution shell on a TCP port, users should upgrade jupyter to limit the potential for remote exploitation of security vulnerabilities.
> Non programmers using notebooks are usually the least qualified to make them reproducible, so better just ship the whole thing.
Programs should teach idempotency, testing, isolation of sources of variance, and reproducibility.
What should the UI explain to the user?
If you want your code to be more likely to run in the future, you need to add a "package" or a "package==version" string in a requirements.txt (or pyproject.toml, or an environment.yml) for each `import` statement in the code.
If you do not specify the exact versions with `package==version` or similar, when users try to install the requirements to run your notebook, they could get a newer or a different version of a package for a different operating system.
If you want to prevent MITM of package installs, you need to specify a hash for the package for this platform in the requirements.txt or similar; `package==version#sha256=adc123`.
If you want to further limit software supply chain compromise, you must check the cryptographic signatures on packages to install, and verify that you trust that key to sign that package. (This is challenging even for expert users.)
WASM containers that run jupyter but don't expose it on a TCP port may be less of a risk, but there is a performance penalty to WASM.
If you want users to be able to verify that your code runs and has the same output (is "reproducible"), you should include tests to run after upgrading notebook and notebook server dependencies.