Hacker News new | ask | show | jobs
by goerz 2972 days ago
Notebooks can be reproducible, they just aren’t automatically so. It requires a little bit of effort and discipline, if reproducibility is a goal. https://www.svds.com/jupyter-notebook-best-practices-for-dat... is an excellent starting point. Personally, I use notebooks to keep a record of large computational pipelines. The key is to cache all results to disk. This allows for an iterative process where I modify the notebook, kill the kernel, and rerun everything. Only new calculations will be executed, everything previously calculated will simply be loaded from disk. In the end, I have a reproducible record of the entire project (and rerunning the notebook is fast) This kind of make-like functionality is implemented through the doit Python package (http://pydoit.org). An example workflow for this is http://clusterjob.readthedocs.io/en/latest/pydoit_pipeline.h...
1 comments

So basically you write a script instead of a notebook? If you save data on disc, are they still displayed with the rich formating of Jupyter?
Well, it also contains markdown comments (often with LaTeX formulas, so the graphical rendering is appreciated), and, most importantly, plots of the results (which are typically fast to generate, so they are not cached).