| I work with a bunch of 'data scientists' / 'strategists' and the like who love their notebooks, but it's a pain to convert their code into an application! In particular:
* Notebooks store code and data together, which is very messy if you want to look at [only] code history in git.
* It's hard to turn a notebook into an assertive test.
* Converting a notebook function into a python module basically involves cutting and pasting from the notebook into a .py file.
These must be common issues for anyone working in this area. Are there any guides on best practices for bridging from notebooks to applications? Ideally I'd want to build a python application that's managed via git, but with some modules / functions lifted exactly from notebooks. |
The main point of friction is that the "default" format for storing notebooks is not valid, human-readable Python code, but an unreadable JSON mess. The situation would be much better if a notebook were stored as a Python file, with code cells verbatim, and markdown cells inside Python comments with appropriate line breaking. That way, you could run and edit notebooks from outside the browser, and let git track them easily. Ah, what a nice world that would be.
But this is exactly the world we already live in, thanks to jupytext!
https://github.com/mwouts/jupytext
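To make the idea concrete, here is a rough sketch of the kind of conversion described above, using only the standard library. It is not jupytext's implementation, just an illustration of the layout: code cells verbatim, markdown cells as comments, outputs dropped. The function name `notebook_to_percent` and the tiny example notebook are made up for this sketch, and it ignores things a real tool has to deal with (IPython magics, cell metadata, raw cells).

```python
import json


def notebook_to_percent(ipynb_text: str) -> str:
    """Render notebook JSON as a plain Python script with '# %%' cell markers."""
    nb = json.loads(ipynb_text)
    chunks = []
    for cell in nb.get("cells", []):
        # The notebook format stores a cell's source as a string or a list of lines.
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        lines = source.splitlines()
        if cell.get("cell_type") == "code":
            # Code cells are kept verbatim; their outputs are simply not copied.
            chunks.append("\n".join(["# %%"] + lines))
        elif cell.get("cell_type") == "markdown":
            # Markdown cells become comment lines.
            commented = ["# " + line if line else "#" for line in lines]
            chunks.append("\n".join(["# %% [markdown]"] + commented))
        # Other cell types (e.g. raw) are skipped in this sketch.
    return "\n\n".join(chunks) + "\n"


# A hypothetical two-cell notebook, including a stored output.
example = json.dumps({
    "nbformat": 4, "nbformat_minor": 5, "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Demo\n", "Add two numbers."]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "outputs": [{"output_type": "stream", "name": "stdout", "text": ["3\n"]}],
         "source": ["def add(a, b):\n", "    return a + b\n", "\n", "print(add(1, 2))"]},
    ],
})

script = notebook_to_percent(example)
print(script)
```

The result is an ordinary .py file that git can diff line by line and that runs with plain `python`. With jupytext itself, the equivalent conversion is `jupytext --to py:percent notebook.ipynb`.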