We all know the .ipynb JSON format is not a great fit for Git. The Jupyter ecosystem has come a long way in the last few years. Solving this really comes down to a few tools:

- JupyterLab Git Extension[1] for local diffs (pre-commit diffs)
- nbdime[2] / nbdev[3] for resolving .ipynb git merge conflicts
- GitHub PR code reviews with ReviewNB[4]
- Alternatively, if you don't care about cell outputs, Jupytext[5] to sync .ipynb JSON to markdown

Disclaimer: I built ReviewNB. It's a completely bootstrapped business, 5 years in the making and now used by leading DS teams at Meta, AWS, NASA JPL, Airbnb, Lyft, Affirm, AMD, Microsoft & more[6] for Jupyter Notebook code reviews on GitHub / Bitbucket.

[1] https://github.com/jupyterlab/jupyterlab-git
[2] https://nbdime.readthedocs.io
[3] https://nbdev.fast.ai
[4] https://www.reviewnb.com
[5] https://github.com/mwouts/jupytext
[6] https://www.reviewnb.com/#customers

Note that markdown is only one of the formats Jupytext supports. More interestingly, you can also store your notebooks as plain Python files, whose comments are interpreted as the markdown cells of the notebook.
This is very useful, and not only for version control: if your notebooks are Python files, they can be executed easily in CI or by third parties just by launching the interpreter, without even needing JupyterLab as a dependency.
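As a rough sketch of what such a file can look like (the file name "running_mean.py" and the function are made up for illustration), here is a notebook in Jupytext's percent format, where `# %%` starts a code cell and `# %% [markdown]` starts a markdown cell written as comments:

```python
# %% [markdown]
# # Running mean
# This comment block shows up as a markdown cell when the file is
# opened as a notebook through Jupytext.

# %%
def running_mean(values):
    """Return the cumulative mean after each element of `values`."""
    means = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


# %%
# Hypothetical demo cell: in a notebook this is where the output appears.
result = running_mean([2, 4, 6])
print(result)  # prints [2.0, 3.0, 4.0]
```

Since the cell markers are ordinary comments, `python running_mean.py` runs the file top to bottom with nothing but the interpreter installed.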
With some care, you can craft a single Python file "foo.py" that serves at the same time as
1. an executable command-line program (that happens to be written in Python)
2. an importable Python module
3. a Jupyter notebook (opening it requires the Jupytext extension for Jupyter)
4. the documentation with auto-generated figures, convertible to HTML or PDF using "jupyter nbconvert --execute"
5. a regular .ipynb, if for some reason you want to distribute the outputs in a re-executable format
For small, simple projects that showcase, describe and illustrate a self-contained algorithm, we have found this structure invaluable.
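To make the list above concrete, a minimal sketch of such a "foo.py" could look like this. The function, the `--numbers` flag and its defaults are all hypothetical; the only real ingredients are the percent-format cell markers and the usual `if __name__ == "__main__"` guard. Using `parse_known_args` is just one way to tolerate the extra arguments a Jupyter kernel leaves in `sys.argv`; it is an assumption about how you want to handle that case, not the only option.

```python
# %% [markdown]
# # foo: cumulative sums
# Hypothetical example of one file used as a script, a module and a notebook.

# %%
import argparse
from itertools import accumulate


def cumulative_sum(values):
    """Return the running totals of `values` as a list."""
    return list(accumulate(values))


def main(argv=None):
    """Command-line entry point; argv=None means "read sys.argv"."""
    parser = argparse.ArgumentParser(
        description="Print cumulative sums.", allow_abbrev=False
    )
    parser.add_argument(
        "--numbers", nargs="+", type=float, default=[1.0, 2.0, 3.0]
    )
    # parse_known_args collects unrelated arguments (such as the ones a
    # Jupyter kernel is started with) instead of exiting with an error.
    args, _unknown = parser.parse_known_args(argv)
    totals = cumulative_sum(args.numbers)
    print(totals)
    return totals


# %%
# True for "python foo.py" and inside a notebook, false for "import foo".
if __name__ == "__main__":
    main()
```

Here `python foo.py --numbers 1 2 3` covers use 1, `from foo import cumulative_sum` covers use 2, and opening the file through Jupytext covers use 3, where the last cell runs `main()` with the defaults.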