by spiralk 643 days ago
I dislike how Jupyter notebooks have become normalized. Yes, the interactive execution and visuals are nice for more academic workflows where the priority is quick results over code organization. However, when it comes to sharing code with others for the sake of doing reproducible science, jupyter notebooks cause more trouble than they are worth. Using cell based execution with python is so elegant with '# %%' lines in regular .py files (though it requires using VSCode or fiddling with vim plugins which not all scientists want to do I suppose). No .ipynb is necessary, .py files can be version controlled and shared like normal code while still retaining the ability to use them interactively, cell by cell.

It's much easier to organize .py files into a proper python module, and then share and collaborate with others. Instead, groups will collect jumbles of slightly different versions of the same jupyter notebooks that progressively become more complex and less manageable over time. It's not a hypothetical unfortunately, I've seen this happen at major university labs. I'm not blaming anyone because I understand -- the funding is there to do science and not rewrite code to build convenient software libraries. Yet, I can't help but wish jupyter notebooks could be removed from academic workflows.
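
For readers who have not seen the '# %%' convention, here is a minimal sketch of what such a file can look like. The data and variable names are made up for illustration. Editors that support the convention treat each '# %%' line as a cell boundary, and the same file still runs top to bottom as an ordinary script.

```python
# %% [markdown]
# # Toy analysis
# A "# %% [markdown]" cell holds prose, written as ordinary comments.

# %%
# Cell 1: imports and (made-up) input data.
import statistics

samples = [2.0, 4.0, 4.0, 5.0]

# %%
# Cell 2: the computation, re-runnable without re-running cell 1.
mean = statistics.mean(samples)
spread = statistics.pstdev(samples)  # population standard deviation

# %%
# Cell 3: report the result.
print(f"mean={mean:.2f} spread={spread:.2f}")
```

Because the markers are plain comments, the file diffs and merges in git like any other Python source. Outputs are not stored in it, which is the trade-off debated in the replies.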

7 comments

I think there's a fundamental misunderstanding and mismatch between what you want to do, and what Jupyter notebooks are for. The distinction is between code versus the results.

If the code is the end product, sure, use a python package.

But does your .py with `# %%` in it also store the outputs? If not, why even bring this up? A .py output without the plots tied to the code doesn't meet the basic use case.

If the end product is the plot, I want to see how that plot was generated. And a Jupyter notebook is a much much better artifact than a Python package, unless that Python package hard codes the inputs and execution path like a notebook would.

Over the past 20 years of my career I have run into this divergence of use cases a lot. Software engineers seem to not understand the end goals, how it should be performed, and the learnings of the practitioners that have been generating results for a long time. It's hard to protect data scientists from these inflexible software engineers that see "aha that's code, I know this!" without bothering to understand the actual use case at hand.

Not having the outputs tied into the code is actually preferable if the ultimate goal is reproducible science. Code should be code, documentation should be documentation, and outputs should be outputs. Having multiple copies of important code in non-version controlled files is not a good practice. Having documentation dispersed with questionable organization in unsearchable files is not a good practice. Having outputs without run information and timestamps is not a good practice. It's easy to fall into those traps with Jupyter notebooks. It might speed up initial setup and experimentation, but I've been working in academic labs long enough to see the downstream effects.
Having the outputs recorded alongside specific versions of the code can actually be very valuable.

But since most uses of Jupyter notebooks I've seen don't version control them much at all, it's often not as useful in practice.

Yeah, jupyter notebooks don't guarantee any specifics about versions of code used for that output. In the real world you can expect everyone in the lab including all of the students to be editing jupyter notebooks at whim. The only way to do this would be to have proper version control of your code, a snapshot of the environment, and to log all this along with the run that generated the output. This is possible with regular python using git, proper log files, etc. Jupyter notebooks seem like an extra roadblock.
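
As a rough illustration of logging the code version and environment alongside a run, here is a minimal sketch using only the standard library. The function names `git_commit`, `run_metadata` and `save_run_log`, and the JSON layout, are hypothetical choices for this example and not an established tool. It assumes the script is run from inside a git checkout and falls back to `None` when git is unavailable.

```python
import json
import platform
import subprocess
import sys
from datetime import datetime, timezone
from importlib.metadata import distributions


def git_commit():
    """Return the current commit hash, or None if git or a repo is missing."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def run_metadata():
    """Collect a snapshot of when, where and with what this run happened."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "git_commit": git_commit(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "argv": list(sys.argv),
        # Installed distributions, as "name==version" strings.
        "packages": sorted(
            f"{dist.metadata['Name']}=={dist.version}" for dist in distributions()
        ),
    }


def save_run_log(path):
    """Write the metadata as JSON next to the outputs of the run."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(run_metadata(), handle, indent=2)
```

Calling `save_run_log("run_info.json")` at the start of a script would leave a record next to whatever the script produces. Note that this sketch does not detect uncommitted changes in the working tree, so a stricter version would need to record that as well.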
Ooh. That's a nice utility function that I will write soon. We tend to look at requirements as something we hope the package manager gets right, and then we ignore at runtime, but there are a bunch of errors we could avoid if we verified at runtime. Sometimes when writing a library you have to have different code paths for different versions.

Something like `if check_versions(pandas__gt="2.0.0", pandas__lt="3.0.0"):`
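
One way such a helper could look, as a sketch only: `check_versions` is the hypothetical name from the comment above, not an existing library function, and the `package__op` keyword scheme and the `_lookup` test hook are assumptions made here. It uses `importlib.metadata.version` from the standard library and a deliberately naive parser that only looks at the leading numeric part of a version string.

```python
import operator
import re
from importlib.metadata import PackageNotFoundError
from importlib.metadata import version as installed_version

# Supported suffixes for keyword names such as pandas__gt.
_OPS = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
}


def _parse(text):
    """Turn '2.1.4' into (2, 1, 4); any suffix such as 'rc1' is ignored."""
    match = re.match(r"\d+(\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def _pad(a, b):
    """Pad the shorter tuple with zeros so (2, 1) compares equal to (2, 1, 0)."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def check_versions(*, _lookup=installed_version, **constraints):
    """Return True if every constraint holds for the installed packages."""
    for key, wanted in constraints.items():
        package, sep, op_name = key.rpartition("__")
        if not sep or not package or op_name not in _OPS:
            raise ValueError(f"expected <package>__<op>, got {key!r}")
        try:
            have = _parse(_lookup(package))
        except PackageNotFoundError:
            return False  # a missing package fails every constraint
        have, want = _pad(have, _parse(wanted))
        if not _OPS[op_name](have, want):
            return False
    return True
```

Pre-release suffixes are ignored by this parser, so a library that cares about them would probably want a real version parser instead. Package names containing hyphens also cannot be written as keyword arguments, which this sketch leaves unsolved.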

Often the notebook was run on a beefy server with GPUs attached, potentially taking hours/days of compute. It would be senseless to force every viewer of a Jupyter notebook to have the same setup and time just to read through the results and output.
> Not having the outputs tied into the code is actually preferable if the ultimate goal is reproducible science.

What a strange thing to assert, especially as a general overarching truth.

The best reports I have ever seen have matched code and output in the same file. There's never a question of what code generated a plot or a table with a notebook.

With .py files and separate outputs there's far more chance of unreproducible science, it's far messier, and for someone who doesn't appear to respect the organizational capabilities of academic labs, you are condemning them to far more poorly organized outputs.

> Having multiple copies of code

That doesn't have anything to do with notebooks. It's as silly as saying that a Python package is a poor idea because you saw somebody repeat code across multiple places.

> non-version controlled files

Notebooks are no less version controllable than .py files.

> outputs with timestamps and run information

Jupyter notebooks are perfect for this, far superior to a directory of cryptically named outputs that need to be strung together in some order

> documentation dispersed with questionable organization

Using separate Python files rather than a notebook means that documentation can never be where it needs to be: next to the output. This is one of the ways that Python files are strictly inferior for generating results.

There are roughly two modes for notebooks: exploration with a REPL, and well-documented reports. The best scientific reports I have ever seen are notebooks (or R Markdown output) that are the full report text plus code plus figures.

> someone who doesn't appear to respect the organizational capabilities of academic labs, you are condemning them to far more poorly organized outputs.

This is not a great way to make your argument, though you are not the only one here making a personal judgement without even knowing about my background. These are all issues I have seen firsthand. With most academic labs being funding limited, the "organizational capabilities of academic labs" seems irrelevant to me. In our field, no one is getting grants to manage code of any kind, .py or .ipynb, and I suspect it's the same at most university labs. It's effort wasted that ultimately does take time away from the actual research that's fundable and publishable. As someone who has been responsible for wrangling people's notebooks in the past, it's enough of a problem that I would encourage removing all .ipynb files.

> That doesn't have anything to do with notebooks. It's as silly as saying that a Python package is a poor idea because you saw somebody repeat code across multiple places.

Human factors make jupyter notebooks lead to the problems I have listed. The issues are most apparent with large groups and over long periods of time. Python and other programming languages already solved most of these problems with git. There isn't a tool that is as elegant and scales from individuals to massive organizations.

> There are roughly two modes for notebooks: exploration with a REPL, and well-documented reports. The best scientific reports I have ever seen are notebooks (or R Markdown output) that are the full report text plus code plus figures.

The REPL functionality is handled by .py cell execution, as I’ve mentioned in other comments. It baffles me how the minimal effort saved by not using separate tools -- one for code, one for documentation -- justifies the issues it introduces.

I use jupyter notebooks at work, not so much for academic stuff, but often to help build and show a narrative to folks, including executives (where I have any even remotely technical leadership). It's great for narrative stuff, especially being able to emit PDFs and what not. I've been in a number of meetings where I've got the code up in Jupyter, sharing the screen, and leadership want us to tweak numbers and see the consequences.

It's great for exploring code and data too, especially situations where I'm really trying to feel my way towards a solution. I get to merrily intermingle rich text narrative and code so I explain how I got to where I got to and can walk people through it (I did that with some experimenting with an SMT solver several months ago, meant that people that had no experience with an SMT solver could understand the model I built).

I'd never use it to share code though. If we get to that stage, it's time to export from jupyter (which it natively supports), and then tidy up the code and productionise it. There's no way jupyter should be the deployed thing.

That seems like a reasonable way to use jupyter notebooks since you have an actual plan to move beyond it when necessary. My issue is mostly with the way it's misused, often by people who are arguably at the top of the field.
We've seen how this ends because mathematicians have been sharing Mathematica notebooks forever. It's not pretty.

Like you I see the appeal, but they're a usability nightmare beyond a few lines. Part of the problem, I think, is that you can't really incrementally improve them. Who wants to refactor a notebook and deal with all the cell dependency breakage?

So they start off okay and then slowly become terrible until they're either irreplaceable or too terrible to work with and a new one is started.

The same problem exists with spreadsheets. Should we get rid of Excel (the single tool that literally runs half the world), and start manually writing markdown tables in text files?

The tool and the tool maker are supposed to serve the user. The user is not supposed to conform to the whims of the tool maker.

Since 94% of business spreadsheets contain errors [0], probably yes: we should get rid of spreadsheets or significantly improve them.

Probably the solution is that things like Jupyter notebooks and spreadsheets should be views into some better source of truth rather than the source of truth themselves.

[0] https://phys.org/news/2024-08-business-spreadsheets-critical.... I remember a similar figure from studies a decade or so ago.

The funny thing is I've seen folks try to deploy software to get rid of spreadsheets. It always ends badly to terribly.

Spreadsheets are the nonprogrammer's programming / modeling tool in business.

It does presentation, data filtering / sorting, modeling and more.

No AI needed (and you can now plug AI in in some cases).

Sure, but here's the basic problem I think:

Suppose you have some formula that computes a financial metric for your company. Someone you've shared it with drunkenly fat-fingers the formula 3/4 of the way down a long row, and that causes all entries below it to recompute with the wrong formula. Unless the change is really drastic, you may never know it happened.

And this sort of mistake -- basically a typo or a bad mouse movement -- happens daily in every company in the world in some spreadsheet. Often people will notice the mistake, but not with probability 1.

Software engineers have mechanisms to guard against some of these mistakes, and even we have a hard time getting people to take code review or tests seriously. What is the guard in the spreadsheet world?

Have you seen the Excel disaster on genomics? Enough to ban Excel for anything serious except for accounting.
I had not heard of that, but it appears to be this

https://www.nature.com/articles/d41586-021-02211-4

Another issue is that jupyter, pandas, and polars don't take displaying tabular data seriously. Just have a better default table display widget. Look at ipydatagrid, perspective, or buckaroo (my project) for examples of how it could be done better.
I don’t disagree with anything you said. Jupytext can be a good tool to bridge some of the gap, where you pair an ipynb to a py script and can then commit the py only (git-ignore all ipynb for your collaborators.)

Also, while many practices out there are questionable, in alternative scenarios where ipynb doesn’t exist, they might have been using something like matlab for example. E.g., in my field (physics), oftentimes there are experimentalists doing some coding. Ipynb can be very enabling for them.

I think a piece of research should be broken down and worked on by multiple people to improve the state of the project. Some scientists might be passing you the initial prototype in the form of a notebook, and some others should be refactoring it into something more suitable for deployment and archival purposes. Properly funding these roles is important; it is lacking but improving (e.g. hiring RSEs.)

In my field, the most prominent case where ipynb files are shared a lot is training. It’s a great application, as that becomes literate programming. In this sense notebooks are highly underused, as literate programming still hasn’t gone mainstream.

I've looked into Jupytext, but ultimately decided to go with pure python. Most of the practical functionality can be replicated, but I do admit there isn't an easy single install tool or guide to replace notebooks at the moment.

I think the notebooks are a fine learning tool to introduce people to programming initially, but I'm afraid it doesn't allow for growth beyond a certain level. You have a good point about funding for those software roles. Perhaps this may not be as big of a concern if there were more software talent in these labs to handle the issues that arise.

In an ideal world where we control everything and/or don’t need to collaborate with others, whatever tooling one uses is actually not that important (and each can choose what best fits their needs.) So Jupyter+Jupytext is useful in the context of collaboration, where you can’t control your collaborators but want something from them.

While in an ideal world scientists who write software should write it professionally, the same goes for everything else they do, including the math and stats used in their research, writing, typesetting, and generating publication-quality visualizations… That rarely happens because of how the academic world is financed, and the incentives associated with it. I can certainly complain about that all day, but in short: a researcher at a research university, especially one on the tenure track in the US, will not land such a position, let alone get tenure, unless they focus their scarce time on maximizing their “research output” (publications, grants, etc.), which software engineering is not part of. (Sorry, sentence too complicated.)

In the end, usability wins. In a Jupyter notebook, you have a much better idea of state between cells, you can iterate much faster, you can write documentation in readable markdown. Often, Jupiter notebooks are more like interactive markdown than they are like python scripts.
> In a Jupyter notebook... > Often, Jupiter notebooks...

Every time I search my Slack, I have to run two searches because DS can't agree on how to spell the damn thing.

The form factor of Jupyter notebooks seems to fit well with people's workflows though. Looks like you just wish the internals of Jupyter were better architected.
Imo, the better architected .ipynb is simply .py with '# %%' blocks. It does almost everything a .ipynb can do with the right VSCode extensions. Even interactive visualizations can be sent to a browser window or saved to disk with plotly. Though I do wish '# %%' cell based execution was accessible to more people.

There isn't a single install tool that "just works" for this at the moment. If editors came with more robust support for it by default, I think the notebook format wouldn't be needed at that point and people could use regular python and interactive cell based python more interchangeably. I've seen important code get buried under collections of jupyter notebooks across different users so I have a good reason for this. Notebooks simply don't scale beyond a certain complexity.

The two can coexist. Store libraries in python code that is versioned and deployed properly. Notebooks with their data ingest, code, then output should read cleanly. Making the ingest and code readable is the job of library writers. A clean and elegantly coded notebook with inline outputs is a substantively different experience than searching all over the place for the correct browser window that corresponds to the output from a given piece of code.