  > I do a lot of work in jupyter notebooks.

I don't think I'll have good advice for you if you want to use jupyter notebooks. Hopefully someone else will. I *hate* notebooks. I think they are great for reports or for demos (especially when teaching), but I honestly do not get how people use them in research or general programming.

I will use vim and ipython though. On a remote machine I'll use tmux (to preserve sessions), but locally I use Ghostty[0]. I can iterate through code this way without the notebook and am far less likely to get caught out by out-of-order execution. I get all the ad-hoc, persistent-memory, auto-updating-function benefits with ipython (autoreload). But vim and ipython don't require me to lose my modularity, organization, automated record keeping, and the rest.

If it works for you, keep it! I'm just saying what works for me (any switch will cause some disruption). But I do want to stress that there's tons of other options for keeping things in memory without using jupyter notebooks (pdb is also a fantastic tool!). Be careful though, because persistent memory can easily bite you in the ass too. It's easy to forget what's still in memory. I'll also add that, as the person who has maintained our lab's compute systems, I'm wildly annoyed with notebook and VSCode users leaving their workloads in memory. This is a user thing more than a tool thing, but there's a tendency here and it eats up resources that other people need. Just make sure to disconnect when you leave the desk.

  > I want to fork out and test a hypothesis in the background

But my process does greatly help with this! IMO you should be trying to run experiments in parallel. Operating in this style there's no forking. You're just launching another job. I like using a job launcher like slurm when I have multiple machines, but a simple bash script to launch is often more than good enough.
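
To make "just launching another job" concrete, here is a minimal sketch. I said bash above, but this is the same idea in Python using only the standard library. The script name `train.py`, the `--lr`/`--seed` flags, and the grid values are all hypothetical placeholders, not anything from a real project.

```python
import subprocess
import sys
from itertools import product


def build_commands(script, grid):
    """Return one argv list per combination of values in the grid."""
    keys = sorted(grid)
    commands = []
    for values in product(*(grid[k] for k in keys)):
        flags = [f"--{k}={v}" for k, v in zip(keys, values)]
        commands.append([sys.executable, script, *flags])
    return commands


def launch_all(commands):
    """Start every command as its own process, then wait for all of them."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]  # exit codes, in launch order


if __name__ == "__main__":
    # Hypothetical sweep: every run is an independent job, nothing is forked.
    grid = {"lr": [1e-3, 1e-4], "seed": [0, 1]}
    for cmd in build_commands("train.py", grid):
        print(" ".join(cmd))
```

Each run is a separate process with its own arguments, so handing the same command lists to a scheduler like slurm later is mostly a matter of changing how they get submitted.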

The point is to not fork. You should clone. With changes, I suggest using git branches. But if your code is written to be modular and flexible it is often really quick and easy to add new functions to handle different tasks, add new types of measurements, or whatever.

The two big reasons to write like I do are:

  1) It is (partially) self-documenting. You don't have to think about writing down and remembering all your hyperparameters. I'm going to forget them, so I need to automate the record keeping.
  2) I'm running experiments! I may be dumb, but I'm not so dumb as to think I won't change details of my experiments as the project matures.
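
As a rough illustration of the record keeping in point 1, here is a sketch that writes every command-line argument, the current git commit, and a timestamp next to each run. The flag names, the `run.json` filename, and the default output directory are assumptions made up for the example, not a description of my actual setup.

```python
import argparse
import json
import subprocess
import time
from pathlib import Path


def current_commit():
    """Best-effort git commit hash; None if git or a repository is missing."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None


def save_run_record(args, out_dir):
    """Write all parsed arguments, the commit, and a start time to run.json."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "args": vars(args),
        "commit": current_commit(),
        "started": time.strftime("%Y-%m-%dT%H:%M:%S"),
    }
    path = out_dir / "run.json"
    path.write_text(json.dumps(record, indent=2, sort_keys=True))
    return path


def parse_args(argv=None):
    # Hypothetical hyperparameters; a new flag is recorded automatically.
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--out", default="runs/example")
    return parser.parse_args(argv)
```

A training script would call `args = parse_args()` and then `save_run_record(args, args.out)` before doing any work, so the record exists even if the run crashes.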

That's why I say it is about being lazy. I'm writing the way I do because I know that whether it is tomorrow, next week, or 6 months from now, I'm going to need to make changes that are going to make things very different from where they are today. I don't think of it as having foresight about the future so much as being frustrated at constantly having to dig myself out of a hole. This makes that a lot easier and lets me get back to the fun exploration stuff faster. It is 100% about having version control over my hypotheses and experiments.

So I'd argue you should move away from notebooks and use other tools better suited for the job. It'll definitely cause disruption and you're definitely going to be slower at first, but find what works for you. The reason people love tools like vim or love working in the cli is because they are modifiable. There's no one tool that works for everyone. I'm not sure there's even a tool that works out of the box for any one person (maybe the original dev?)! But there's a ton of power in having tools which I can adapt to me and the way I work. I can make them help me catch my common mistakes and highlight things I care about. You don't need to spend hours doing this stuff. It develops over time. But go into any workshop and you'll see that everyone has modified their tools to suit them. We're programmers, we have way more flexibility over customization than people working with physical stuff. Use that to your advantage. And truthfully, you should see how that idea becomes circular here. I'm just designing my code and experiments to be like my tools: environments to be shaped.

[0] https://ghostty.org/