by pen2l 1494 days ago
Edit: Googling turns up nbdime. Has this been looked into? https://nbdime.readthedocs.io/en/latest/

Not OP, but I can easily imagine the need for what he's asking.

These days a lot of data and image processing algorithms are handed to you as notebooks. Say you start from the provided code, make a handful of changes, and something stops working. You might want to diff where you are now against a working version, in the hope that the differences point you to where the problem is.

As an aside, I want to say that Jupyter notebooks (more so JupyterLab) are something of a disruptive change to our coding workflows. We've had interpreters for a long time, sure, but creating interactive graphs on the fly is a godsend; insights come to you in such a workflow that wouldn't otherwise. I hope this catches on. I actually want my shell terminal to become more Jupyter-like. Also, fun fact: did you know you can do real-time collaboration on Jupyter notebooks? https://jupyterlab.readthedocs.io/en/stable/user/rtc.html

2 comments

Oh I can totally imagine use-cases too, but I’d love to hear what the OP’s use case actually is. I also agree completely on the disruption that Jupyter brings, and that it has just massive benefits. But when a workflow isn’t giving you everything you want, it’s worth evaluating whether the tools you’re using are the right tools for the job, right?

One example would be that Jupyter is well designed for a lot of prototyping and for single-person scenarios. It’s well designed for sharing and for including notes and narrative with code. It’s just not really designed for multi-user workflows. That’s not a negative in my book, it’s just a fact that makes me reach for a different tool when I need to collaborate.

Also, don’t overlook Spyder. It’s part of the same ecosystem as Jupyter, and the two are usually bundled together. Spyder gives you the interactive features you want but might better support a production workflow that is multi-user, collaborative, and more easily diffable.

All that said, it might be awesome if someone builds a Jupyter diff tool that is designed to ignore the output cells!
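
For what it's worth, here is a rough sketch of that idea using only the Python standard library. It assumes the notebooks are nbformat 4 JSON (a top-level "cells" list where each cell has "cell_type" and "source"). The function names are made up for illustration, and this is not how nbdime or any existing tool does it. It compares only the source of code cells, so outputs and execution counts never show up in the diff.

```python
from __future__ import annotations

import difflib
import json


def load_notebook(path: str) -> dict:
    """Read an .ipynb file as plain JSON (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def code_sources(notebook: dict) -> list[str]:
    """Return the source text of each code cell, ignoring outputs."""
    sources = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        # nbformat 4 allows source to be a string or a list of strings.
        if isinstance(src, list):
            src = "".join(src)
        sources.append(src)
    return sources


def diff_notebooks(old: dict, new: dict) -> list[str]:
    """Unified diff of code-cell sources only; returns [] if the code is identical."""
    old_lines: list[str] = []
    new_lines: list[str] = []
    for target, nb in ((old_lines, old), (new_lines, new)):
        for i, src in enumerate(code_sources(nb)):
            # Marker line so the diff shows which cell a change belongs to.
            target.append(f"# --- code cell {i} ---")
            target.extend(src.splitlines())
    return list(
        difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm="")
    )
```

Usage would be something like `print("\n".join(diff_notebooks(load_notebook("a.ipynb"), load_notebook("b.ipynb"))))`, where the two file names are placeholders. Markdown cells are skipped here on purpose; including them would be a one-line change in `code_sources`.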

Hey there - OP here. I haven't used Spyder; I'll have to check it out.

The primary use case: I am a researcher in NLP, where speed of prototyping is key. I work in an environment where research fragments are primarily Jupyter notebooks. So when I need to diff notebooks, it is typically to review my own changes while modifying my own and others' research sketches, since it's helpful to see how the code changed.

What really resonates with me is what others have said: I need to run cells that take 2-6 hours to compute, so recomputing cells is annoying. I don't love notebooks for their messy state, which causes obvious and very annoying problems, and for that reason I am not an advocate of notebooks for production. But the flexibility of computing something, having it persist, and doing downstream prototyping on top of it makes notebooks amazing! Markdown and LaTeX in there are also really helpful.

The secondary use case is PRs, but reviewing others' research code typically isn't at the granular level of notebook diffs across a few commits, so it doesn't come up often.

> https://jupyterlab.readthedocs.io/en/stable/user/rtc.html

Wow! Real-time collaborative notebook editing! This is going to be so cool for teaching (e.g. letting students fill in part of a code block).

Have you tried this yet? Is the idea to run Jupyter on a machine with a public IP and port 8888 open, so that multiple people can access the server at the same time? Would this work with services like `ngrok` that make your personal computer available online?