by g0wda 2120 days ago
> I like the reactive notebook concept. It does really help with bugs

Can you say more? An example, maybe?


The idea is to keep the amount of global state as low as possible. Pluto.jl creates a dependency graph between cells: If cell A defines foo, and cell B uses foo, then cell B depends on cell A. Whenever cell A is updated, cell B will automatically be re-evaluated.
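To make the "cell B depends on cell A" idea concrete, here is a rough sketch of how such a graph could decide what needs re-running. This is a toy model and not Pluto's actual implementation: the `Cell` struct and `stale_cells` function are names made up for illustration, and it assumes each cell simply lists the variables it defines and uses.

```julia
# Toy model of reactive cells (hypothetical, not Pluto's internals).
struct Cell
    name::String
    defines::Vector{Symbol}   # variables the cell assigns
    uses::Vector{Symbol}      # variables the cell reads
end

# Return the names of all cells that must be re-evaluated after the
# cell called `changed_name` is edited, following dependencies transitively.
function stale_cells(cells, changed_name)
    by_name = Dict(c.name => c for c in cells)
    stale = Set{String}()
    frontier = [changed_name]
    while !isempty(frontier)
        current = by_name[pop!(frontier)]
        for c in cells
            already_seen = c.name == changed_name || c.name in stale
            # `c` depends on `current` if it reads a variable `current` defines
            if !already_seen && any(in(current.defines), c.uses)
                push!(stale, c.name)
                push!(frontier, c.name)
            end
        end
    end
    return stale
end

cells = [
    Cell("A", [:foo], Symbol[]),
    Cell("B", [:bar], [:foo]),   # uses foo, so it depends on A
    Cell("C", [:baz], [:bar]),   # depends on B, and so indirectly on A
    Cell("D", [:qux], Symbol[]), # independent of everything else
]

stale_cells(cells, "A")   # contains "B" and "C", but not "D"
```

Editing A marks B and C as stale while D is left alone. The sketch only finds which cells are affected; a real notebook would also have to run them in dependency order.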

(edit: I suppose it's not really global state I'm talking about as much as hidden state left over from overwritten or deleted cells)

For example, I typically create tonnes of code cells when I visualize and try to get a sense of some data. The large majority of cells (probably >80%) are then deleted, and whenever I find a trivial bug in some code, I fix the cell where the bug occurred.

But now: which cells did I rerun after fixing the bug? And did any of the cells I ran depend on some variable in a deleted cell? If so, the notebook will no longer be reproducible when I re-run it. It's impossible to keep track of. So when I use Jupyter, I frequently press the "restart kernel and run all" option. But of course, that is slow. So I need to serialize a lot of data, which is troublesome. Pluto completely circumvents that problem.
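The deleted-cell problem can be mimicked in a few lines. This is only a sketch of the failure mode, assuming a kernel behaves like one shared namespace that outlives its cells; `namespace`, `cell_a`, and `cell_b` are made-up names, not anything from Jupyter or Pluto.

```julia
# Toy model of a non-reactive kernel: one namespace shared by all cells.
namespace = Dict{Symbol,Any}()

# Each "cell" is a function that reads from and writes to the namespace.
cell_a(ns) = (ns[:scale] = 10)               # this cell gets deleted later
cell_b(ns) = (ns[:result] = ns[:scale] * 2)  # silently relies on cell A

cell_a(namespace)
cell_b(namespace)   # works, :scale is defined

# Cell A is now deleted from the notebook, but the kernel still remembers
# :scale, so re-running cell B keeps working in this session.
cell_b(namespace)

# "Restart kernel and run all" starts from an empty namespace, and cell B
# fails because nothing defines :scale anymore.
fresh = Dict{Symbol,Any}()
fails_after_restart = try
    cell_b(fresh)
    false
catch err
    err isa KeyError
end
```

Nothing in the live session hints that cell B is broken; it only shows up after a restart.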

What happens if you reassign a variable below? Like:

  a = 1
  println(a)
  a = 2
Does it show 1 or 2?

Edit: tested it, it throws an error "Multiple definitions for a: Combine all definitions into a single reactive cell using a `begin ... end` block."

Not sure I like that way of working.

In a more complex example where you actually take a variable, do some operations on it, then reassign it, Pluto.jl encourages you to separate that into multiple cells. The reason is that each cell marks a distinct node in the dependency graph. If you prefer to use cells, then the notebook can be smarter about which lines actually need to get re-run and which don't.

A downside to using multiple cells is vertical spacing/visual noise. This is something that the package authors are currently thinking about addressing.
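For what it's worth, the two ways of writing this might look roughly like the following. It is a sketch in plain Julia rather than a real notebook file: the `begin ... end` form is the one the error message suggests, while the names `raw`, `sorted`, and `top` are invented here, and the "cell" comments only indicate where cell boundaries would go.

```julia
# Option 1: keep the reassignment, but inside a single cell.
begin
    a = 1
    println(a)
    a = 2
end

# Option 2: one definition per cell, with a new name for each step.
# Each line below stands for a separate cell.
raw = [3, 1, 2]       # cell 1
sorted = sort(raw)    # cell 2, depends on cell 1
top = last(sorted)    # cell 3, depends on cell 2
```

With the second form, editing the cell that defines `sorted` would only require re-running the cell that defines `top`, not the one that defines `raw`.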

Think of it as working with immutable data, because that's essentially what it is. Which has all the pros and cons of that approach (in my opinion a lot more pros, but YMMV).
Yes, that's the best explanation in the end, I think. Maybe I'm too used to reusing variables and that's a bad habit I should work on. For example, most of my counters are called i.
You just update the `a=1` cell, changing it to `a=2`. That's the whole point.

Every other cell that depends upon `a` will then automatically update.

Exactly. It requires a change in mindset wrt Jupyter, but it is totally worth it!
It completely fixes a huge number of the gripes in the JupyterCon talk "I don't like notebooks" by Joel Grus.

The primary complaint there is that notebooks have a disconnect between the state of the program and the display of the cells. By using a completely reactive mode the state is no longer hidden. It's more akin to a spreadsheet than a notebook. A number of other complaints are completely circumvented by using a file format that's simply a pure Julia file with clever comments.

Video: https://www.youtube.com/watch?v=7jiPeIFXb6U

Slides: https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUh...

Previous discussion: https://news.ycombinator.com/item?id=17856700

Now we just need to port all this over to python... or switch to Julia? Which one would take the least amount of effort?
Switching to Julia. You can just use PyCall to call all of your old code, so it's py"\paste" and you're using Pluto, which is like a 1-minute process. Then you can get fancy later.
You'll also save a lot of effort in future endeavors, IMO. Remember, once you've switched to Julia, Python is still only a `using PyCall` and `pyimport` away!