by jakobnissen 2121 days ago
The idea is to keep the amount of global state as low as possible. Pluto.jl creates a dependency graph between cells: If cell A defines foo, and cell B uses foo, then cell B depends on cell A. Whenever cell A is updated, cell B will automatically be re-evaluated.

(edit: I suppose it's not really global state I'm talking about as much as hidden state left over from overwritten or deleted cells)
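
The dependency tracking described above can be sketched as a toy model. This is not Pluto's actual implementation; the `Cell` struct, its fields, and `rerun_dependents!` are hypothetical names made up for illustration, and the sketch assumes each cell defines exactly one variable and that cells are listed in dependency order.

```julia
# Toy model of reactive cells. NOT Pluto's real internals; all names are hypothetical.
struct Cell
    defines::Symbol        # the single variable this cell assigns
    uses::Vector{Symbol}   # the variables this cell reads
    run::Function          # computes the cell's value from the variable store
end

# Re-run every cell that depends, directly or transitively, on `changed`.
# Assumes `cells` is already listed in dependency order.
function rerun_dependents!(store::Dict{Symbol,Any}, cells::Vector{Cell}, changed::Symbol)
    dirty = Set([changed])
    for cell in cells
        if any(in(dirty), cell.uses)
            store[cell.defines] = cell.run(store)
            push!(dirty, cell.defines)   # anything reading this is now stale too
        end
    end
    return store
end

cells = [
    Cell(:bar, [:foo], s -> s[:foo] + 1),   # depends on foo
    Cell(:baz, [:bar], s -> s[:bar] * 2),   # depends on bar, so indirectly on foo
    Cell(:qux, Symbol[], s -> 100),         # independent, never re-run here
]

store = Dict{Symbol,Any}(:foo => 1, :bar => 2, :baz => 4, :qux => 100)

store[:foo] = 10                       # "edit" the cell that defines foo
rerun_dependents!(store, cells, :foo)  # bar and baz are recomputed, qux is left alone
```

Because only stale cells are recomputed, no value can survive from a definition that no longer exists, which is the hidden-state problem the comment is about.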

For example, I typically create tonnes of code cells when I visualize and try to get a sense of some data. The large majority of cells (probably >80%) are then deleted, and whenever I find a trivial bug in some code, I fix the cell where the bug occurred.

But now - which cells have I rerun since fixing the bug? And did any of the cells I ran depend on some variable in a deleted cell? If so, the notebook will no longer be reproducible when I re-run it. It's impossible to keep track of. So when I use Jupyter, I frequently press the "restart kernel and run all" option. But of course, that is slow. So I need to serialize a lot of data, which is troublesome. Pluto completely circumvents that problem.


What happens if you reassign a variable below? Like:

  a = 1
  println(a)
  a = 2
Does it show 1 or 2?

Edit: tested it, it throws an error "Multiple definitions for a: Combine all definitions into a single reactive cell using a `begin ... end` block."
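
For reference, what the error message suggests would look roughly like the sketch below: all definitions of `a` go into one cell, wrapped in a `begin ... end` block. It is plain Julia, so it also runs outside Pluto; how Pluto displays the printed output is not covered by this thread.

```julia
# One cell holding every assignment to `a`, as the error message suggests.
begin
    a = 1
    println(a)   # prints the value of `a` at this point in the block
    a = 2        # other cells would see this final value
end
```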

Not sure I like that way of working.

In a more complex example where you actually take a variable, do some operations on it, then reassign it, Pluto.jl encourages you to separate that into multiple cells. The reason is that each cell marks a distinct node in the dependency graph. If you use separate cells, the notebook can be smarter about which lines actually need to be re-run and which don't.
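
A minimal sketch of that multi-cell style, assuming each step binds its result to a new name instead of reassigning the old one. The cell boundaries are shown as comments, and the variable names (`raw`, `sorted_vals`, `total`) are made up for illustration.

```julia
# Cell 1 (hypothetical): the starting data
raw = [3, 1, 2]

# Cell 2 (hypothetical): transform it, binding the result to a NEW name
sorted_vals = sort(raw)

# Cell 3 (hypothetical): summarize the transformed data
total = sum(sorted_vals)
```

With this layout, editing the second cell should only require re-running it and the third cell, while the first is left untouched.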

A downside to using multiple cells is vertical spacing/visual noise. This is something that the package authors are currently thinking about addressing.

Think of it as working with immutable data, because that's essentially what it is. That comes with all the pros and cons of that approach (in my opinion a lot more pros, but YMMV).

Yes, that's the best explanation in the end, I think. Maybe I'm too used to reusing variables, and that's a bad habit I should work on. For example, most of my counters are called i.

You just update the `a=1` cell, changing it to `a=2`. That's the whole point.

Every other cell that depends upon `a` will then automatically update.

Exactly. It requires a change in mindset wrt Jupyter, but it is totally worth it!