|
|
|
|
|
by dkislyuk
1118 days ago
|
|
The one distinction I would add with neural networks is that it's not just a recursive tree traversal that one would get when evaluating an arithmetic statement, but an actual graph: a computation node can have gradients from multiple sources (e.g. if a skip connection is added), so each node needs to keep accumulated state around that can be updated by arbitrary callers. Of course, optimized autograd / autodiff is more parallelized than node-based message passing, but it's a useful model to start with. |
|
What you're describing with node-based message passing sounds much more like a petri net, or other agent-based discrete event modelling system. Which is another powerful mental paradigm, but challenging to reason about.