by dkislyuk
1117 days ago
As another commenter said, viewing a neural network as a computation graph is how all automatic differentiation engines work (particularly reverse-mode, where one needs to traverse all the previous computations in reverse order to apply the chain rule correctly), and several libraries predating TensorFlow followed this idea. The initial contribution of TensorFlow and PyTorch was more about making the developer interface much cleaner and enabling training on a wider range of hardware by shipping a large set of useful kernels as part of the library.
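To make the reverse-mode point concrete, here is a minimal sketch of a computation graph with a backward pass. It is not how TensorFlow or PyTorch implement it; the names `Var` and `backward` are made up for illustration, and only addition and multiplication on scalars are supported.

```python
class Var:
    """A node in the computation graph: a value plus links to its inputs."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (input node, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Reverse-mode pass: walk the graph in reverse topological order."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)  # a node is appended after all its inputs

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        # Chain rule: push this node's gradient back to each input.
        for parent, local in node.parents:
            parent.grad += node.grad * local


x, y = Var(2.0), Var(3.0)
z = x * y + x  # dz/dx = y + 1, dz/dy = x
backward(z)
print(x.grad, y.grad)
```

The forward pass records the graph as a side effect of ordinary arithmetic, and the backward pass needs every earlier node to still be reachable, which is the traversal the comment is describing.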
I don't think we talked about doing any sort of automatic differentiation (in my day we figured out our own derivatives!), but after I wrote a simple eigendecomposition of a matrix of floats, the Mathematica folks contributed an example that did an eigendecomposition of a matrix with symbols (i.e., some of the terms weren't 5.7 but "1-x"). It still kind of blows my mind how much Mathematica can do with computation graphs.
IIUC this is the basis of LISP as well.
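A rough sketch of the idea, assuming a toy representation where an expression is a nested tuple in the style of an S-expression (this is not how Mathematica works internally, and `evaluate` and `OPS` are hypothetical names): the same graph folds down to a number when every leaf is known, and stays symbolic otherwise.

```python
import operator

# Hypothetical operator table for this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env):
    """Evaluate a nested-tuple expression, leaving unbound symbols in place."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env.get(expr, expr)  # unbound symbols evaluate to themselves
    op, *args = expr
    vals = [evaluate(arg, env) for arg in args]
    if all(isinstance(v, (int, float)) for v in vals):
        return OPS[op](*vals)  # every input is numeric: compute the result
    return (op, *vals)  # otherwise keep the node symbolic


# Determinant of [[5.7, 2], [3, 1-x]] written as an expression graph.
det = ("-", ("*", 5.7, ("-", 1, "x")), ("*", 2, 3))

symbolic = evaluate(det, {})         # "x" is unbound, so the result is a tree
numeric = evaluate(det, {"x": 1})    # "x" is bound, so the result is a number
print(symbolic)
print(numeric)
```

With `x` unbound, only the purely numeric subexpression `2 * 3` gets folded; the rest of the tree comes back unchanged.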