by njohnson41
4099 days ago
I also like how the backpropagation section starts out by immediately pointing out that it is really just an application of the chain rule. The backwards-moving pattern of "backpropagation" is just a side effect of the order in which the chain rule is applied, but a lot of intro materials treat backprop as if it were some fancy thing specially designed for neural nets. I suppose "compute the gradient of this function using basic vector calculus" just isn't sexy enough. I complain mostly because it took me a while to figure out whether backprop was exactly the same as gradient descent, or if there were subtle differences.
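To make the distinction concrete, here is a rough sketch in plain Python. The tiny "network" (one tanh unit, two weights, squared-error loss) and all the names in it (`forward_backward`, `gradient_descent_step`, `lr`) are made up for illustration and do not come from the linked material. The backward pass is nothing more than the chain rule written out from the loss toward the inputs, and gradient descent is a separate step that merely consumes the resulting gradient.

```python
import math

def forward_backward(w1, w2, x, y):
    """Return the loss and its gradient with respect to w1 and w2."""
    # Forward pass: loss = (w2 * tanh(w1 * x) - y) ** 2
    a = w1 * x
    h = math.tanh(a)
    p = w2 * h
    loss = (p - y) ** 2

    # Backward pass: the chain rule, applied from the loss back toward the weights.
    dloss_dp = 2.0 * (p - y)
    dloss_dw2 = dloss_dp * h              # dp/dw2 = h
    dloss_dh = dloss_dp * w2              # dp/dh  = w2
    dloss_da = dloss_dh * (1.0 - h * h)   # d tanh(a)/da = 1 - tanh(a)^2
    dloss_dw1 = dloss_da * x              # da/dw1 = x
    return loss, dloss_dw1, dloss_dw2

def gradient_descent_step(w1, w2, x, y, lr=0.1):
    """One gradient descent update; it only uses the gradient computed above."""
    _, g1, g2 = forward_backward(w1, w2, x, y)
    return w1 - lr * g1, w2 - lr * g2
```

So backprop answers "what is the gradient?" and gradient descent answers "what do I do with it?". Any other optimizer could reuse `forward_backward` unchanged.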
https://en.wikipedia.org/wiki/Automatic_differentiation
which is really cool stuff and should be included more often when talking about backprop.
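For anyone who hasn't seen it in code, here is a minimal sketch of reverse-mode automatic differentiation, the variant that backprop corresponds to. This is a toy under stated assumptions: the `Var` class and `backward` function are hypothetical names invented for this example, only `+` and `*` are supported, and it is not how any particular library implements it.

```python
class Var:
    """A scalar that remembers how it was computed."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def backward(output):
    """Accumulate d(output)/d(v) into v.grad for every Var that output depends on."""
    # Build a topological order so each node is processed after all of its consumers.
    order, seen = [], set()
    def visit(v):
        if id(v) in seen:
            return
        seen.add(id(v))
        for parent, _ in v.parents:
            visit(parent)
        order.append(v)
    visit(output)

    output.grad = 1.0
    for v in reversed(order):
        for parent, local in v.parents:
            # Chain rule: upstream gradient times local derivative.
            parent.grad += v.grad * local

x, y = Var(3.0), Var(4.0)
z = x * y + x * x
backward(z)
# dz/dx = y + 2x and dz/dy = x, now stored in x.grad and y.grad
```

Note that gradients accumulate across calls in this sketch, so a fresh set of `Var` objects is needed for each evaluation.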