|
|
|
|
|
by blt
1400 days ago
|
|
I think of it as: computing the chain rule in the order such that we never need to compute Jacobians explicitly; only Jacobian-vector products. I also didn't totally grasp its significance until implementing neural networks from matrix/array operations in NumPy. I hope all deep learning courses include this exercise. |
|