by ogrisel 5 hours ago
Paul Werbos did not apply backprop to MLPs as cleanly described in Hinton's paper, but rather to some kind of autoregressive non-linear parametrized functions with a much more specific application scope.

Both papers are direct applications of the chain rule to compute the gradient of a multivariate function.
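
To make the chain-rule point concrete, here is a minimal sketch in Python. It is not taken from either paper: the tiny one-hidden-unit model, the squared-error loss, and all names (`loss`, `grad_chain_rule`, `grad_finite_diff`) are assumptions picked for illustration. It computes the gradient by hand via the chain rule and checks it against central finite differences.

```python
import math

def loss(w1, w2, x, t):
    # Forward pass of a hypothetical toy model: y = w2 * tanh(w1 * x)
    h = math.tanh(w1 * x)
    y = w2 * h
    return 0.5 * (y - t) ** 2

def grad_chain_rule(w1, w2, x, t):
    # Recompute the forward pass, keeping the intermediate values
    h = math.tanh(w1 * x)
    y = w2 * h
    # Backward pass: apply the chain rule from the loss back to each weight
    dL_dy = y - t                        # d/dy of 0.5 * (y - t)^2
    dL_dw2 = dL_dy * h                   # y = w2 * h, so dy/dw2 = h
    dL_dh = dL_dy * w2                   # y = w2 * h, so dy/dh = w2
    dL_dw1 = dL_dh * (1.0 - h * h) * x   # d tanh(u)/du = 1 - tanh(u)^2, u = w1 * x
    return dL_dw1, dL_dw2

def grad_finite_diff(w1, w2, x, t, eps=1e-6):
    # Central differences, used only as an independent numerical check
    d1 = (loss(w1 + eps, w2, x, t) - loss(w1 - eps, w2, x, t)) / (2 * eps)
    d2 = (loss(w1, w2 + eps, x, t) - loss(w1, w2 - eps, x, t)) / (2 * eps)
    return d1, d2

analytic = grad_chain_rule(0.5, -1.5, 2.0, 1.0)
numeric = grad_finite_diff(0.5, -1.5, 2.0, 1.0)
print(analytic, numeric)
```

The two results should agree to several decimal places. Reusing `dL_dy` and `dL_dh` instead of re-deriving each partial derivative from scratch is the bookkeeping that backprop generalizes to deeper functions.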