by discardorama 4022 days ago
My understanding: when you're doing standard gradient descent with backpropagation, you push the error down through the layers, modifying the weights at each layer. Now, in "normal" NN training you stop at the input layer; it makes no sense to tweak the inputs themselves, right?

But what if you did the following: flow the error down from the outputs to the layer you're interested in, but don't modify the weights of any of the layers the error passes through on the way; just modify the values (activations) of that layer in accordance with the error gradient.
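Here's a rough sketch of what I mean, in plain Python. Everything in it is made up for illustration: the tiny 2-3-1 network, the weight values, the squared-error loss, and the names `forward`, `input_gradient`, and `optimize_input` are all my assumptions, and I've picked the input as the "layer you're interested in" just to keep it short. The weights stay frozen; only the values being fed in are nudged along the error gradient.

```python
import math

# Hypothetical frozen ("already trained") weights for a tiny 2-3-1 network.
W1 = [[0.5, -0.3], [0.8, 0.2], [-0.6, 0.9]]  # 3 hidden units, 2 inputs
b1 = [0.1, -0.2, 0.05]
W2 = [0.7, -0.4, 0.6]                        # 1 output unit, 3 hidden units
b2 = 0.0

def forward(x):
    """Return hidden activations h = tanh(W1 x + b1) and output y = W2 h + b2."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    h = [math.tanh(v) for v in z]
    y = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y

def input_gradient(x, target):
    """Backpropagate L = 0.5 * (y - target)^2 down to x; return (dL/dx, L)."""
    h, y = forward(x)
    dy = y - target                                        # dL/dy
    dz = [dy * w * (1.0 - hi * hi) for w, hi in zip(W2, h)]  # through tanh
    dx = [sum(W1[j][i] * dz[j] for j in range(len(dz)))    # through W1
          for i in range(len(x))]
    return dx, 0.5 * dy * dy

def optimize_input(x, target, lr=0.1, steps=200):
    """Gradient descent on the input values only; W1, b1, W2, b2 never change."""
    x = list(x)
    for _ in range(steps):
        dx, _ = input_gradient(x, target)
        x = [xi - lr * g for xi, g in zip(x, dx)]
    return x

if __name__ == "__main__":
    start = [0.0, 0.0]
    tuned = optimize_input(start, target=0.5)
    print("loss before:", input_gradient(start, 0.5)[1])
    print("loss after: ", input_gradient(tuned, 0.5)[1])
```

The same idea should carry over to an intermediate layer: stop the backward pass at that layer and apply the update to its activations instead of to `x`.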

Added later: I think we should wait till @akarpathy comes along and ELI5's it to us.