HN Mirror

by sdenton4 38 days ago

Taking a quick look at the paper...

Their claim isn't that the brain uses gradient descent, but that the direction of updates has (on average) a positive inner product with the descent direction, i.e. the negative gradient of the loss. I expect this would also be true for (say) simulated annealing, yet we don't say that simulated annealing is gradient descent.
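
To make the simulated annealing point concrete, here's a rough sketch (not from the paper). It runs a plain Metropolis-style annealer on a toy quadratic loss and records the inner product between each update and the negative gradient. The loss, the fixed temperature, the step size, and every name here are assumptions picked for illustration.

```python
import math
import random

def loss(x):
    # Toy quadratic loss; an assumption for illustration only.
    return 0.5 * sum(xi * xi for xi in x)

def grad(x):
    # The gradient of the quadratic loss above is x itself.
    return list(x)

def anneal(steps=2000, dim=10, sigma=0.1, temperature=0.1, seed=0):
    rng = random.Random(seed)
    x = [5.0] * dim
    initial_loss = loss(x)
    alignments = []  # <update, -gradient> per step (0.0 when the move is rejected)
    for _ in range(steps):
        step = [rng.gauss(0.0, sigma) for _ in range(dim)]
        candidate = [xi + si for xi, si in zip(x, step)]
        delta = loss(candidate) - loss(x)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        accept = delta <= 0 or rng.random() < math.exp(-delta / temperature)
        if accept:
            g = grad(x)
            alignments.append(-sum(si * gi for si, gi in zip(step, g)))
            x = candidate
        else:
            alignments.append(0.0)
    return initial_loss, loss(x), sum(alignments) / len(alignments)

initial_loss, final_loss, mean_alignment = anneal()
print("loss:", initial_loss, "->", final_loss)
print("mean <update, -grad>:", mean_alignment)
```

The annealer never computes a gradient to choose its moves (the gradient is only used for bookkeeping), yet the average alignment comes out positive because moves that lower the loss are accepted more often than moves that raise it.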

Also missing is a discussion of loss functions and how they relate to the update. As far as I know, there's still no good account of how the brain would pick a global loss function, and no known mechanism for backprop. For a specific learning task like the one in this paper, you can define a loss function extrinsically, which lets us talk about a gradient, but how that relates to what's actually happening in the brain is a big mystery.

1 comment

hellohello2 35 days ago

Why would this be true for simulated annealing?

sdenton4 34 days ago

Because it improves the loss!

The negative gradient is the direction in which the loss decreases fastest. Moving in a direction that has a positive dot product with the negative gradient just means that, for a small enough step, you're (locally) improving the loss.
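
A quick numeric check of that statement, as a sketch on a made-up quadratic loss (the loss, the point, the direction, and the step size are all assumptions for illustration). To first order, the change in loss for a small step `eta * direction` is about `eta * <grad, direction>`, so a direction with positive dot product against the negative gradient lowers the loss.

```python
def loss(x):
    # Toy quadratic loss; an assumption for illustration only.
    return 0.5 * sum(xi * xi for xi in x)

def grad(x):
    # The gradient of the quadratic loss above is x itself.
    return list(x)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x = [3.0, -4.0]
g = grad(x)

# Not the steepest-descent direction, but still positively aligned with -grad.
direction = [-1.0, 0.0]
alignment = dot(direction, [-gi for gi in g])

eta = 0.01  # small step, so the first-order approximation is accurate
stepped = [xi + eta * di for xi, di in zip(x, direction)]
change = loss(stepped) - loss(x)
predicted = -eta * alignment  # first-order Taylor estimate of the change

print("alignment:", alignment)
print("actual change:", change, "predicted:", predicted)
```

The direction here ignores the second coordinate entirely, so it is far from the true gradient direction, and the loss still goes down.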
