| HN Mirror



	by dawnofdusk 568 days ago
	Are there any results about the "optimality" of backpropagation? Can one show that it emerges naturally from some Bayesian optimality criterion or a dynamic programming principle? This is a significant advantage that the "free energy principle" people have. For example, let's say instead of gradient descent you want to do a Newton descent. Then maybe there's a better way to compute the needed weight updates besides backprop?

3 comments

roenxi 568 days ago

I'd be willing to be proven wrong, but as a starting point I'd suggest it obviously isn't optimal for what it is being used for. AI performance on tasks seems quite poor relative to the time spent training. For example, by the time AIs overtake humans at Baduk, it is normal for the AI to have played several orders of magnitude more games than elite human players.

The important thing is backprop does work and so we're just scaling it up to absurd levels to get good results. There is going to be a big step change found sooner or later where training gets a lot better. Maybe there is some sort of threshold we're looking for where a trick only works for models with lots of parameters or something before we stumble on it, but if evolution can do it so will researchers.

burning_hamster 567 days ago

> For example, let's say instead of gradient descent you want to do a Newton descent. Then maybe there's a better way to compute the needed weight updates besides backprop?

IIRC, feedback alignment [1] approximates Gauss-Newton minimization. So there is an easier way that is potentially more biologically plausible, though not necessarily a better one.

[1] https://www.nature.com/articles/ncomms13276#Sec20
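
For anyone unfamiliar with feedback alignment, here is a minimal sketch of what it changes relative to backprop, assuming a toy two-layer linear network. The network, the matrices, and the function name `hidden_gradients` are made up for illustration and are not taken from [1]. The only difference is which matrix carries the output error back to the hidden layer: backprop uses the transpose of the forward weights, while feedback alignment uses a fixed matrix B (random in practice, hard-coded here).

```python
# Toy network (hypothetical): h = W1 x, y = W2 h, loss = 0.5 * ||y - t||^2.

def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def hidden_gradients(W1, W2, x, t, feedback):
    """Return (dW1, dW2), sending the output error back through `feedback`.

    feedback = transpose(W2) gives ordinary backprop.
    feedback = a fixed matrix B gives the feedback alignment update.
    """
    h = matvec(W1, x)                       # hidden activations
    y = matvec(W2, h)                       # network output
    e = [yi - ti for yi, ti in zip(y, t)]   # output error
    dW2 = outer(e, h)                       # same in both schemes
    delta_h = matvec(feedback, e)           # the only step that differs
    dW1 = outer(delta_h, x)
    return dW1, dW2

W1 = [[0.5, -0.2], [0.1, 0.3]]
W2 = [[1.0, -1.0]]
x, t = [1.0, 2.0], [0.0]
B = [[0.3], [0.7]]  # stand-in for a fixed random feedback matrix

bp_dW1, bp_dW2 = hidden_gradients(W1, W2, x, t, transpose(W2))
fa_dW1, fa_dW2 = hidden_gradients(W1, W2, x, t, B)
```

The output-layer update is identical in both cases. Only the hidden-layer update differs, and the backward pass no longer needs access to the forward weights.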

mrfox321 568 days ago

Second order methods, and their approximations, can be used in weight updating, too.
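
A rough sketch of the difference, assuming a toy 2D quadratic loss rather than a real network (the matrix `A`, the vector `b`, and the function names are invented for illustration). The gradient is the same quantity backprop would supply. The Newton step additionally rescales it by the inverse Hessian, which for a quadratic is just `A`.

```python
# Toy loss (hypothetical): f(w) = 0.5 * w^T A w - b^T w, so grad = A w - b and Hessian = A.

def grad(A, b, w):
    Aw = [A[0][0] * w[0] + A[0][1] * w[1],
          A[1][0] * w[0] + A[1][1] * w[1]]
    return [Aw[0] - b[0], Aw[1] - b[1]]

def solve2x2(A, g):
    """Solve A d = g for a 2x2 matrix using Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(A[1][1] * g[0] - A[0][1] * g[1]) / det,
            (A[0][0] * g[1] - A[1][0] * g[0]) / det]

def gd_step(A, b, w, lr):
    """First-order update: move against the gradient."""
    g = grad(A, b, w)
    return [w[0] - lr * g[0], w[1] - lr * g[1]]

def newton_step(A, b, w):
    """Second-order update: rescale the gradient by the inverse Hessian."""
    d = solve2x2(A, grad(A, b, w))
    return [w[0] - d[0], w[1] - d[1]]

A = [[10.0, 0.0], [0.0, 1.0]]   # badly conditioned curvature
b = [10.0, 1.0]                 # minimiser is w = [1, 1]
w0 = [0.0, 0.0]

w_gd = gd_step(A, b, w0, lr=0.1)
w_newton = newton_step(A, b, w0)
```

On this quadratic the Newton step reaches the minimiser in one update, while the gradient step is held back along the low-curvature direction. For a real network the Hessian is far too large to form or invert directly, which is why approximations are used in practice.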