| HN Mirror



	by 29athrowaway 750 days ago
	You can also say backpropagation is the chain rule from centuries ago.

3 comments

telotortium 749 days ago

It is a computationally clever application of the chain rule to minimize the amount of computation needed to compute gradients for all parameters in the network.


om8 749 days ago

> to minimize the amount of computation

IMO backprop is the most trivial implementation of differentiation in neural networks. Do you know an easier way to compute gradients with larger overhead? If so, please share it.


QuadmasterXLII 749 days ago

My first forays into making neural networks used replacement rules to modify an expression tree until all the “D” operators went away, but that takes time exponential in network depth if you aren’t careful. Finite differences needs a number of forward passes linear in the number of parameters, as does differentiation by dual numbers.
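
A rough sketch of both approaches, to make the cost concrete. The toy network `net`, its three parameters, and the input `x=0.5` are made up for illustration; the point is only that each method loops once per parameter, so the number of forward passes grows linearly with the parameter count.

```python
import math

class Dual:
    """Number of the form val + eps*e with e**2 == 0; eps carries the derivative."""
    def __init__(self, val, eps=0.0):
        self.val = val
        self.eps = eps

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.eps + other.eps)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (a + a'e)(b + b'e) = ab + (a b' + a' b)e
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)
    __rmul__ = __mul__

def tanh(x):
    """tanh that works on plain floats and on Dual numbers."""
    if isinstance(x, Dual):
        t = math.tanh(x.val)
        return Dual(t, (1.0 - t * t) * x.eps)
    return math.tanh(x)

def net(params, x=0.5):
    """Hypothetical toy network: two stacked tanh units plus one bias."""
    w1, w2, b = params
    return tanh(w2 * tanh(w1 * x) + b)

def grad_dual(f, params):
    # One forward pass per parameter: seed eps=1 on parameter i only.
    grads = []
    for i in range(len(params)):
        seeded = [Dual(p, 1.0 if j == i else 0.0) for j, p in enumerate(params)]
        grads.append(f(seeded).eps)
    return grads

def grad_fd(f, params, h=1e-6):
    # Central differences: two forward passes per parameter.
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += h
        dn = list(params)
        dn[i] -= h
        grads.append((f(up) - f(dn)) / (2 * h))
    return grads

params = [0.3, -0.7, 0.1]
print(grad_dual(net, params))
print(grad_fd(net, params))
```

The two printed lists should agree to several decimal places. Dual numbers give the derivative exactly up to floating-point rounding, while finite differences also carry a truncation error that depends on `h`.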


eli_gottlieb 749 days ago

Backprop is the application of dynamic programming to the chain rule for total derivatives, which sounds trivial only in retrospect.
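
A minimal sketch of that idea, assuming a made-up scalar chain of tanh units (the names `forward`, `backward`, `ws`, and `acts` are illustrative, not from any library). The forward pass fills a table of activations, and the backward sweep carries a single running `upstream` derivative that every earlier layer reuses instead of recomputing the shared chain-rule factors.

```python
import math

def forward(ws, x):
    """Hypothetical chain a_k = tanh(w_k * a_{k-1}) with a_0 = x.

    Returns every activation; this list is the cached table.
    """
    acts = [x]
    for w in ws:
        acts.append(math.tanh(w * acts[-1]))
    return acts

def backward(ws, acts):
    """One backward sweep yields d(output)/d(w_k) for every k."""
    grads = [0.0] * len(ws)
    upstream = 1.0  # d(output)/d(last activation)
    for k in reversed(range(len(ws))):
        a_out = acts[k + 1]
        local = 1.0 - a_out * a_out          # tanh'(z), from the cached output
        grads[k] = upstream * local * acts[k]  # dz/dw_k is the layer's input
        upstream = upstream * local * ws[k]    # dz/d(input); reused by earlier layers
    return grads

ws = [0.5, -1.2, 0.8]
acts = forward(ws, 0.3)
print(backward(ws, acts))
```

Here one forward pass and one backward pass produce all the gradients, regardless of how many weights there are. Without the cached `acts` and the running `upstream`, each weight's derivative would re-multiply the same downstream factors from scratch.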


Jensson 749 days ago

You can do forward propagation. Humans typically find forward easier than backward.


tripzilch 749 days ago

since you asked ... how about Monte Carlo with Gibbs sampling?


pmayrgundter 749 days ago

I totally didn't realize this until these comments. Neat!

I went digging in Wikipedia. The Backpropagation article was created in 2005, yet its derivation from the chain rule wasn't mentioned until 2014, via a borrow from the German article:

https://en.wikipedia.org/w/index.php?title=Backpropagation&o...


elijahbenizzy 749 days ago

Backpropagation is just an application of the chain rule -- cool that we all learned it in high school!
