


	by karpathy 4962 days ago
	Closer, but still no :) Geoff Hinton proposed contrastive divergence training for Restricted Boltzmann Machines in his 2006 Science paper. CD does not apply outside of RBMs though, and most of the nets in the article here are not in fact RBMs. The paper did spark a lot of interest in the field. These are all neural nets (with some bells and whistles in some cases, like tied weights, pooling units, etc.) trained exactly as they were trained before, using stochastic gradient descent or LBFGS. We did come up with a lot of tricks for making SGD work, like momentum terms, clamping of weights during learning, dropout, unsupervised pretraining, etc., but in large part it's just a lot more compute power. These networks just turned out to work very well when you have a LOT of (fairly homogeneous) data and can afford to scale them up computationally. And that's pretty awesome, looks like we have a powerful hammer and there are plenty of nails lying around :)
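For readers who haven't seen the SGD tricks mentioned above, here is a rough sketch of two of them (a momentum term and weight clamping) on a toy one-dimensional linear fit. Everything in it is an assumption made for illustration: the function name `sgd_momentum`, the hyperparameters, the toy data, and the reading of "clamping" as clipping each weight to a fixed range. It is not the setup used for the networks in the article.

```python
import random

def sgd_momentum(data, lr=0.01, momentum=0.9, clip=5.0, epochs=200, seed=0):
    """Fit y ~ w*x + b with per-sample SGD, a momentum term, and weight clamping."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0      # parameters
    vw, vb = 0.0, 0.0    # velocities (running, decayed sum of past gradient steps)
    for _ in range(epochs):
        rng.shuffle(data)                      # visit samples in random order
        for x, y in data:
            err = (w * x + b) - y              # gradient of 0.5 * err**2 w.r.t. the prediction
            gw, gb = err * x, err
            vw = momentum * vw - lr * gw       # momentum update
            vb = momentum * vb - lr * gb
            # Assumed form of clamping: keep each weight inside [-clip, clip].
            w = max(-clip, min(clip, w + vw))
            b = max(-clip, min(clip, b + vb))
    return w, b

# Toy data drawn exactly from y = 2x + 1 (hypothetical, noise-free).
points = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = sgd_momentum(points)
```

On this noise-free data the fit should land close to `w = 2`, `b = 1`. Real networks apply the same update to every weight matrix, with gradients coming from backpropagation instead of a closed form.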


gdahl 4962 days ago

That is not entirely accurate. The Science paper described how to (pre)train a deep belief net by training a sequence of RBMs. Contrastive divergence for RBM training (and more generally products of experts) was described in 2002 in "Training Products of Experts by Minimizing Contrastive Divergence" http://www.cs.toronto.edu/~hinton/absps/nccd.pdf
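To make the RBM training step concrete, a single-step contrastive divergence (CD-1) update for a tiny binary RBM can be sketched roughly as follows. This is only an illustration under stated assumptions: binary visible and hidden units, one Gibbs step, hidden probabilities (not samples) in the update statistics, and made-up names, sizes, learning rate, and toy patterns. It is not code from either paper.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample(probs, rng):
    # Draw one binary unit per probability.
    return [1.0 if rng.random() < p else 0.0 for p in probs]

def hidden_probs(v, W, c):
    # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W[i][j])
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]

def visible_probs(h, W, b):
    # p(v_i = 1 | h) = sigmoid(b_i + sum_j W[i][j] * h_j)
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step on a single training vector v0; updates W, b, c in place."""
    ph0 = hidden_probs(v0, W, c)                 # positive phase, driven by the data
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, b), rng)    # one Gibbs step: reconstruction
    ph1 = hidden_probs(v1, W, c)                 # negative phase, driven by the reconstruction
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])

def train_rbm(data, n_hidden=2, lr=0.1, epochs=200, seed=0):
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    b = [0.0] * n_visible    # visible biases
    c = [0.0] * n_hidden     # hidden biases
    for _ in range(epochs):
        for v in data:
            cd1_update(v, W, b, c, lr, rng)
    return W, b, c

# Two hypothetical 4-bit training patterns.
patterns = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
W, b, c = train_rbm(patterns)
recon = visible_probs(hidden_probs(patterns[0], W, c), W, b)
```

Pretraining a deep belief net, as described in the comment above, would repeat this for a sequence of RBMs, feeding each trained RBM's hidden activations to the next one as its visible data.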


karpathy 4962 days ago

doh, not very carefully worded now that I'm re-reading my answer, you're right of course. Well, at least we're slowly converging on the right answer over several comments :)


wookietrader 4962 days ago

What exactly is wrong with what I wrote? I did not say that all nets nowadays would be trained by RBMs (on the contrary, I said quite the opposite: that new algorithms had been developed). I just said that they were part of the breakthrough.


dave_sullivan 4962 days ago

What are your thoughts re: LBFGS vs HF as applied to FF networks? I've been using HF for RNNs and have been having very good results, but I haven't yet tried it on FF networks and wonder if I'd see a benefit compared to SGD with the bells and whistles or even something like LBFGS.
