by ergodic 4838 days ago
>DNNs can be thought of as stacked Restricted Boltzmann Machines

Agreed, as explained in Hinton et al. 2006.

http://www.cs.toronto.edu/~hinton/absps/ncfast.pdf

But this holds only for pre-training, as I said. If you look at Seide's paper, they pre-train by treating the MLP as a DBN and then train it as a classic MLP with BP. Also, layer-wise BP pre-training brings performance close to that of DBN pre-training without using any DBN machinery at all.
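
To make the layer-wise BP idea concrete, here is a minimal sketch under stated assumptions. It is one possible reading of the scheme (grow the net one hidden layer at a time, each time with a fresh output layer, training with plain BP), not the paper's actual recipe. The names `init_layer`, `forward`, `bp_epoch` and `layerwise_bp_pretrain`, the sigmoid output with squared error, the toy data and all hyperparameters are illustrative assumptions.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    # One weight row per output unit; the last entry of each row is the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward(layers, x):
    # Activations of every layer, input included (sigmoid units throughout).
    acts = [x]
    for W in layers:
        inp = acts[-1] + [1.0]
        acts.append([1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, inp))))
                     for row in W])
    return acts

def bp_epoch(layers, data, lr):
    # One pass of plain stochastic backpropagation with squared error.
    for x, target in data:
        acts = forward(layers, x)
        delta = [(y - t) * y * (1.0 - y) for y, t in zip(acts[-1], target)]
        for i in reversed(range(len(layers))):
            W = layers[i]
            inp = acts[i] + [1.0]
            # Error signal for the layer below, computed before W changes.
            below = []
            if i > 0:
                below = [sum(W[k][j] * delta[k] for k in range(len(W)))
                         * acts[i][j] * (1.0 - acts[i][j])
                         for j in range(len(acts[i]))]
            for k, row in enumerate(W):
                for j in range(len(row)):
                    row[j] -= lr * delta[k] * inp[j]
            delta = below

def layerwise_bp_pretrain(sizes, n_out, data, rng, epochs=5, lr=0.5):
    # sizes = [n_in, h1, h2, ...]; no RBM or DBN is involved anywhere.
    hidden, output = [], None
    for n_prev, n_hid in zip(sizes, sizes[1:]):
        hidden.append(init_layer(n_prev, n_hid, rng))
        output = init_layer(n_hid, n_out, rng)  # fresh output layer each round
        for _ in range(epochs):
            bp_epoch(hidden + [output], data, lr)
    return hidden + [output]

# Toy XOR-style data, purely for illustration.
data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
        ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
rng = random.Random(0)
net = layerwise_bp_pretrain([2, 3, 3], 1, data, rng)

# Fine-tuning is then ordinary BP over the whole stack.
for _ in range(20):
    bp_epoch(net, data, 0.5)
```

The point of the sketch is only that both phases use the same BP routine; the pre-training phase differs just in how the stack is grown.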

>Their structure and training is very different to traditional MLPs

I maintain that, if we are talking about the same DNNs described in Microsoft's paper, this is not true. If we are talking about different DNNs, please elaborate; I would love to hear about them (seriously, no irony here).

1 comment

There's also the random knockout of neurons, as mentioned in the webinar.
I did not find that in the paper. Are you referring to randomly switching off neurons? I would be surprised if this were not already a technique from the original wave of neural networks.
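
For concreteness, "randomly switching off neurons" could look roughly like the sketch below. The helper name `knock_out` is hypothetical, and rescaling the surviving activations by 1/keep_prob is an assumption (a common variant), not something taken from the webinar or the paper.

```python
import random

def knock_out(activations, drop_prob, rng):
    """Zero each activation with probability drop_prob; rescale the survivors."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    # Each unit is kept independently; kept units are scaled so the expected
    # value of every activation is unchanged (assumed variant).
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)
hidden = [0.5, 1.0, 1.5, 2.0]          # made-up hidden-layer activations
dropped = knock_out(hidden, 0.5, rng)  # applied during training only
```

At test time the layer would be used as-is, i.e. with `drop_prob = 0.0`.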