|
|
|
|
|
by stiff
4838 days ago
|
|
In comparison to older MLP research, besides the new training algorithm, there is this new insight that the deep structure of the network might be efficient for generating very good encodings of the input variables, like described here: http://en.wikipedia.org/wiki/Autoencoder I am not very familiar with speech recognition, but I think what they talk about here: Instead of factorizing the networks, e.g., into a monophone and a context-dependent part [5], or decomposing them hierarchically [6], CD-DNN-HMMs directly model tied context-dependent states (senones). This had long been considered ineffective, until [1] showed that it works and yields large error reductions for deep networks. might be related to this fact. 20 years ago it wasn't known why would you pick a deep network instead of a shallow one, there was even this famous theorem of Kolmogorow that a lot of people in ML misunderstood, that a network with just one hidden layer can in theory learn any function with arbitrary precision. |
|