by mytochar
4000 days ago
Regarding backpropagation and training sections of the NN at different times, there are other training algorithms. Evolutionary training algorithms come to mind, and you could really evolve any section you wanted. You could even train the output of each layer one by one to represent a certain form of input to the future layer.
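To make the "evolve any section you wanted" idea concrete, here is a rough sketch, not anyone's actual method: a tiny two-layer net whose hidden layer stays frozen while only the output weights are evolved with a simple (1+1) mutate-and-keep-if-not-worse loop. The names `forward`, `loss` and `evolve_output_layer`, the layer sizes, and the mutation scale `sigma` are all made up for illustration.

```python
import math
import random


def forward(x, hidden_w, out_w):
    """Tanh hidden layer followed by a linear output unit.

    The last weight in every weight vector is treated as the bias.
    """
    inputs = list(x) + [1.0]
    h = [math.tanh(sum(w * xi for w, xi in zip(ws, inputs))) for ws in hidden_w]
    return sum(w * hi for w, hi in zip(out_w, h + [1.0]))


def loss(data, hidden_w, out_w):
    """Mean squared error over (input, target) pairs."""
    return sum((forward(x, hidden_w, out_w) - y) ** 2 for x, y in data) / len(data)


def evolve_output_layer(data, hidden_w, out_w, steps=2000, sigma=0.1, seed=0):
    """(1+1) evolution: mutate the output weights, keep the child if it is not worse.

    hidden_w is never modified, so the hidden layer stays frozen.
    """
    rng = random.Random(seed)
    best = list(out_w)
    best_loss = loss(data, hidden_w, best)
    for _ in range(steps):
        child = [w + rng.gauss(0.0, sigma) for w in best]
        child_loss = loss(data, hidden_w, child)
        if child_loss <= best_loss:
            best, best_loss = child, child_loss
    return best, best_loss
```

Because a child is only accepted when its loss is no higher, the loss can never go up. How far it goes down depends entirely on what the frozen hidden layer happens to represent.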
Yes, there are other methods. Contrastive divergence (CD) seems to be king right now; of note is minimum probability flow learning [1], of which CD is a special case. However, the flavor of these methods tends to be tuning the weights of the model so that the model's probability distribution comes as close as possible to that of the data. One generally cannot constrain the model parameters (for example by freezing a layer) and retain the model's ability to 'learn' the data distribution.
[1] http://arxiv.org/abs/0906.4779
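For anyone who has not seen CD written out, below is a minimal sketch of a single CD-1 update for a small binary restricted Boltzmann machine, in plain Python. It assumes the usual setup (visible biases `b`, hidden biases `c`, weight matrix `W[i][j]` from visible unit `i` to hidden unit `j`); the function names and the learning rate are placeholders of mine, not taken from [1].

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample_hidden(v, W, c, rng):
    """Return P(h_j = 1 | v) for every hidden unit and one binary sample."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1.0 if rng.random() < p else 0.0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return P(v_i = 1 | h) for every visible unit and one binary sample."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1.0 if rng.random() < p else 0.0 for p in probs]


def cd1_update(v0, W, b, c, rng, lr=0.1):
    """One CD-1 step: data statistics minus one-step reconstruction statistics.

    W, b and c are updated in place.
    """
    ph0, h0 = sample_hidden(v0, W, c, rng)   # positive phase, driven by the data
    _, v1 = sample_visible(h0, W, b, rng)    # one Gibbs step back to the visibles
    ph1, _ = sample_hidden(v1, W, c, rng)    # negative phase, driven by the reconstruction
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
```

Note that the update touches every weight and both bias vectors. Freezing a layer would amount to skipping part of this update, which is the kind of constraint I am describing above.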