by nomailing
3681 days ago
Could you please elaborate on this? I would really like to know if autoencoders are still useful for classification if I have labels for only a small part of my training data. Is unsupervised pretraining still useful, or was it completely replaced by other techniques, as the article seems to suggest?
Autoencoders with more than one layer are more interesting because you end up doing what is essentially non-linear PCA, projecting your data onto a curved manifold. The famous paper "Reducing the Dimensionality of Data with Neural Networks" [0], by Hinton and Salakhutdinov, shows how much more linearly separable documents become once multi-layer autoencoders are used.
The old argument was that unsupervised pretraining helps you reach good weights faster, but this has largely been disproven. However, I do believe AEs help in semi-supervised learning because they project the raw data into a more useful space. As you can see in the paper I linked, the projected data are much more linearly separable.
As practical evidence: I used a 5-layer AE in the Kaggle black box competition [1] and eventually outranked a team of Hinton's grad students. The problem had a large unlabeled data set and only a small number of labels. Using the autoencoders before the MLP ended up nearly doubling our team's score.
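For anyone who wants to see the shape of that workflow, here is a minimal sketch, not the competition code. Everything in it is an assumption made for illustration: the toy 4-D data, the single tanh hidden layer (rather than a deep AE), the hypothetical helper names (train_autoencoder, fit_centroids, predict), and a nearest-centroid classifier standing in for the MLP. It only shows the two stages: train the autoencoder on all the unlabeled points, then fit a classifier on the codes of the few labeled ones.

```python
import math
import random


def train_autoencoder(data, n_hidden, epochs=50, lr=0.05, seed=0):
    """Train a tiny autoencoder (tanh encoder, linear decoder) with plain SGD.

    Returns (encode, losses): encode maps an input vector to its code, and
    losses holds the mean reconstruction loss observed during each epoch.
    """
    rng = random.Random(seed)
    n_in = len(data[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b2 = [0.0] * n_in

    def encode(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(W1, b1)]

    def decode(h):
        return [sum(w * hi for w, hi in zip(row, h)) + b
                for row, b in zip(W2, b2)]

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            h = encode(x)
            y = decode(h)
            # Gradient of L = 0.5 * ||y - x||^2 with respect to the output y.
            err = [yi - xi for yi, xi in zip(y, x)]
            total += 0.5 * sum(e * e for e in err)
            # Backpropagate through the decoder, then through tanh.
            dh = [sum(W2[i][j] * err[i] for i in range(n_in)) * (1.0 - h[j] ** 2)
                  for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    W2[i][j] -= lr * err[i] * h[j]
                b2[i] -= lr * err[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    W1[j][k] -= lr * dh[j] * x[k]
                b1[j] -= lr * dh[j]
        losses.append(total / len(data))
    return encode, losses


def fit_centroids(codes, labels):
    """Average the codes of each class (a stand-in for a real classifier)."""
    sums, counts = {}, {}
    for code, label in zip(codes, labels):
        acc = sums.setdefault(label, [0.0] * len(code))
        for j, v in enumerate(code):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, code):
    """Return the label whose centroid is closest to the given code."""
    return min(centroids,
               key=lambda label: sum((a - b) ** 2
                                     for a, b in zip(centroids[label], code)))


# Toy data (assumed): two noisy clusters in 4-D, mostly unlabeled.
data_rng = random.Random(1)


def sample(center):
    return [c + data_rng.gauss(0.0, 0.1) for c in center]


unlabeled = ([sample([1, 1, 0, 0]) for _ in range(50)]
             + [sample([0, 0, 1, 1]) for _ in range(50)])
labeled_x = [sample([1, 1, 0, 0]), sample([0, 0, 1, 1])]
labeled_y = [0, 1]

# Stage 1: unsupervised training on every point we have.
encode, losses = train_autoencoder(unlabeled, n_hidden=2)

# Stage 2: fit the classifier on the codes of the few labeled points.
centroids = fit_centroids([encode(x) for x in labeled_x], labeled_y)
preds = [predict(centroids, encode(x)) for x in unlabeled]
```

In a real setting you would stack several layers and swap the centroid step for an MLP or logistic regression trained on the codes; the point here is only that the labels are never needed for the first stage.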
[0] https://www.cs.toronto.edu/~hinton/science.pdf
[1] https://www.kaggle.com/c/challenges-in-representation-learni...