by wpietri 2720 days ago
Could you say a little more about this?

I ask because when we're training humans to understand things, there are a variety of benefits to separating feature understanding from the classifiers. In particular, you get gains in flexibility, extensibility, and debuggability.

I get why people are happy to take the ConvNet gains and run with them for now. But have you seen any interesting work to get the benefits of separation in the new paradigm? (Or, alternately, is there a reason why those concerns are outmoded?)

1 comment

That's actually closer to how deep learning started. Initially, deep learning mostly consisted of unsupervised (task-independent) features with a linear classifier on top. We had to fit an unsupervised model (e.g. an autoencoder) layer by layer before using the feature layers in a supervised task.

This was because we didn't understand how to train a deep model end-to-end until later. Once we learned how to make end-to-end training work, it tended to perform better because the learned features were task-specific.
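To make the older recipe concrete, here is a rough pure-Python sketch of greedy layer-wise pretraining: fit one small autoencoder, freeze it, fit a second one on its outputs, then train only a linear (logistic) classifier on the frozen features. Everything in it is an assumption made for illustration: the toy two-cluster data, the layer sizes, the learning rates, and the helper names (`train_autoencoder`, `train_linear_classifier`). Real systems of that era used different models and far more data.

```python
import math
import random

def train_autoencoder(data, hidden, epochs=200, lr=0.05, seed=0):
    """Fit one layer (tanh encoder, linear decoder) with plain SGD.

    Returns the encoder plus mean reconstruction loss before and after.
    """
    rng = random.Random(seed)
    d = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(hidden)]
    b = [0.0] * hidden
    V = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(d)]
    c = [0.0] * d

    def encode(x):
        return [math.tanh(sum(w * xi for w, xi in zip(W[j], x)) + b[j])
                for j in range(hidden)]

    def decode(h):
        return [sum(v * hj for v, hj in zip(V[i], h)) + c[i] for i in range(d)]

    def mean_loss():
        return sum(0.5 * sum((r - xi) ** 2 for r, xi in zip(decode(encode(x)), x))
                   for x in data) / len(data)

    before = mean_loss()
    for _ in range(epochs):
        for x in data:
            h = encode(x)
            err = [r - xi for r, xi in zip(decode(h), x)]  # dL/d(reconstruction)
            # Backpropagate through the decoder, then through tanh.
            dh = [sum(V[i][j] * err[i] for i in range(d)) for j in range(hidden)]
            dz = [dh[j] * (1.0 - h[j] ** 2) for j in range(hidden)]
            for i in range(d):
                c[i] -= lr * err[i]
                for j in range(hidden):
                    V[i][j] -= lr * err[i] * h[j]
            for j in range(hidden):
                b[j] -= lr * dz[j]
                for k in range(d):
                    W[j][k] -= lr * dz[j] * x[k]
    return encode, before, mean_loss()

def predict(w, b, f):
    return 1.0 / (1.0 + math.exp(-(sum(wi * fi for wi, fi in zip(w, f)) + b)))

def log_loss(w, b, feats, labels):
    total = 0.0
    for f, y in zip(feats, labels):
        p = predict(w, b, f)
        total -= math.log(p if y else 1.0 - p)
    return total / len(feats)

def train_linear_classifier(feats, labels, steps=200, lr=0.5):
    """Logistic regression on frozen features, full-batch gradient descent."""
    n, d = len(feats), len(feats[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grads = [predict(w, b, f) - y for f, y in zip(feats, labels)]
        for k in range(d):
            w[k] -= lr * sum(g * f[k] for g, f in zip(grads, feats)) / n
        b -= lr * sum(grads) / n
    return w, b

# Toy data (assumed): two noisy clusters in four dimensions.
rng = random.Random(1)
def noisy(center):
    return [v + rng.uniform(-0.1, 0.1) for v in center]
data = [noisy([1, 1, 0, 0]) for _ in range(10)] + [noisy([0, 0, 1, 1]) for _ in range(10)]
labels = [0] * 10 + [1] * 10

# Stage 1: unsupervised, one layer at a time. Labels are never used here.
enc1, loss1_before, loss1_after = train_autoencoder(data, hidden=3)
feats1 = [enc1(x) for x in data]
enc2, loss2_before, loss2_after = train_autoencoder(feats1, hidden=2, seed=1)
feats2 = [enc2(h) for h in feats1]

# Stage 2: supervised, but only the linear classifier on top is trained.
clf_before = log_loss([0.0, 0.0], 0.0, feats2, labels)
w, b = train_linear_classifier(feats2, labels)
clf_after = log_loss(w, b, feats2, labels)
```

The point of the sketch is the separation: the encoders never see a label, so the same `feats2` could be reused for a different classifier. End-to-end training would instead push the classifier's gradient back through `enc2` and `enc1`.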

You can still learn general features in a number of ways besides the older autoencoder method. For example, multiple supervised heads with auxiliary losses can learn more general features.
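As a minimal sketch of the multi-head idea, assuming a toy setup: one shared tanh trunk feeds a main head and an auxiliary head, and the trunk is updated with the sum of both heads' gradients. The two regression tasks, the `aux_weight` of 0.5, the layer sizes, and the function names are all made up for illustration and do not come from any particular paper or library.

```python
import math
import random

def forward(params, x):
    W, u, v = params
    # Shared trunk: both heads read the same features h.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]
    main = sum(ui * hi for ui, hi in zip(u, h))
    aux = sum(vi * hi for vi, hi in zip(v, h))
    return h, main, aux

def total_loss(params, data, aux_weight):
    loss = 0.0
    for x, y_main, y_aux in data:
        _, main, aux = forward(params, x)
        loss += 0.5 * (main - y_main) ** 2 + aux_weight * 0.5 * (aux - y_aux) ** 2
    return loss / len(data)

def sgd_step(params, x, y_main, y_aux, aux_weight, lr):
    W, u, v = params
    h, main, aux = forward(params, x)
    e_main = main - y_main
    e_aux = aux_weight * (aux - y_aux)
    for j in range(len(W)):
        # The trunk's gradient is the sum of what each head sends back,
        # so its features are shaped by both tasks.
        dh = e_main * u[j] + e_aux * v[j]
        dz = dh * (1.0 - h[j] ** 2)
        u[j] -= lr * e_main * h[j]
        v[j] -= lr * e_aux * h[j]
        for k in range(len(x)):
            W[j][k] -= lr * dz * x[k]

rng = random.Random(0)
hidden, aux_weight, lr = 4, 0.5, 0.1

# Toy tasks (assumed): main head predicts a scaled sum, aux head a scaled difference.
data = []
for _ in range(20):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    data.append((x, 0.5 * (x[0] + x[1]), 0.5 * (x[0] - x[1])))

params = (
    [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)],  # trunk W
    [rng.uniform(-0.5, 0.5) for _ in range(hidden)],                      # main head u
    [rng.uniform(-0.5, 0.5) for _ in range(hidden)],                      # aux head v
)

loss_before = total_loss(params, data, aux_weight)
for _ in range(300):
    for x, y_main, y_aux in data:
        sgd_step(params, x, y_main, y_aux, aux_weight, lr)
loss_after = total_loss(params, data, aux_weight)
```

Setting `aux_weight` to 0 recovers ordinary single-task training, which makes it easy to compare the trunk's features with and without the auxiliary signal.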