by marcosdumay 3121 days ago
> it's basically about creating way more complex models than those used in previous research and feeding them lots of data

Those are instructions for over-fitting. Deep learning neural networks escape from this problem somehow, but it's not a given that other models would escape it too.

3 comments

This is true! Overfitting is definitely one of the biggest problems with deep learning. Some techniques to avoid it have been developed, such as dropout (introducing noise) and early stopping. But in general, this is why deep learning requires huge amounts of data: a deep learning model will overfit if it is not given enough. That is also why (at this time) it only performs well on problems where the ratio of available data to problem complexity is high enough.

The traditional way to avoid overfitting is to reduce the number of independent variables, shrink coefficients towards zero, or otherwise limit the complexity of the model.
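
To make "shrink coefficients towards zero" concrete, here is a minimal sketch of ridge (L2-penalized) regression for a one-feature model with no intercept. The function name `ridge_slope`, the penalty `lam`, and the data points are all made up for illustration; ridge is only one of several shrinkage methods.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate of w in y ~ w * x (one feature, no intercept).

    Minimizes sum((y - w*x)^2) + lam * w^2, which gives
    w = sum(x*y) / (sum(x*x) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Illustrative data only (roughly y = 2x plus noise).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# lam = 0 is ordinary least squares; larger penalties pull w toward zero.
penalties = (0.0, 1.0, 10.0, 100.0)
slopes = [ridge_slope(xs, ys, lam) for lam in penalties]

for lam, w in zip(penalties, slopes):
    print(f"lam={lam:6.1f}  w={w:.4f}")
```

The larger the penalty, the smaller the fitted coefficient, so the model is less able to chase noise in the training data.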

With deep neural networks the approach is different. Instead of trying to find the global optimum of the training objective (which is too hard, and would also leave the model grossly overfit), the algorithm stops much earlier. Such "underfit" models seem to generalize much better.
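
One common way to "stop much earlier" is patience-based early stopping: watch the loss on held-out data and quit once it has not improved for a few epochs. The sketch below assumes the per-epoch validation losses are already available; `early_stop`, `patience`, and the numbers in `curve` are hypothetical and do not come from any real training run.

```python
def early_stop(val_losses, patience=3):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training is considered stopped at the first epoch that is `patience`
    epochs past the best validation loss seen so far. If that never
    happens, the stop epoch is simply the last epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it (a real loop would also save the weights).
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


# Made-up validation curve: improves, then degrades as the model overfits.
curve = [1.00, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
best_epoch, stop_epoch = early_stop(curve, patience=3)
print(best_epoch, stop_epoch)
```

In practice the weights from `best_epoch` are the ones kept, so the final model is deliberately not the one with the lowest training loss.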

They mostly escape it by using huge amounts of data and massive computing resources. Deep learning became feasible because of the huge amounts of data that companies like Facebook, Google, Apple and others have collected.