
by ergodic 4886 days ago

Well, it is definitely something, but calling it the "Breakthrough of the Decade" seems pretty unlikely to me, given the evidence I have.

I do not know other examples well beyond the case of Automatic Speech Recognition, but since that case made a lot of noise, I bet it is responsible for a reasonable chunk of the deep learning "buzz". Here is my take on it.

If you look at papers from Microsoft like Seide et al. 2011 and similar work, the reported improvement over the state of the art (up to 30%) is really impressive and seems solid. Now, the technique is more or less a very big multi-layer perceptron (MLP), something already established two decades ago (or more). There is some fancy stuff like the deep belief network based initialization, but it does not make a big difference. The core of the recipe itself is not very new. What has changed is the scale of the data we have available and the size of the models we can handle.

I am not implying that this is not a very interesting discovery. But it is important to bear in mind that the change in the amount of data could also make other 20-year-old techniques interesting again. That said, neural networks have had a bad name in recent years for understandable reasons. They are a black box, or at least less transparent than the statistical methods. This makes them prone to the "black box delusion" effect. You hear a new algorithm is in town, it has fancy stuff like architectures that remotely resemble human thinking, or cool math, but you cannot completely grasp its guts, and then "voilà!", suddenly you are overestimating its relevance and scope of applicability. MLPs were already hailed once as "the" tool for machine learning, I think for these same reasons. For me the right position here is prudent skepticism.

On the other hand, this should also push people to try radical stuff, new or old, since the rules of the game seem to be changing. This is not the moment to be conservative in ML research :).

4 comments

lrei 4886 days ago

I've heard this argument ever since Norvig's "Unreasonable Effectiveness of Data". While having a ton of data available is great, it has its limits. I believe you are overestimating the effectiveness of data (as, imo, Norvig did). And in this specific case, a large amount of data is not what explains the hype:

From the NYT article [1]: "The achievement was particularly impressive because the team decided to enter the contest at the last minute and designed its software with no specific knowledge about how the molecules bind to their targets. The students were also working with a relatively small set of data; neural nets typically perform well only with very large ones."

NNs in general have enjoyed lots of successful practical (commercial) applications in pattern recognition, though in many cases SVMs more or less replaced them as the state of the art until RBMs and DBNs came along. I agree with your call for skepticism, though; only time will tell how good DBNs are.

I think the black box criticism is BS for the most part. In some cases (Google's search being a famous example) it might be great to have a human-readable and tweakable solution (assuming you have the resources), but for something like recognising handwritten digits from images, not so much.

[1] http://www.nytimes.com/2012/11/24/science/scientists-see-adv...