by argonaut 3761 days ago
Agreed. People don't realize that all of the huge algorithmic innovations (LSTMs, convolutional neural networks, backpropagation) were invented in past neural net booms. I can't think of any novel algorithms of the same impact and ubiquity (i.e. universally considered to be huge algorithmic leaps) that have been invented in this current boom. The current boom started due to GPUs.

Something being invented previously doesn't mean that it existed as a matter of engineering practicality; improved performance is some but not all of that. Just describing something in a paper isn't enough to make it have impact; many things described in papers simply don't work as described.

A decade ago I was trying and failing to build multi-layer networks with back-propagation; it didn't work so well. More modern, refined training techniques seem to work much better... and today tools for them are ubiquitous and are known to work (especially with extra CPU thrown at them :) ).

Backpropagation and convolutional neural nets were breakthroughs that were immediately put to use.
The point is that no one could train deep nets 10 years ago. Not just because of computing power, but because of bad initializations, and bad transfer functions, and bad regularization techniques, etc.
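To make the "bad transfer functions" point concrete, here is a toy sketch, not anything from the papers being discussed. It assumes a chain of one-unit layers with a fixed weight of 1.0, and it uses ReLU as the example of a better-behaved transfer function, which is my assumption since no specific function is named above. All function names are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    # Derivative of the sigmoid; never larger than 0.25.
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return max(0.0, z)

def relu_grad(z):
    # Derivative of ReLU; exactly 1 for positive inputs.
    return 1.0 if z > 0 else 0.0

def chain_gradient(activation, derivative, depth, x=1.0, w=1.0):
    """Gradient of the output w.r.t. the input for `depth` one-unit layers.

    Each layer computes a = activation(w * a_prev), so by the chain rule
    the gradient is the product of w * derivative(z) over all layers.
    (Hypothetical toy setup: one unit per layer, the same weight everywhere.)
    """
    a, grad = x, 1.0
    for _ in range(depth):
        z = w * a
        grad *= w * derivative(z)
        a = activation(z)
    return grad

# Compare how much gradient signal survives as the chain gets deeper.
for depth in (1, 5, 10, 20):
    print(depth,
          chain_gradient(sigmoid, sigmoid_grad, depth),
          chain_gradient(relu, relu_grad, depth))
```

In this toy setup the sigmoid gradient shrinks by a factor of at most 0.25 per layer, while the ReLU gradient stays at 1.0 for a positive input. Real networks are messier (many units, random weights), so treat this as an intuition aid only.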

These things might seem like "small iterative refinements", but they add up to a 100x improvement, even when you don't consider hardware. And you should consider hardware too; it's also a factor in the advancement of AI.

Also, reading through old research, there are a lot of silly ideas along with the good ones. It's only in retrospect that we know this specific set of techniques works, and the rest are garbage. At the time it was far from certain what the future of NNs would look like. To say it was predictable is hindsight bias.

They could. There was a different set of tricks that didn't work as well (greedy pretraining).
Lots of people tried and failed.

Today lots of people, ones with even less background and putting in less effort, try and are successful.

This is not a small change, even if it is the product of small changes.

Reconnecting to my original point way up-thread: these "innovations" have not substantially expanded the types of models we are capable of expressing (though they have certainly expanded the size of the models we're able to train), not nearly to the same degree as backprop/convnets/LSTMs did decades ago. This is important because AGI will require several expansions in the types of models we are capable of implementing.
Dropout and deep belief networks are significant recent algorithmic advances that are already widely used.