by Componica
600 days ago
My take during that era was that neural nets were considered taboo after the second AI winter of the early 90s. For example, I once proposed that a start-up consider a CNN as an alternative to their handcrafted SVM for detecting retina lesions. The CEO scoffed, telling me neural networks were dead, only to acknowledge a decade later that they had been wrong. Younger people today might not understand, but there was a lot of pushback if you even considered using a neural network during those years. At the time, people knew that multi-layered neural networks had potential, but we couldn’t effectively train them because machines weren't fast enough, and key innovations like ReLU, better weight initializations, and optimizers like Adam didn't exist yet. I remember it taking 2-3 weeks to train a basic OCR model on a desktop pre-GPU. It wasn't until Hinton's 2006 work on Restricted Boltzmann Machines that interest in what we now call deep learning started to grow.
I'm sure there is more detail to unpack here than one paragraph, either yours or mine, can cover. But as written this isn't accurate.
The key thing missing from "were considered taboo ..." is by whom.
My graduate studies in neural net learning rates (1990-1995) were supported by an NSF grant, part of a larger NSF push. The NeurIPS conferences, then held in Denver, were very well-attended by a pretty broad community during these years. (Nothing like now, of course - I think it maybe drew ~300 people.) A handful of major figures in the academic statistics community would be there -- Leo Breiman of course, but also Rob Tibshirani, Art Owen, Grace Wahba (e.g., https://papers.nips.cc/paper_files/paper/1998/hash/bffc98347...).
So, not taboo. And remember, many of the people in that original tight NeurIPS community (exhibit A, Leo Breiman; or Vladimir Vapnik) were visionaries with enough sophistication to be confident that there was something actually there.
But this was very research'y. The application of ANNs to real problems was not advanced, and a lot of the people trying were tinkerers who were not in touch with what little theory there was. Many of the very good reasons NNs weren't reliably performing well are (correctly) listed in your reply starting with "At the time".
If you can't reliably get decent performance out of a method that has such patchy theoretical guidance, you'll have to look elsewhere to solve your problem. But that's not taboo, that's just pragmatic engineering consensus.