We need to distinguish between the theory of neural networks, which lacks robust underpinnings, and the practical considerations.

DNNs are fantastic from a computational standpoint. It's GEMM all the way down. You get high FLOP counts, and with modern techniques, gradient-based methods find optima relatively reliably.
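To make the GEMM point concrete, here is a rough sketch in plain Python. The helper names (`gemm`, `dense_forward`, `dense_backward`) are made up for illustration and the layer is assumed to have no bias or nonlinearity; a real framework would hand these products to a tuned BLAS or GPU kernel rather than loop like this.

```python
def gemm(a, b):
    """Naive general matrix multiply: a @ b, with matrices as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def dense_forward(x, w):
    """Forward pass of a bias-free linear layer: one GEMM."""
    return gemm(x, w)

def dense_backward(x, w, grad_y):
    """Backward pass of the same layer: two more GEMMs."""
    grad_w = gemm(transpose(x), grad_y)  # dL/dW = x^T @ dL/dy
    grad_x = gemm(grad_y, transpose(w))  # dL/dx = dL/dy @ W^T
    return grad_x, grad_w

# Illustrative values only: a batch of 2 inputs, 2 features -> 3 outputs.
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, -1.0], [0.5, 1.0, 2.0]]
y = dense_forward(x, w)
# Pretend the upstream gradient dL/dy is all ones.
grad_x, grad_w = dense_backward(x, w, [[1.0] * 3 for _ in x])
```

Both the forward pass and the gradient computation reduce to matrix products, which is why stacking such layers keeps the hardware busy with the same kernel.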

But from a theoretical standpoint there are major question marks. Why does dropout work? Why has SGD been so successful? To make the field more rigorous, these questions need to be hammered out. Answering them will also make DNNs more powerful, more generalizable (as Ferenc noted), and more useful. I'll add that it might help us discover fundamental laws of intelligence.

As evidence of this approach being useful, I'll note that Yann LeCun is openly Bayesian.