by fogof
1839 days ago
As a PhD student who sort of burned out on this type of research, I agree that the complexity of neural networks as a mathematical construct makes them very difficult to analyze. This might also have to do with deep learning theory being a subset of learning theory, which is subject to "No Free Lunch" [1]: you always have to be very careful not to try to prove something that turns out to be impossible. That being said, research on the kernel regime is, in my opinion, one of the very cool ideas to gain traction in this field in the past few years. To summarize: "If you make a neural network wide enough, it gains the power to control its output on each individual input separately, and will begin to fit its training data perfectly". Of course, the real pleasure is in understanding all the mathematical details of this statement!
[1]: https://en.wikipedia.org/wiki/No_free_lunch_theorem
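For anyone who wants to poke at the quoted summary, here is a rough sketch of the idea. It is a simplification, not the full kernel-regime analysis: the hidden layer is random and frozen, and only the output weights are fitted (a random-features model). The function names, the width of 512, the ReLU activation, and the toy data are all assumptions made up for illustration. The point is only that once the width is large enough, the Gram matrix of the hidden features is invertible, so the model can hit arbitrary labels on every training input.

```python
import math
import random


def hidden_features(x, weights, biases):
    """ReLU activations of a fixed random hidden layer, scaled by 1/sqrt(width)."""
    scale = 1.0 / math.sqrt(len(weights))
    return [scale * max(0.0, w * x + b) for w, b in zip(weights, biases)]


def solve_linear_system(matrix, rhs):
    """Solve matrix @ a = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_wide_network(xs, ys, width, seed=0):
    """Fit only the output layer of a wide two-layer network (hypothetical helper)."""
    rng = random.Random(seed)
    weights = [rng.gauss(0.0, 1.0) for _ in range(width)]
    biases = [rng.gauss(0.0, 1.0) for _ in range(width)]
    feats = [hidden_features(x, weights, biases) for x in xs]
    # Empirical kernel (Gram) matrix: K[i][j] = <phi(x_i), phi(x_j)>.
    kernel = [[sum(a * b for a, b in zip(fi, fj)) for fj in feats] for fi in feats]
    alphas = solve_linear_system(kernel, ys)
    # Minimum-norm output weights: v = sum_i alpha_i * phi(x_i).
    out_weights = [sum(a * f[k] for a, f in zip(alphas, feats)) for k in range(width)]

    def predict(x):
        feats_x = hidden_features(x, weights, biases)
        return sum(v * h for v, h in zip(out_weights, feats_x))

    return predict


# Toy data (assumed): 8 evenly spaced inputs with arbitrary +/-1 labels.
xs = [-1.0 + 2.0 * i / 7 for i in range(8)]
label_rng = random.Random(1)
ys = [label_rng.choice([-1.0, 1.0]) for _ in xs]

predict = fit_wide_network(xs, ys, width=512)
max_error = max(abs(predict(x) - y) for x, y in zip(xs, ys))
print(f"max training error: {max_error:.2e}")
```

The labels here are random on purpose: fitting them says nothing about generalization, only that the wide model can set its output on each training input independently.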
|
Neural networks "tend to generalize well in the real world". That's a pretty fuzzy statement imo, since "real world" is hardly defined, but it's still what people experience, and it's more useful to provide a precise model where this works than a model where it doesn't.
Also, there's good theory on deep networks as universal approximators, as well as theories of wide/shallow networks [1].
[1]: https://arxiv.org/abs/1901.02220