by IshKebab
970 days ago
No, it's because shallow networks can't express complex functions efficiently. If you think about it, the shallowest network is pretty much a lookup table. Shallow networks can theoretically model any function, but the number of parameters needed means that in practice they can't. Deep networks can learn much more complex functions for the same number of parameters.
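To make the parameter-count point concrete, here is a toy sketch (not a trained network, and all function names are made up for illustration). It stacks a two-unit ReLU "tent" layer `depth` times on inputs in [0, 1] and counts the linear pieces of the result. A one-hidden-layer ReLU network on a scalar input with n units has at most n + 1 linear pieces, so matching the deep version would take a hidden layer that grows exponentially with `depth`.

```python
def relu(z):
    return max(0.0, z)

def tent(x):
    # One hidden layer with two ReLU units:
    # rises from 0 to 1 on [0, 0.5], falls from 1 to 0 on [0.5, 1].
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    # Compose the same tiny layer `depth` times (2 * depth hidden units total).
    for _ in range(depth):
        x = tent(x)
    return x

def count_linear_pieces(depth, grid=4096):
    # Sample on a dyadic grid so every breakpoint lands exactly on a grid point,
    # then count how often the slope changes sign.
    ys = [deep_sawtooth(j / grid, depth) for j in range(grid + 1)]
    rising = [ys[j + 1] > ys[j] for j in range(grid)]
    return 1 + sum(rising[j] != rising[j + 1] for j in range(grid - 1))

if __name__ == "__main__":
    for depth in range(1, 6):
        print(depth, "layers,", 2 * depth, "hidden units ->",
              count_linear_pieces(depth), "linear pieces")
```

The number of pieces doubles with every extra layer, while the unit count only grows by two. This is a hand-built example, so it says nothing about what gradient descent would actually find.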
Even theoretically, no, they can't. They can theoretically model any continuous function.
Plus, even for continuous functions, the theorem only proves that, for any function, there exists some NN that approximates it to arbitrary precision. It is not known whether there is some base NN + finite training set that could be used to arrive at that target NN using some algorithm in a finite number of steps.
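As a rough illustration of the "there exists some NN" part, here is a minimal sketch that writes down the weights of a one-hidden-layer ReLU network by hand instead of training it. It assumes a scalar continuous function on a bounded interval, and the helper names (`build_shallow_relu_net`, `predict`, `max_error`) are hypothetical. The construction is just piecewise-linear interpolation, so it shows existence only; it says nothing about whether a training algorithm would reach these weights.

```python
import math

def relu(z):
    return max(0.0, z)

def build_shallow_relu_net(f, a, b, n_hidden):
    """Hand-pick weights so the net linearly interpolates f at
    n_hidden + 1 evenly spaced points on [a, b]. No training involved."""
    h = (b - a) / n_hidden
    knots = [a + i * h for i in range(n_hidden)]          # one ReLU kink per unit
    values = [f(a + i * h) for i in range(n_hidden + 1)]  # the "lookup table"
    slopes = [(values[i + 1] - values[i]) / h for i in range(n_hidden)]
    # Output weight of unit i is the change in slope at its kink.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_hidden)]
    return values[0], knots, coeffs

def predict(net, x):
    bias, knots, coeffs = net
    return bias + sum(c * relu(x - k) for k, c in zip(knots, coeffs))

def max_error(f, net, a, b, samples=2000):
    xs = (a + (b - a) * j / samples for j in range(samples + 1))
    return max(abs(f(x) - predict(net, x)) for x in xs)

if __name__ == "__main__":
    a, b = 0.0, 2.0 * math.pi
    for n_hidden in (8, 64, 512):
        net = build_shallow_relu_net(math.sin, a, b, n_hidden)
        print(n_hidden, "hidden units, max error:", max_error(math.sin, net, a, b))
```

The error shrinks as hidden units are added, but only because the weights are computed directly from samples of the target function.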