by xcthulhu
4452 days ago
The Universal Approximation Theorem[1] says that a single hidden layer is enough to approximate any continuous function on a compact domain to arbitrary accuracy, which at least guarantees that "an (approximate) simplification exists". But I can't say off the top of my head how you'd collapse an ANN with just two hidden layers into one. It's not obvious how sigmoid functions compose, but I suppose I should give it more thought...
----------
1: http://en.wikipedia.org/wiki/Universal_approximation_theorem
|
UAT says that a solution exists, but it puts no bound on the number of nodes required, so you would be optimizing in a space that is not just large but of arbitrary finite dimension. It is also fairly nonconstructive (in the sense of proving "there exists" without showing how to find something), and in practice it's often non-trivial to get convergence to a working solution in reasonable time.
As for how sigmoids compose, think about how bell-shaped curves combine: you can make a painting out of bell-shaped "points" if you're allowed arbitrary precision/steepness. And the difference of two sigmoids can be bell-shaped, e.g. http://www.wolframalpha.com/input/?i=plot+y+%3D+1%2F%281%2Be... . I don't know how much this means in practice, but it establishes the possibility.
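To make the bell-shaped "points" idea concrete, here is a rough Python sketch, not a definitive construction. The names bump and approximate, the default widths and steepness values, and the choice of sampling the target function at each bump's centre are all my own assumptions for illustration. Each bump is two sigmoid hidden units with output weights +1 and -1, so a weighted sum of n bumps is a single hidden layer with 2n units.

```python
import math


def sigmoid(x):
    # Logistic function, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bump(x, center=0.0, width=1.0, steepness=10.0):
    # Difference of two shifted sigmoids: close to 1 inside
    # [center - width/2, center + width/2] and close to 0 far outside.
    # Higher steepness gives sharper edges.
    left = sigmoid(steepness * (x - (center - width / 2)))
    right = sigmoid(steepness * (x - (center + width / 2)))
    return left - right


def approximate(f, x, lo=0.0, hi=1.0, n_bumps=50, steepness=None):
    # Tile [lo, hi] with n_bumps adjacent bumps and weight each one by
    # the value of f at its centre (an illustrative choice, no training).
    width = (hi - lo) / n_bumps
    if steepness is None:
        steepness = 20.0 / width  # assumed default: edges much narrower than a bump
    total = 0.0
    for i in range(n_bumps):
        center = lo + (i + 0.5) * width
        total += f(center) * bump(x, center, width, steepness)
    return total


if __name__ == "__main__":
    target = lambda t: math.sin(2 * math.pi * t)
    for t in (0.1, 0.25, 0.5, 0.75):
        print(t, target(t), approximate(target, t))
```

Raising n_bumps tightens the fit, which is also where the "no limit on the number of nodes" caveat shows up: the accuracy is bought with hidden units, and nothing here says how many a given problem needs.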