Hacker News
by forgot-im-old 732 days ago
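The universal approximation theorem only says that suitable weights exist. It gives you nothing once you bring in a realistic training algorithm such as stochastic gradient descent, because it says nothing about whether training will find those weights. There is no learnability guarantee.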
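
To make the existence-vs-learnability distinction concrete, here is a minimal sketch (a toy example of my own, not taken from any paper or library). It builds a two-unit ReLU network whose weights are picked by hand so that it represents |x| exactly. The name `abs_net` and the specific weights are illustrative assumptions. The point is only that "weights exist" is a statement about the function class; nothing here shows that SGD from a random initialization would reach them.

```python
def relu(z):
    """Standard rectified linear unit."""
    return max(0.0, z)


def abs_net(x):
    """One-hidden-layer ReLU network with hand-chosen weights (assumed for illustration).

    Hidden weights are +1 and -1 with zero biases; both output weights are 1.
    This uses the identity |x| = relu(x) + relu(-x).
    """
    h1 = relu(1.0 * x)
    h2 = relu(-1.0 * x)
    return 1.0 * h1 + 1.0 * h2


# Compare the constructed network against the target on a few sample points.
xs = [-2.0, -0.5, 0.0, 0.5, 3.0]
errors = [abs(abs_net(x) - abs(x)) for x in xs]
max_error = max(errors)
print(max_error)
```

The weights above were written down, not trained. Whether a training procedure recovers weights like these for a harder target is a separate question that an approximation result does not answer.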

There's no theorem that SGD is insufficient. So, as I said, it's empirical.
You said it only depends on network size. I'm saying it is more likely impossible regardless of network size, due to fundamental limits in training methods.