by red75prime 732 days ago
There's no theorem that SGD is insufficient. So, as I said, it's empirical.

You said it depends only on network size. I'm saying it's more likely impossible regardless of network size, because of fundamental limits in the training methods.