HN Mirror



	by forgot-im-old 732 days ago
	It's not clear that a bunch of cascaded rectified linear functions will ever generalize to near 100%. The error floor is at a dangerous level regardless of training. AGI is needed to tackle the final 1%.

1 comments

red75prime 732 days ago

The universal approximation theorem disagrees. The question is how large the network should be and how much training data it needs. And for now it can only be tested experimentally.
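
To make the existence side of this concrete, here is a minimal sketch, assuming a 1D target on [0, 1] and weights set by hand rather than trained. The names build_relu_net, evaluate, max_error and target are hypothetical, chosen only for illustration. It builds a one-hidden-layer ReLU network that matches the target at evenly spaced knots, so the worst-case error shrinks as the width grows. It says nothing about whether a training algorithm would find such weights.

```python
import math


def relu(z):
    return z if z > 0.0 else 0.0


def build_relu_net(f, width):
    """Hand-set weights for g(x) = bias + sum_i c_i * relu(x - t_i) on [0, 1].

    The weights are chosen so that g interpolates f linearly between
    evenly spaced knots. Nothing here is learned.
    """
    knots = [i / width for i in range(width)]
    values = [f(i / width) for i in range(width + 1)]
    # Slope of the interpolant on each interval [t_i, t_{i+1}].
    slopes = [(values[i + 1] - values[i]) * width for i in range(width)]
    # Each hidden unit contributes the change in slope at its knot.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, width)]
    return values[0], knots, coeffs


def evaluate(net, x):
    bias, knots, coeffs = net
    return bias + sum(c * relu(x - t) for t, c in zip(knots, coeffs))


def max_error(f, net, samples=1000):
    """Largest absolute error over an evenly spaced grid on [0, 1]."""
    grid = (j / samples for j in range(samples + 1))
    return max(abs(f(x) - evaluate(net, x)) for x in grid)


def target(x):
    # Arbitrary smooth target, picked only for the demonstration.
    return math.sin(2.0 * math.pi * x)


# Worst-case error for a few hidden-layer widths.
errors = {w: max_error(target, build_relu_net(target, w)) for w in (4, 16, 64)}
```

For a smooth target like this one, the error of such an interpolant falls roughly with the square of the width, which is the sense in which the approximation can be made as close as desired by making the network larger.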


forgot-im-old 732 days ago

The universal approximation theorem does not apply once you include a realistic training algorithm such as stochastic gradient descent. There is no learnability guarantee.


red75prime 732 days ago

There's no theorem that SGD is insufficient. So, as I said, it's empirical.


forgot-im-old 731 days ago

You said it only depends on network size. I'm saying it is more likely impossible regardless of network size, due to fundamental limits in training methods.
