| HN Mirror



	by falcor84 583 days ago
	> It’s like chaining up perceptrons hoping to get more expressive power for free.

	Isn't that literally the cause of the success of deep learning? It's not quite "free", but as I understand it, the big breakthrough of AlexNet (and much of what came after) was that running a larger CNN on a larger dataset allowed the model to be so much more effective without any big changes in architecture.


david2ndaccount 583 days ago

Without a non-linear activation function, chaining perceptrons together is equivalent to one large perceptron.
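To make the collapse concrete, here is a minimal sketch in plain Python. The weights, biases, and helper names are made up for illustration, and each "layer" is treated as an affine map `W x + b` with no activation. Under that assumption, two stacked layers reduce to a single layer with `W = W2 W1` and `b = W2 b1 + b2`. Putting a ReLU between the layers is enough to make that particular collapsed layer stop matching.

```python
# Illustrative only: weights and helper names are hypothetical, not from the thread.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]

def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, a) for a in v]

# Layer 1 maps 2 inputs to 3 hidden units; layer 2 maps 3 hidden units to 2 outputs.
W1 = [[1.0, -2.0], [0.5, 3.0], [2.0, 1.0]]
b1 = [0.5, -1.0, 0.0]
W2 = [[1.0, 0.0, -1.0], [2.0, 1.0, 0.5]]
b2 = [1.0, -0.5]

def two_layers(x):
    """Two stacked layers with no activation in between."""
    return add(matvec(W2, add(matvec(W1, x), b1)), b2)

# Collapse the stack into one layer: W = W2 W1, b = W2 b1 + b2.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)

def one_layer(x):
    """The single affine layer equivalent to two_layers."""
    return add(matvec(W, x), b)

def two_layers_relu(x):
    """Same stack, but with a ReLU between the layers."""
    return add(matvec(W2, relu(add(matvec(W1, x), b1))), b2)

x = [1.0, 2.0]
print(two_layers(x), one_layer(x))  # the two outputs agree
print(two_layers_relu(x))           # differs from the collapsed layer
```

The same algebra applies to any number of stacked layers, which is why depth alone adds no expressive power until a non-linearity is inserted.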


xanderlewis 583 days ago

Yep. falcor84: you’re thinking of the so-called ‘multilayer perceptron’, which is basically an archaic name for a (densely connected?) neural network. I was referring to traditional perceptrons.


falcor84 583 days ago

While ReLU is relatively new, AI researchers have been aware of the need for nonlinear activation functions, and have been building multilayer perceptrons with them, since the late 1960s, so I had assumed that's what you meant.


xanderlewis 583 days ago

It was a deliberately historical example.
