

by MrEldritch 2542 days ago

The short answer: nonlinearity isn't just important for deep neural networks - nonlinearity is deep neural networks. Without a nonlinear element between the linear layers, the "deep" is meaningless - a "deep linear network" is exactly equivalent in power to a simple, one-layer linear classifier[1]. (The composition of linear transformations is itself a linear transformation, so a stack of them can't do anything a single one couldn't.)
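To make the collapse concrete, here is a minimal sketch in plain Python. The layer sizes (4 -> 5 -> 3 -> 2), the random weights, and the helper names are all made up for illustration, and biases are left out for brevity. It pushes an input through three linear layers one at a time, then through the single matrix you get by multiplying them together, and compares the results.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def matvec(m, x):
    """Apply matrix m to vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in m]

def random_matrix(rows, cols, rng):
    """Hypothetical layer weights, drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)

# Three "layers" with no nonlinearity in between: 4 -> 5 -> 3 -> 2.
layers = [random_matrix(5, 4, rng),
          random_matrix(3, 5, rng),
          random_matrix(2, 3, rng)]

x = [rng.uniform(-1, 1) for _ in range(4)]

# "Deep" forward pass: apply the layers one after another.
deep = x
for w in layers:
    deep = matvec(w, deep)

# Collapse the stack into one matrix: W3 * W2 * W1.
collapsed = layers[0]
for w in layers[1:]:
    collapsed = matmul(w, collapsed)

# "Shallow" forward pass: a single 2x4 linear map.
shallow = matvec(collapsed, x)

# The two outputs agree up to floating-point rounding.
max_diff = max(abs(d - s) for d, s in zip(deep, shallow))
print(max_diff)
```

Put any nonlinear function between the layers (a ReLU, say) and `collapsed` no longer reproduces the deep output in general, which is the whole point.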

As far as I can tell, your understanding that this is just a linear function is precisely correct - which means it can't do anything that a simple linear classifier can't.

[1] I suspect that the reason this has multiple layers is the physical constraints of the system, which prevent a single layer from being an arbitrary linear function of the inputs. The light from a specific pixel can only be diffracted effectively so far, so they need to cascade multiple layers to make sure that all the inputs can contribute to all the outputs. It still ends up being equivalent to a single linear transformation.
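A toy illustration of that guess, not a model of the actual optics: assume each layer is a banded matrix in which an output only "sees" inputs at most one position away. The size `n = 5`, the reach of 1, and the all-ones weights are assumptions chosen so the pattern is easy to read. Stacking such layers widens the reach by one position per layer until every input touches every output, yet the end result is still one matrix.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def banded(n, reach):
    """Hypothetical local layer: output i only sees inputs j with |i - j| <= reach."""
    return [[1 if abs(i - j) <= reach else 0 for j in range(n)]
            for i in range(n)]

n = 5
layer = banded(n, 1)   # each output sees only its immediate neighbours

# Keep stacking identical local layers until no entry of the combined
# matrix is zero, i.e. every input can contribute to every output.
total = layer
depth = 1
while any(v == 0 for row in total for v in row):
    total = matmul(layer, total)
    depth += 1

print(depth)  # 4 layers are needed for n = 5 with a reach of 1
for row in total:
    print(row)
```

The non-negative weights are used so that no entry cancels to zero by accident. With real weights the connectivity argument is the same, and `total` is still just a single 5x5 linear map.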