
	by AndrewGYork 2537 days ago
	I asked this on Twitter, but maybe folks here can answer better: how important is nonlinearity for deep neural networks? This method's output seems to be a linear function of its (complex) input. Does that put important bounds on performance? https://mobile.twitter.com/AndrewGYork/status/10228414045888... https://news.ycombinator.com/item?id=17698135

2 comments

MrEldritch 2537 days ago

The short answer: Nonlinearity isn't just important for deep neural networks - nonlinearity is deep neural networks. Without a nonlinear element in between linear layers, the "deep" is meaningless - a "deep linear network" is precisely equivalent in power to a simple, one-layer linear classifier[1]. (Because if all you're doing is a bunch of linear transformations, you can't do anything you couldn't do with a single linear transformation.)

As far as I can tell, your understanding that this is just a linear function is precisely correct - which means it can't do anything that a simple linear classifier can't.

[1] I suspect that the reason this has multiple layers is because of the physical constraints of the system that prevent a single layer from being an arbitrary linear function of the inputs. The light from a specific pixel can only get effectively diffracted so far, so they need to cascade multiple layers to make sure that all the inputs can contribute to all the outputs. It still ends up being equivalent to a single linear transformation.
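A minimal sketch of the collapse argument above, assuming NumPy and random complex matrices as stand-ins for the layers. The sizes, the seed, and the helper name `random_complex_layer` are made up for illustration and are not taken from the method under discussion. Multiplying the layer matrices together once gives a single matrix that reproduces the whole stack:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_complex_layer(n_out, n_in):
    # Hypothetical stand-in for one linear layer acting on a complex input.
    return rng.normal(size=(n_out, n_in)) + 1j * rng.normal(size=(n_out, n_in))

# Five stacked linear layers with no nonlinearity in between (sizes are arbitrary).
layers = [random_complex_layer(8, 8) for _ in range(5)]

# Collapse the stack into one matrix: W5 @ W4 @ W3 @ W2 @ W1.
collapsed = np.eye(8, dtype=complex)
for W in layers:
    collapsed = W @ collapsed

# Push a random complex input through the "deep" stack, layer by layer.
x = rng.normal(size=8) + 1j * rng.normal(size=8)
deep_out = x
for W in layers:
    deep_out = W @ deep_out

# Push the same input through the single collapsed matrix.
single_out = collapsed @ x

stack_matches_single = np.allclose(deep_out, single_out)
print("deep linear stack == single linear map:", stack_matches_single)
```

The same check fails as soon as any elementwise nonlinearity is applied between the layers, since the stack can then no longer be written as one matrix product.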

londons_explore 2537 days ago

Simple examples can be constructed showing that nonlinearity is required for certain problems.
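One such example is XOR, sketched here with NumPy. This is an illustration, not something from the work being discussed, and the weights of the small ReLU network are hand-picked assumptions. The best affine least-squares fit predicts 0.5 for every input, while a single hidden layer with a ReLU reproduces the targets exactly:

```python
import numpy as np

# The four XOR inputs and their targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

# Best affine fit (weights plus bias) in the least-squares sense.
A = np.hstack([X, np.ones((4, 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_pred = A @ coef  # the fit cannot tell the two classes apart

def relu(z):
    return np.maximum(z, 0.0)

# Hand-picked two-unit hidden layer: h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1).
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
# Output layer: y = h1 - 2 * h2.
w2 = np.array([1.0, -2.0])

hidden = relu(X @ W1.T + b1)
nonlinear_pred = hidden @ w2

print("affine least-squares predictions:", linear_pred)
print("ReLU network predictions:        ", nonlinear_pred)
```

Dropping the `relu` call turns the second model back into an affine function of the inputs, which lands it in the same situation as the least-squares fit.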

There do exist nonlinear optical components, so I assume those could be used in a piece of follow-up work...

MrEldritch 2537 days ago

True, but all the nonlinear optical effects I'm aware of only really start to matter at very high intensities - so wouldn't really be applicable to the kinds of scenarios they envision, like directly feeding it images seen from ambient light.

AstralStorm 2536 days ago

Uhm, speed-of-light differences in a modified crystal lattice are constant nonlinearities that are reasonable to produce. They do not need high-intensity light, but they would need additional circuitry for scaling. Plus, the network would have to work on phase angle and not magnitude. Mostly the Kerr effect (high voltage) and cross wave polarization (e.g. given a Pockels cell) are useful there.
