Hacker News
by modeless 1604 days ago
Let me see if I can describe the laser part of the paper correctly. They made a laser pulse consisting of a bunch of different frequencies mixed together. The intensity of each frequency represents a controllable parameter of the system. They sent the pulse through a crystal that performs a complex transformation, mixing all the frequencies together in a nonlinear and noisy way. Then they measured the frequency spectrum of the output. By itself, this system performs computations of a sort, but they are not useful.

To make the computations useful, first they trained a conventional digital neural network to predict the system's outputs from the controllable input parameters. Then they arbitrarily assigned some of the controllable parameters to be the inputs of the neural network and the others to be the trainable weights. Then they used the crystal to run forward passes on the training data. After each forward pass, they used the trained regular neural network to do the reverse pass and estimate the gradients of the outputs with respect to the weights. With those gradients, they updated the weights just like in a regular neural net.

Although the gradients computed by the neural net are not a perfect match to the real gradients of the physical system (which are unknown), they don't need to be perfect. Any drift is corrected because the forward pass is always run by the real physical system, and stochastic gradient descent is naturally pretty tolerant of noise and bias.
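To make the training loop from the last two paragraphs concrete, here is a minimal toy sketch. It is not the paper's code. `physical_forward` is a made-up noisy black box standing in for the crystal, and `surrogate_grad_w` is the hand-written derivative of a deliberately imperfect model, `tanh(w * x)`, standing in for backprop through the trained digital neural net. All names, constants, and the single-weight setup are assumptions for illustration.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def physical_forward(x, w, noise=0.01):
    """Hypothetical stand-in for the experiment: a noisy nonlinear black box."""
    return math.tanh(1.2 * w * x + 0.1) + rng.gauss(0.0, noise)

def surrogate_grad_w(x, w):
    """d/dw of an imperfect digital model, tanh(w * x).

    In the real setup this would come from backpropagating through a
    neural net trained to imitate the physical system.
    """
    return (1.0 - math.tanh(w * x) ** 2) * x

def train(w_target=0.8, steps=2000, lr=0.2):
    """Fit one weight so the 'physical' output matches toy training labels."""
    w = 0.0
    for _ in range(steps):
        x = rng.uniform(-2.0, 2.0)
        # Toy label: what the system outputs at the weight we hope to recover.
        y_target = physical_forward(x, w_target, noise=0.0)
        y = physical_forward(x, w)                   # forward pass: real system
        error = y - y_target                         # loss = error ** 2
        grad = 2.0 * error * surrogate_grad_w(x, w)  # backward pass: surrogate
        w -= lr * grad                               # ordinary SGD update
    return w

w_final = train()
print(f"learned weight: {w_final:.3f} (target 0.8)")
```

The surrogate here has the wrong scale and no offset, so its gradients are biased, but the error term always comes from the "real" system, so the weight should still settle close to the target.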

Since they're just using neural nets to estimate the behavior of the physical system rather than modeling it with physics, they can use literally any physical system and the behavior of the system does not have to be known. The only requirement of the system is that it does a complex nonlinear transformation on a bunch of controllable parameters to produce a bunch of outputs. They also demonstrate using vibrations of a metal plate.

Seems like this method may not lead to huge training speedups since regular neural nets are still involved. But after training, the physical system is all you need to run inference, and that part can be super efficient.

2 comments

> They made a laser pulse consisting of a bunch of different frequencies mixed together

This is how ultrashort pulses are made: the frequency components cancel out everywhere except in a short time window. Now I'm not sure whether they are training a network to calculate the filter efficiently for even shorter pulses, or whether the purpose is an optical neural network, or both.
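A quick way to see the cancellation, as a rough sketch rather than real optics: add up equally spaced cosine components that are all in phase at t = 0. The function name `field`, the 21 modes, and the unit frequency spacing are arbitrary choices for illustration.

```python
import math

def field(t, n_modes=21):
    """Sum n_modes phase-aligned cosines at frequencies 1, 2, ..., n_modes."""
    return sum(math.cos(2.0 * math.pi * k * t) for k in range(1, n_modes + 1))

# At t = 0 every component equals 1, so they add up to n_modes.
peak = field(0.0)

# Largest magnitude over the middle of one period (t from 0.1 to 0.9),
# where the components mostly cancel each other.
off_peak = max(abs(field(i / 1000.0)) for i in range(100, 901))

print(peak, off_peak)
```

The sum is tall and narrow around t = 0 and stays small in between, and adding more frequency components makes the spike narrower.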

> regular neural net

You used these words several times, and, considering the title "physical neural networks", I kept wondering whether you meant regular as in real, or as in artificial. If it's artificial, I'm not sure which one of them is "regular": LSTM, fully connected, transformers?

I thought it was pretty clear in context that "regular neural net" was a short form of "conventional digital neural network", which I did spell out explicitly the first time.

Any type of artificial neural net could be used. LSTM, transformer, convolutional, fully connected, whatever you want.