by version_five 1604 days ago
This uses a physical system with controllable parameters to compute a forward pass and

> using a differentiable digital model, the gradient of the loss is estimated with respect to the controllable parameters.

So, e.g., they have a tunable laser that shifts the spectrum of an encoded input based on a set of parameters, and then they update those parameters using a gradient computed from a digital simulation of the laser (the physics-aware model).
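A toy sketch of that loop, in case it helps. Everything here is made up for illustration and is not the paper's code: `physical_system` is a hypothetical stand-in for the hardware (it includes a small extra term the simulation doesn't know about), `model_dtheta` is the derivative of an assumed digital model `tanh(theta * x)`, and the data, learning rate and step count are arbitrary.

```python
import math

def physical_system(x, theta):
    # Hypothetical stand-in for the real hardware. The sin term is an
    # effect that the digital model below does not capture.
    return math.tanh(theta * x) + 0.05 * math.sin(3.0 * theta * x)

def model_dtheta(x, theta):
    # Derivative w.r.t. theta of the assumed digital model tanh(theta * x).
    return x * (1.0 - math.tanh(theta * x) ** 2)

def pat_step(theta, xs, targets, lr):
    grad = 0.0
    for x, t in zip(xs, targets):
        y = physical_system(x, theta)              # forward pass: "in situ"
        dloss_dy = 2.0 * (y - t)                   # squared-error loss on the measured output
        grad += dloss_dy * model_dtheta(x, theta)  # backward pass: "in silico"
    return theta - lr * grad / len(xs)

def train(theta=0.2, steps=300, lr=0.1):
    xs = [-1.0, -0.5, 0.5, 1.0]
    # Targets are what the "hardware" produces at theta = 0.8.
    targets = [physical_system(x, 0.8) for x in xs]
    for _ in range(steps):
        theta = pat_step(theta, xs, targets, lr)
    return theta
```

The error term comes from the measured physical output, and only the derivative comes from the simulation, so in this toy case the imperfect model still steers `theta` back to the value that generated the targets.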

When I read the headline I imagined they had implemented back propagation in a physical system.

2 comments

Right,

> Here we introduce a hybrid in situ–in silico algorithm, called physics-aware training, that applies backpropagation to train controllable physical systems. Just as deep learning realizes computations with deep neural networks made from layers of mathematical functions, our approach allows us to train deep physical neural networks made from layers of controllable physical systems, even when the physical layers lack any mathematical isomorphism to conventional artificial neural network layers.

To my naive understanding, and please someone correct me if I'm wrong, the point is that they are not directly controlling the parameters that compute the NN forward pass (hence "no mathematical isomorphism to conventional NNs"), but "hyper-parameters" that guide the physical system to do so. For example, rotation angles of mirrors, or distance between filters, instead of intensity values of light. So the non-linear transformations happen in situ, while the simpler transformations in the backward pass are still computed in silico.

> When I read the headline I imagined they had implemented back propagation in a physical system

They touch on that by observing you could train a second physical neural network to compute the gradients for the first. So it could all be physical.

> Improvements to PAT could extend the utility of PNNs. For example, PAT’s backward pass could be replaced by a neural network that directly estimates parameter updates for the physical system. Implementing this ‘teacher’ neural network with a PNN would allow subsequent training to be performed without digital assistance.

So you need to use in silico training at first, but can get rid of it in deployment.