by graphene 3483 days ago
They trained on a computer model of the optical circuit, and ran only the feed-forward (inference) step on the real thing. The rationale is that models in real-world use spend much more time (and energy) on inference than on training, so that is the step you'd most want to optimize.
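
To make that split concrete, here is a rough sketch of "train against a software model, then run only the forward pass on the device". Everything in it is an assumption for illustration: `simulated_forward`, `hardware_forward` and `train_on_simulation` are hypothetical names, the "circuit" is just a dot product, and the "hardware" is the same computation plus a little noise. It is not the setup from the paper.

```python
import random

def simulated_forward(w, x):
    # Software model of the circuit. A plain dot product is an assumption
    # made here for illustration only.
    return sum(wi * xi for wi, xi in zip(w, x))

def hardware_forward(w, x, rng, noise=0.01):
    # Hypothetical stand-in for the physical device: same computation as the
    # software model, plus a small amount of Gaussian noise.
    return simulated_forward(w, x) + rng.gauss(0.0, noise)

def train_on_simulation(data, lr=0.1, epochs=200):
    # Gradient descent that only ever calls the software model.
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            err = simulated_forward(w, x) - y
            # Gradient of 0.5 * err**2 with respect to w_i is err * x_i.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

rng = random.Random(42)
true_w = [2.0, -1.0]  # made-up target weights for the toy problem
data = []
for _ in range(50):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    data.append((x, simulated_forward(true_w, x)))

# Training happens entirely in simulation...
w = train_on_simulation(data)
# ...and only this inference call touches the (pretend) hardware.
prediction = hardware_forward(w, [0.5, 0.5], rng)
```

The catch with this approach is that the learned weights are only as good as the software model's match to the device; any mismatch shows up as error at inference time.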

I can't help but think it would be really cool to automatically produce a circuit that outputs the gradient of the error of the actual NN, so you could optimize the real hardware directly instead of a model of it.