Very interesting paper. Optical circuits have always been an interesting option for computing, whether in the form of plasmonics (surface plasmons + electronics) or linear quantum computing (which uses circuits similar to those reported in the manuscript). However, translating this into any practical application involves solving a lot of engineering challenges...
Unfortunately they haven't implemented the nonlinear part optically yet, though they do at least model a fairly realistic saturable absorber.
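For anyone wondering what a saturable absorber does as an activation function, here is a minimal sketch. It assumes the common thin-absorber approximation, where the absorption drops as `alpha0 / (1 + I / I_sat)`; that is a textbook simplification and not necessarily the model used in the paper. The function names and the default values of `alpha0` and `i_sat` are made up for illustration.

```python
import math

def transmission(i_in, alpha0=2.0, i_sat=1.0):
    """Fraction of light transmitted at input intensity i_in.

    Assumed thin-absorber model: the absorption alpha0 is reduced by the
    factor 1 / (1 + i_in / i_sat), so the material bleaches (becomes more
    transparent) as the intensity rises. alpha0 and i_sat are illustrative.
    """
    if i_in < 0:
        raise ValueError("intensity must be non-negative")
    return math.exp(-alpha0 / (1.0 + i_in / i_sat))

def saturable_absorber(i_in, alpha0=2.0, i_sat=1.0):
    """Output intensity: weak signals are attenuated, strong ones pass."""
    return i_in * transmission(i_in, alpha0, i_sat)

# Transmission rises with intensity, which is the source of the nonlinearity.
for intensity in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(intensity, round(transmission(intensity), 4))
```

The shape is loosely similar to a soft threshold: low-intensity inputs are mostly absorbed and high-intensity inputs pass almost unchanged.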
They trained on a computer model of the optical circuit, and only ran the feed-forward step on the real thing. The rationale is that real-life models spend much more time (and energy) in inference mode, so that is the step you'd most want to optimize.
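As a rough illustration of that split, here is a toy sketch in which a one-parameter model is fitted entirely in software and the learned weight is then handed to a separate forward-only function standing in for the hardware. Everything in it (`simulated_forward`, `device_forward`, the 2% gain error) is an assumption made up for the example, not the paper's setup.

```python
def simulated_forward(w, x):
    """Software model of the circuit: an ideal scalar gain (toy stand-in)."""
    return w * x

def device_forward(w, x, gain_error=0.02):
    """Hypothetical real device: same gain, with an assumed fixed 2% loss."""
    return (1.0 - gain_error) * w * x

def train_in_simulation(data, lr=0.1, steps=200):
    """Gradient descent on mean squared error, using only the software model."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2.0 * (simulated_forward(w, x) - y) * x for x, y in data)
        w -= lr * grad / len(data)
    return w

# Toy data drawn from y = 3x.
data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]

# Training happens offline; the "device" only ever runs the forward pass.
w_trained = train_in_simulation(data)
device_outputs = [device_forward(w_trained, x) for x, _ in data]
```

The gap between `simulated_forward` and `device_forward` is the price of this approach: the weights are only as good as the software model of the hardware.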
I can't help but think it would be really cool to automatically produce a circuit that would output the gradient of the error of the actual NN, so you could optimize that directly.
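Short of a dedicated gradient circuit, one way to get gradients from the real device is to estimate them from forward passes alone with central finite differences. The sketch below assumes a hypothetical black-box `forward(params, x)` standing in for whatever the hardware computes; the toy linear model and data are invented for illustration.

```python
def forward(params, x):
    """Hypothetical black-box forward pass (stand-in for the optical circuit)."""
    w, b = params
    return w * x + b

def loss(params, data):
    """Mean squared error, computed using forward passes only."""
    return sum((forward(params, x) - y) ** 2 for x, y in data) / len(data)

def finite_difference_gradient(params, data, eps=1e-5):
    """Estimate d(loss)/d(param) by nudging each parameter up and down."""
    grad = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grad.append((loss(plus, data) - loss(minus, data)) / (2.0 * eps))
    return grad

# Toy data drawn from y = 2x + 1, with the gradient evaluated at w = 0, b = 0.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
grad = finite_difference_gradient([0.0, 0.0], data)
```

This costs two loss evaluations per parameter, so it scales poorly with network size, which is part of why a circuit that outputs the gradient directly would be attractive.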
I've had a short discussion with a professor at my university about the practical efficiency of circuits. It seems like some tasks are better solved with algorithms and others with circuits.
I think that a proper mix between Turing machines and circuits will be important in the future of AI.