by nicholast 1573 days ago
They cited some of LeCun's solo work from the 1980s on related matters. Note that he also collaborated with Geoffrey Hinton on a related paper from NIPS 89, in which they used random perturbations to derive a gradient signal. One of the limitations of that approach was that it required each layer to have fewer units than the preceding one.

"Gemini: Gradient Estimation Through Matrix Inversion After Noise Injection", by Yann Le Cun, Conrad C. Galland, and Geoffrey E. Hinton: https://proceedings.neurips.cc/paper/1988/file/a0a080f42e6f1...
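
For anyone unfamiliar with the general idea of getting a gradient signal out of random perturbations, here is a rough NumPy sketch. To be clear, this is not the algorithm from the paper, just a generic illustration under my own assumptions: inject small Gaussian noise into a layer's input, record how the output moves, and recover the local Jacobian with a least-squares solve. The names `estimate_jacobian`, `layer`, `W`, and `x0` are made up for the example.

```python
import numpy as np

def estimate_jacobian(f, x, n_samples=2000, sigma=1e-3, seed=0):
    """Estimate the Jacobian of f at x from noisy probes (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Small random perturbations injected into the input, one per row.
    noise = rng.normal(scale=sigma, size=(n_samples, x.size))
    base = f(x)
    # Resulting changes in the output for each perturbation.
    deltas = np.stack([f(x + n) - base for n in noise])
    # Solve noise @ J.T ~= deltas in the least-squares sense.
    jt, *_ = np.linalg.lstsq(noise, deltas, rcond=None)
    return jt.T

# Hypothetical toy layer: 3 inputs, 2 tanh units.
W = np.array([[0.5, -1.0, 0.3],
              [0.2, 0.4, -0.7]])

def layer(x):
    return np.tanh(W @ x)

x0 = np.array([0.1, -0.2, 0.3])
J_est = estimate_jacobian(layer, x0)

# Analytic Jacobian of tanh(W x), for comparison.
J_true = (1.0 - np.tanh(W @ x0) ** 2)[:, None] * W
print(np.abs(J_est - J_true).max())
```

The estimate only needs forward evaluations of the layer, which is the appeal of this family of methods. The cost is the number of noisy probes needed to get a clean estimate.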

I cited this in a recent paper as a representative example among the different uses of stochastic injections in neural networks. It's interesting to see similar lines of inquiry continuing to this day.