by thesz 2947 days ago
>This stochasticity seems important to achieving good results.

No, it is not, and it may even be counterproductive.

https://arxiv.org/pdf/1605.02026.pdf - page 8, figure 2(b). SGD-optimized neural networks stop learning at the accuracy at which whole-dataset methods start!

Also, please note that the figure I pointed to is about high-energy particle analysis. An SGD-trained NN cannot even distinguish particles with good precision, let alone discover physics.