Hacker News
by ncdr 3484 days ago
No, it is not. As the number of cores approaches infinity, the validation accuracy will approach zero, because the shared memory is not locked. There is definitely a sweet spot in the number of cores for the original code, but it does not scale indefinitely. Therefore, it cannot make use of an arbitrary number of cores.
1 comment

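Aligned float loads and stores are atomic on all architectures that matter, so a racing update can occasionally be lost, but a weight is never left half-written. Also, unsynchronized parameter updates for SGD have actually been studied in [1], where it was shown that they barely affect convergence when the updates are sparse.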

In the limit, accuracy would indeed suffer, since all updates would happen in parallel.

[1] Recht, Benjamin, et al. "Hogwild!: A lock-free approach to parallelizing stochastic gradient descent." Advances in Neural Information Processing Systems. 2011.
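
For anyone who wants to see what "unsynchronized" means here, below is a minimal sketch of the lock-free update pattern, not the code from [1] or from the original post. The toy problem, the function names (make_sparse_problem, hogwild_sgd, mse) and all the constants are made up for illustration. Note that CPython's GIL serializes the threads, so this shows the structure of the scheme rather than any real speedup.

```python
import random
import threading

def make_sparse_problem(n=200, d=50, nnz=3, seed=0):
    """Hypothetical toy data: each sample touches only `nnz` of the `d` weights."""
    rng = random.Random(seed)
    w_true = [rng.gauss(0.0, 1.0) for _ in range(d)]
    samples = []
    for _ in range(n):
        idx = rng.sample(range(d), nnz)
        vals = [rng.choice((-1.0, 1.0)) for _ in idx]
        target = sum(v * w_true[j] for j, v in zip(idx, vals))
        samples.append((idx, vals, target))
    return samples, d

def mse(w, samples):
    """Mean squared error of the linear model `w` on the samples."""
    return sum((sum(v * w[j] for j, v in zip(idx, vals)) - t) ** 2
               for idx, vals, t in samples) / len(samples)

def hogwild_sgd(samples, d, n_threads=4, steps_per_thread=2000, lr=0.2):
    """SGD for least squares where all threads write to one shared list."""
    w = [0.0] * d  # shared weights, intentionally not protected by any lock

    def worker(tid):
        rng = random.Random(1000 + tid)
        for _ in range(steps_per_thread):
            idx, vals, target = rng.choice(samples)
            # The prediction may be computed from slightly stale weights.
            err = sum(v * w[j] for j, v in zip(idx, vals)) - target
            for j, v in zip(idx, vals):
                w[j] -= lr * err * v  # unsynchronized read-modify-write

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return w
```

Because each sample only touches a few weights, two threads rarely write the same coordinate at the same time, which is the sparsity assumption the analysis in [1] relies on.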

There's another paper describing the "Hogbatch" approach that shows more precisely how adding cores affects accuracy: http://www.ece.ubc.ca/~matei/papers/ipdps16.pdf

In summary, accuracy per pass suffers slightly, but since the speedup is close to linear for the first dozen or so cores, each pass is much faster to run. The result is that the wall time to reach a given level of accuracy is much shorter, despite the slightly lower accuracy per pass.
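
To make the trade-off concrete, here is a back-of-the-envelope sketch. The numbers are invented for illustration and are not taken from the paper: assume one core needs 10 passes of 60 seconds each, while 10 cores need 12 passes (lower accuracy per pass) but run each pass roughly 10x faster.

```python
def wall_time_to_target(passes_needed, seconds_per_pass_1core, speedup):
    """Seconds to reach the target accuracy, given the per-pass speedup."""
    return passes_needed * seconds_per_pass_1core / speedup

# Hypothetical figures, chosen only to illustrate the argument.
serial = wall_time_to_target(10, 60.0, 1.0)     # 600.0 seconds
parallel = wall_time_to_target(12, 60.0, 10.0)  # 72.0 seconds
```

Even with 20% more passes, the parallel run in this made-up scenario finishes in about an eighth of the time.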