by Loranubi 1253 days ago
I mean you could even view bit flips as a regularization technique like dropout...
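To make the comparison concrete, here is a rough sketch of the two kinds of noise side by side. The helper names (`flip_random_bit`, `bit_flip_noise`, `dropout`) are made up for illustration, and it assumes float32 values and one flipped bit per corrupted value, which is not a model of how real hardware faults are distributed.

```python
import random
import struct


def flip_random_bit(x: float, rng: random.Random) -> float:
    """Flip one uniformly chosen bit in the float32 encoding of x."""
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits ^= 1 << rng.randrange(32)
    # Reinterpret the modified integer as a float32 again.
    (y,) = struct.unpack("<f", struct.pack("<I", bits))
    return y


def dropout(values, p, rng):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]


def bit_flip_noise(values, p, rng):
    """Hypothetical analogue: corrupt each value with probability p via one bit flip."""
    return [flip_random_bit(v, rng) if rng.random() < p else v for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    activations = [1.0, 0.5, -0.25, 2.0]
    print("dropout:  ", dropout(activations, 0.5, rng))
    print("bit flips:", bit_flip_noise(activations, 0.5, rng))
```

One difference the sketch shows: dropout noise is bounded (a value becomes zero or is rescaled), while a flip in a sign or exponent bit can negate a value or change it by many orders of magnitude, so the analogy only goes so far.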
1 comment

Yeah, I hear it's now common practice to skip synchronizing GPU training kernels to speed things up, and that it has a positive regularization effect with little downside.