Hacker News
by RSchaeffer 3161 days ago
IIRC, his "Bayesian interpretation of the noise" actually shows that dropout performs approximate integration over the model parameters. As he says, dropout works not because of the noise but despite it.

https://youtu.be/3ONLxYeM1Sc?t=19m21s
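
To make the "approximate integration" reading concrete, here is a toy sketch of my own, not the derivation from the talk. It assumes the claim can be read as: averaging predictions over many sampled dropout masks is a Monte Carlo estimate of an expectation over randomly masked weights. The model is a single linear unit, and the names `dropout_forward` and `mc_dropout_mean` are hypothetical helpers invented for this illustration.

```python
import random


def dropout_forward(weights, x, keep_prob, rng):
    """One stochastic forward pass of a linear unit.

    Each weight is kept with probability keep_prob and rescaled by
    1 / keep_prob (inverted dropout), otherwise it is zeroed.
    """
    total = 0.0
    for w, xi in zip(weights, x):
        if rng.random() < keep_prob:
            total += (w / keep_prob) * xi
    return total


def mc_dropout_mean(weights, x, keep_prob, n_samples, seed=0):
    """Average the output over n_samples random masks (Monte Carlo estimate)."""
    rng = random.Random(seed)
    samples = (dropout_forward(weights, x, keep_prob, rng) for _ in range(n_samples))
    return sum(samples) / n_samples


# Illustrative values only; nothing here comes from the talk.
weights = [0.5, -1.0, 2.0, 0.25]
x = [1.0, 2.0, 0.5, 4.0]

# For a linear unit, the expectation over masks equals the mask-free output.
exact = sum(w * xi for w, xi in zip(weights, x))
estimate = mc_dropout_mean(weights, x, keep_prob=0.5, n_samples=20000)

print("exact expectation:", exact)
print("Monte Carlo estimate:", estimate)
```

Any single pass is noisy, but the average over masks settles near the expectation. On this reading the quantity of interest is the average, and the per-sample noise is just the cost of estimating it by sampling. For a nonlinear network the average would no longer match the mask-free output exactly, so treat this only as an intuition pump.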

1 comment

That seems like a strange/unnecessary way to put it, because dropout is noise.