Go watch Yarin Gal's talk on dropout in neural networks. He shows pretty convincingly that the belief that dropout reduces network overfitting by introducing noise is wrong.
Wait, that can’t be wrong because that is literally what dropout does. It is a convex hull regularizer around the network activations using noise. That is also why dropout does not solve susceptibility to adversarial examples: it merely extends the regions that the NN generalizes to outward, but that is limited because high-dimensional spaces are counter-intuitively large and the noise required to cover a decent fraction of the “unmapped” space would completely prevent learning. AFAIK, Yarin Gal merely provides a Bayesian interpretation of the noise.
IIRC, his "Bayesian interpretation of the noise" actually shows that dropout performs approximate integration over model parameters. As he says, dropout doesn't work because of the noise but despite the noise.
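For anyone who wants to see what "approximate integration over model parameters" looks like in practice, here is a rough sketch of the usual recipe (often called MC dropout): keep dropout switched on at prediction time and average many stochastic forward passes. Each random mask acts like one sampled set of weights, so the average is a Monte Carlo estimate of the predictive mean and the spread is a crude uncertainty estimate. Everything here is an assumption for illustration: the toy network, its untrained random weights, and the names `forward` and `mc_dropout_predict` are made up and not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy network with fixed (untrained) random weights.
W1 = rng.normal(size=(4, 32))
W2 = rng.normal(size=(32, 1))

def forward(x, p_drop=0.5):
    """One stochastic forward pass with dropout left ON."""
    h = np.maximum(x @ W1, 0.0)             # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop    # keep each unit with prob 1 - p_drop
    h = h * mask / (1.0 - p_drop)           # inverted-dropout rescaling
    return h @ W2

def mc_dropout_predict(x, n_samples=200):
    """Average many dropout passes: a Monte Carlo estimate of the
    predictive mean, plus the per-output standard deviation."""
    samples = np.stack([forward(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.std(axis=0)

x = rng.normal(size=(3, 4))                 # three made-up inputs
mean, std = mc_dropout_predict(x)
print(mean.shape, std.shape)
```

With a trained network you would read `std` as a rough indication of where the model is unsure; with random weights as above it only demonstrates the mechanics.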