by krisoft 2554 days ago
They haven't been erased so much as subsumed. Training a model with dropout is basically equivalent to using an ensemble of deep neural networks.
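For anyone who wants to see where the "dropout is an implicit ensemble" reading comes from, here is a minimal sketch for a single linear unit. The weights, inputs, and keep probability are made-up toy values, and `linear_unit` and `average_over_masks` are hypothetical helper names, not anything from a library. It enumerates every dropout mask, averages the resulting sub-network outputs, and compares that with one pass through the full unit scaled by the keep probability.

```python
from itertools import product


def linear_unit(weights, inputs, mask=None):
    """Weighted sum of the inputs; mask[i] == 0 drops input i."""
    if mask is None:
        mask = [1] * len(inputs)
    return sum(w * x * m for w, x, m in zip(weights, inputs, mask))


def average_over_masks(weights, inputs, keep_prob):
    """Probability-weighted average of the output over all 2**n dropout masks.

    Each input is kept independently with probability keep_prob.
    """
    total = 0.0
    for mask in product([0, 1], repeat=len(inputs)):
        kept = sum(mask)
        prob = keep_prob ** kept * (1 - keep_prob) ** (len(inputs) - kept)
        total += prob * linear_unit(weights, inputs, mask)
    return total


# Toy values, chosen only for illustration.
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 3.0, -2.0]
keep_prob = 0.8

ensemble_mean = average_over_masks(weights, inputs, keep_prob)
scaled_single_pass = keep_prob * linear_unit(weights, inputs)

print(ensemble_mean, scaled_single_pass)
```

For a purely linear unit the two numbers agree up to floating-point error. Once a nonlinearity follows the sum, the match is only approximate in general, so this shows the intuition behind the claim, not that the two are the same thing for a deep network.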

That is not even close to the same thing.

If you train an ensemble of models with random dropout, you have an ensemble. Models trained with dropout will still have significant variation from run to run.

> That is not even close to the same thing.

It's a common interpretation: https://arxiv.org/abs/1706.06859

There may be a paper on it, but it’s not a common view.

In particular, the paper skipped the obvious experiment: ensembling networks that were each trained with dropout. Doing so improves performance over dropout alone.
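A rough way to see why averaging separate runs can still help: under squared error, the averaged prediction is never worse than the average member. The sketch below does not train any networks. It assumes each "run" is just the target plus independent Gaussian noise, as a stand-in for run-to-run variation, and the seed, noise level, and number of runs are arbitrary.

```python
import random


def mse(preds, targets):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


random.seed(0)  # arbitrary seed, only for reproducibility

targets = [0.0, 1.0, 2.0, 3.0]
n_runs = 5

# Assumption: each run's predictions are the targets plus independent noise.
runs = [[t + random.gauss(0, 0.5) for t in targets] for _ in range(n_runs)]

# Ensemble prediction: plain average of the runs at each data point.
ensemble = [sum(run[i] for run in runs) / n_runs for i in range(len(targets))]

member_mse = [mse(run, targets) for run in runs]
mean_member_mse = sum(member_mse) / n_runs
ensemble_mse = mse(ensemble, targets)

print(mean_member_mse, ensemble_mse)
```

The inequality `ensemble_mse <= mean_member_mse` follows from the convexity of squared error, so it holds for any set of runs. It does not say the ensemble beats the best single run, and it says nothing about how large the gain is for real dropout-trained networks.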