It's a common interpretation: https://arxiv.org/abs/1706.06859
In particular, this paper neglected to do the obvious thing: ensemble networks trained with dropout. Doing that improves performance over dropout alone.
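
For anyone who wants to see what "ensemble networks trained with dropout" means concretely, here is a rough NumPy sketch, not anything taken from the paper. Everything in it is an assumption made up for illustration: the toy XOR-style data, the one-hidden-layer net, the hyperparameters, and the helper names `train_dropout_net`, `predict_proba` and `ensemble_predict_proba`. Each member is trained with inverted dropout from a different seed, and the ensemble just averages the members' test-time probabilities.

```python
import numpy as np

def train_dropout_net(X, y, seed, hidden=16, p_drop=0.5, lr=0.5, epochs=200):
    """Train a one-hidden-layer binary classifier with inverted dropout (toy setup)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    n = len(X)
    for _ in range(epochs):
        h = np.maximum(0.0, X @ W1 + b1)                  # ReLU hidden layer
        # Inverted dropout: zero units with prob. p_drop, rescale the survivors.
        mask = (rng.random(h.shape) >= p_drop) / (1.0 - p_drop)
        hd = h * mask
        p = 1.0 / (1.0 + np.exp(-(hd @ W2 + b2)))         # sigmoid output
        g = (p - y) / n                                   # d(cross-entropy)/d(logit)
        gh = np.outer(g, W2) * mask * (h > 0)             # backprop through dropout and ReLU
        W2 -= lr * (hd.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gh)
        b1 -= lr * gh.sum(axis=0)
    return W1, b1, W2, b2

def predict_proba(params, X):
    """Test-time forward pass: dropout is switched off, so no mask is applied."""
    W1, b1, W2, b2 = params
    h = np.maximum(0.0, X @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

def ensemble_predict_proba(members, X):
    """Average the predicted probabilities of independently trained members."""
    return np.mean([predict_proba(m, X) for m in members], axis=0)

# Toy data (an assumption for illustration): XOR-like labels on 2-D Gaussian inputs.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

# Five members that differ only in the seed (initialisation and dropout masks).
members = [train_dropout_net(X, y, seed=s) for s in range(5)]
probs = ensemble_predict_proba(members, X)
```

To check the claim on a real task you would compare a single member against `ensemble_predict_proba` on held-out data; this toy version only shows the mechanics and makes no promise about the size of the gain.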