by gamegoblin
4348 days ago
A note on dropout: if your layers are relatively small (not hundreds or thousands of nodes), dropout is usually detrimental, and a more traditional regularization method such as weight decay is superior. For networks of the size Hinton et al. are working with nowadays (thousands of nodes in a layer), though, dropout works well.
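
For anyone unfamiliar with the two techniques, here is a minimal pure-Python sketch of each. The function names, the inverted-dropout rescaling, and the plain-SGD update are my own illustrative assumptions, not anything taken from a particular paper or library.

```python
import random


def dropout(activations, p_drop, rng):
    """Inverted dropout (assumed variant): zero each unit with probability
    p_drop and rescale the survivors by 1 / (1 - p_drop)."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def sgd_step_weight_decay(weights, grads, lr, decay):
    """One plain SGD step with L2 weight decay:
    w <- w - lr * (g + decay * w)."""
    return [w - lr * (g + decay * w) for w, g in zip(weights, grads)]


def surviving_units(layer_size, p_drop, rng):
    """Hypothetical helper: count how many units of a layer survive one
    dropout pass."""
    mask = dropout([1.0] * layer_size, p_drop, rng)
    return sum(1 for m in mask if m != 0.0)
```

Calling `surviving_units(10, 0.5, random.Random())` a few times shows that a 10-node layer is left with only about five active units on any given pass, whereas weight decay keeps every unit and just shrinks the weights a little on each step.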