by subtypefiddler
2052 days ago
Whilst I agree with the general sentiment, in this particular instance it has to do with the depth of network that could be trained efficiently thanks to hardware advances. LeNet was 7 layers deep, Dan's 9, VGG's 13, GoogLeNet's 22, etc. There is theory on wide networks as well (e.g. the link to Gaussian processes requires infinite width). "Deep" makes sense here.

Note that this is orthogonal to sparsity vs. density.