by jerrygenser
1291 days ago
I think the parent is partly wrong and partly right. Many rules of thumb took the last 5+ years to discover but are now quite standard. You are nitpicking on fully connected, but if we add dropout, weight initialization, and an adaptive learning rate to what they said, then we are fairly close to being able to at least get a deep architecture to overfit a toy dataset, and from there be off to the races applying it to a larger dataset.
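
To make the "overfit a toy dataset" check concrete, here is a rough sketch, not anyone's canonical recipe. It assumes NumPy, uses XOR as the toy dataset, and every name and hyperparameter in it (`he_init`, `adam_step`, 16 hidden units, dropout 0.1, learning rate 0.01) is my own pick for illustration. It combines the three ingredients above: He weight initialization, dropout, and Adam as the adaptive learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: XOR, four points (small enough that the net should memorize it).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

def he_init(fan_in, fan_out):
    # He initialization: variance 2 / fan_in, a common choice for ReLU layers.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

# One hidden layer of 16 units (an arbitrary, illustrative size).
params = {"W1": he_init(2, 16), "b1": np.zeros(16),
          "W2": he_init(16, 1), "b2": np.zeros(1)}

def forward(p, x, drop_p=0.0):
    z1 = x @ p["W1"] + p["b1"]
    h = np.maximum(z1, 0.0)  # ReLU
    if drop_p > 0.0:
        # Inverted dropout: zero units at random, rescale the survivors.
        mask = (rng.random(h.shape) >= drop_p) / (1.0 - drop_p)
    else:
        mask = np.ones_like(h)
    hd = h * mask
    prob = 1.0 / (1.0 + np.exp(-(hd @ p["W2"] + p["b2"])))  # sigmoid output
    return prob, (x, z1, mask, hd)

def bce(prob, t):
    # Mean binary cross-entropy, with a small epsilon for numerical safety.
    eps = 1e-9
    return float(-np.mean(t * np.log(prob + eps) + (1 - t) * np.log(1 - prob + eps)))

def grads(p, cache, prob, t):
    x, z1, mask, hd = cache
    dlogits = (prob - t) / len(t)        # d(mean BCE) / d(logits)
    dz1 = (dlogits @ p["W2"].T) * mask * (z1 > 0)
    return {"W1": x.T @ dz1, "b1": dz1.sum(0),
            "W2": hd.T @ dlogits, "b2": dlogits.sum(0)}

def adam_step(p, g, m, v, step, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam: per-parameter step sizes from running moments of the gradient.
    for k in p:
        m[k] = beta1 * m[k] + (1 - beta1) * g[k]
        v[k] = beta2 * v[k] + (1 - beta2) * g[k] ** 2
        m_hat = m[k] / (1 - beta1 ** step)
        v_hat = v[k] / (1 - beta2 ** step)
        p[k] -= lr * m_hat / (np.sqrt(v_hat) + eps)

m = {k: np.zeros_like(w) for k, w in params.items()}
v = {k: np.zeros_like(w) for k, w in params.items()}

initial_loss = bce(forward(params, X)[0], y)  # evaluated without dropout
for step in range(1, 2001):
    prob, cache = forward(params, X, drop_p=0.1)
    adam_step(params, grads(params, cache, prob, y), m, v, step)
final_loss = bce(forward(params, X)[0], y)    # evaluated without dropout

print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

If the training loss on something this small does not drop sharply, the problem is more likely in the wiring (gradients, initialization, learning rate) than in the data, which is why this is a useful first check before moving to a larger dataset.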