Hacker News
by WanderPanda 1291 days ago
I've recently come to the conclusion that the magic of fully connected neural networks is that almost no tricks are needed to get close to SOTA. Dense layers + ReLU + Adam = it just works.

Sorry, but this is just wrong. Using only fully connected layers would result in pretty bad performance on images, text, audio, etc., or at the very least require much more data to perform well. At least use the right type of architecture for each data modality; then I agree that the basic version won't perform much worse than SOTA in the real world.
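To put a rough number on the architecture point, here is a back-of-the-envelope parameter count for a single first layer on an image. The 224x224 RGB input, the 64 output units/channels, and the 3x3 kernel are assumptions picked for illustration, and the two layers do not compute the same thing (the dense layer emits 64 numbers, the convolution emits 64 feature maps), so treat it as a sketch of why weight sharing helps rather than a like-for-like benchmark.

```python
def dense_params(n_in, n_out):
    # One weight per (input, output) pair, plus one bias per output unit.
    return n_in * n_out + n_out


def conv2d_params(kernel, c_in, c_out):
    # One kernel x kernel filter per (input channel, output channel) pair,
    # shared across every spatial position, plus one bias per output channel.
    return kernel * kernel * c_in * c_out + c_out


# Assumed sizes, for illustration only.
height, width, channels = 224, 224, 3
units = 64

dense = dense_params(height * width * channels, units)
conv = conv2d_params(3, channels, units)

print(f"dense first layer: {dense:,} parameters")
print(f"3x3 conv first layer: {conv:,} parameters")
print(f"ratio: about {dense // conv}x")
```

Under these assumptions the dense layer needs several thousand times more weights than the convolution, which is one plausible reading of the "much more data" claim.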
I think part of the parent comment is wrong, but part is correct.

There are many rules of thumb that took the last 5+ years to discover but are now quite standard. You are nitpicking on fully connected, but if we add dropout, weight initialization, and adaptive learning rates to what they said, then we are fairly close to being able to at least get a deep architecture to overfit a toy dataset and then be off to the races applying it to a larger dataset.
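For concreteness, the "overfit a toy dataset first" sanity check can be sketched in pure Python: one dense hidden layer, ReLU, He initialization, and a hand-written Adam update. Everything specific here is an assumption for illustration (XOR as the toy dataset, 16 hidden units, the seed, the learning rate, the step count), and dropout is left out on purpose because the goal of this check is to overfit.

```python
import math
import random

random.seed(0)

# Toy dataset (XOR), chosen for illustration only.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [0.0, 1.0, 1.0, 0.0]

N_IN, N_HID = 2, 16  # assumed layer sizes


def he_init(fan_in, n):
    # He initialization: zero-mean Gaussian with std sqrt(2 / fan_in).
    return [random.gauss(0.0, math.sqrt(2.0 / fan_in)) for _ in range(n)]


# Flat parameter vector: W1 (N_HID x N_IN), b1 (N_HID), W2 (N_HID), b2 (1).
OFF_B1 = N_HID * N_IN
OFF_W2 = OFF_B1 + N_HID
OFF_B2 = OFF_W2 + N_HID
params = he_init(N_IN, N_HID * N_IN) + [0.0] * N_HID + he_init(N_HID, N_HID) + [0.0]


def loss_and_grad(p):
    """Mean squared error over the dataset and its gradient w.r.t. p."""
    grad = [0.0] * len(p)
    loss = 0.0
    for x, y in zip(X, Y):
        # Forward pass: dense -> ReLU -> dense.
        pre = [sum(p[j * N_IN + i] * x[i] for i in range(N_IN)) + p[OFF_B1 + j]
               for j in range(N_HID)]
        hid = [max(0.0, z) for z in pre]
        out = sum(p[OFF_W2 + j] * hid[j] for j in range(N_HID)) + p[OFF_B2]
        err = out - y
        loss += err * err / len(X)
        # Backward pass (chain rule through the two layers).
        d_out = 2.0 * err / len(X)
        grad[OFF_B2] += d_out
        for j in range(N_HID):
            grad[OFF_W2 + j] += d_out * hid[j]
            if pre[j] > 0.0:  # ReLU passes gradient only where it was active
                d_pre = d_out * p[OFF_W2 + j]
                grad[OFF_B1 + j] += d_pre
                for i in range(N_IN):
                    grad[j * N_IN + i] += d_pre * x[i]
    return loss, grad


def train(p, steps=2000, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Full-batch training with the Adam update rule."""
    p = list(p)
    m = [0.0] * len(p)  # running mean of gradients
    v = [0.0] * len(p)  # running mean of squared gradients
    losses = []
    for t in range(1, steps + 1):
        loss, g = loss_and_grad(p)
        losses.append(loss)
        for k in range(len(p)):
            m[k] = beta1 * m[k] + (1 - beta1) * g[k]
            v[k] = beta2 * v[k] + (1 - beta2) * g[k] * g[k]
            m_hat = m[k] / (1 - beta1 ** t)  # bias correction
            v_hat = v[k] / (1 - beta2 ** t)
            p[k] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return p, losses


trained, losses = train(params)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.6f}")
```

If the loss refuses to drop on something this small, the bug is more likely in the wiring than in the choice of architecture, which is the point of running the check before moving to a larger dataset.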

The smart money should be on research on current shortcomings that will become deal breakers when AI is fully pervasive in society. For example, addressing catastrophic forgetting seems to me to be a very profitable research aim.
Maybe I wasn't clear enough, but of course I'm not implying that you can reach SOTA on image classification with FCNNs. There are many problems where the input space is not as noisy, redundant, and structure-bearing as with images.