by imh 3924 days ago
Wouldn't this require larger datasets? That isn't always an option. I'm imagining that a smaller, more computationally efficient network could learn nearly as well with fewer data points given these heavily engineered features. Is that off base?

Basically, no. See http://karpathy.github.io/2015/05/21/rnn-effectiveness/

He gets pretty amazing results with a corpus size around 10M.

But that takes ages to train!
Well, something like Jason Weston's state-of-the-art attention-based neural sentence summarizer took ~4 days to train.

You'd easily spend that time doing manual feature engineering just to build a baseline system.