Fascinating paper, but the benchmarks seem incredibly weak. 5000 features for a bag-of-words model is nothing; these models normally have tens or hundreds of thousands of features.
True, the comparison models do a lot better with more features.
This paper looks to just show the major winning aspect of using ConvNets: they do not need many features, as the deep net learns its own representations of the training data. It is more to show that ConvNets work on more than just vision.
But architecting the pooling layers IS adding complexity to the simple input feature set. Therefore the comparison should be against state-of-the-art ML only.