by abhgh
2726 days ago
I think you're exactly right about the modularity aspect of DL; in fact I made a similar comment on this page, albeit in terms of basis functions.

I have a minor nitpick regarding this point you make: "not that anything is likely to beat RBF". Depending on the data, specialized kernels can help immensely. An easy example is sequence classification, where something like a string kernel might work really well. Or image classification, where histogram-based kernels might prove superior.

Note that sometimes you might want to measure how good a kernel is for a problem not by its prediction accuracy alone but also by the number of support vectors it needs: if the final model retains ~100% of the training data as support vectors, it is not a great model in some (subjective) sense, since it is memorizing a lot. Depending on the data, you might "beat" the RBF kernel on this aspect too.

Regarding training time, there are some interesting tricks I've come across (but have not tried out yet): [1], [2].

[1] Ensemble SVMs http://www.jmlr.org/papers/volume15/claesen14a/claesen14a.pd...

[2] SVMPath: an algorithm to fit the entire path of SVM solutions for every value of the cost parameter, with essentially the same computational cost as fitting one SVM model. http://www.jmlr.org/papers/volume5/hastie04a/hastie04a.pdf
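
To make the specialized-kernel point a bit more concrete, here is a minimal sketch of the two kinds of kernels mentioned: a k-spectrum kernel as one simple example of a string kernel, and a histogram intersection kernel as one example of a histogram-based kernel. The function names, the choice of the spectrum kernel, and the default `k=3` are illustrative assumptions, not something taken from the linked papers.

```python
from collections import Counter


def spectrum_kernel(s, t, k=3):
    """k-spectrum string kernel (illustrative): the inner product of the
    k-mer count vectors of two sequences."""
    # Count every contiguous substring of length k in each sequence.
    counts_s = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    counts_t = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    # Only k-mers present in both sequences contribute to the inner product.
    return sum(counts_s[m] * counts_t[m] for m in counts_s.keys() & counts_t.keys())


def histogram_intersection_kernel(h1, h2):
    """Histogram intersection kernel (illustrative): the sum of bin-wise
    minima of two non-negative histograms with the same number of bins."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(items, kernel):
    """Pairwise kernel values for a list of items, as a list of lists."""
    return [[kernel(a, b) for b in items] for a in items]
```

A Gram matrix built this way could then be handed to an SVM implementation that accepts precomputed kernels (for example, scikit-learn's `SVC(kernel="precomputed")`), and the fraction of training points kept as support vectors compared against an RBF baseline on the same data.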