by unishark
2244 days ago
I'd describe the area more generally as first-order optimization, which includes methods like acceleration, automatic differentiation, and stochastic approaches. Adam is just one trick for setting a hyperparameter (the step size). These methods are usable everywhere derivative-based optimization is usable. That certainly includes SVMs, though since an SVM is a shallow method you don't need much data to train it, and hence don't need a scalable optimization method (it would just be unnecessarily slow). But you certainly could do it if you somehow needed to. Here's the first hit on Google for "sgd svm": https://scikit-learn.org/stable/modules/generated/sklearn.li... The fact that you can't use first-order optimization methods for graphical models is one answer to the question of why not everyone uses graphical models. For small graphical models, though, there are deep networks that model them and are trained as usual for neural networks. I think this is still an active research area.
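To make the "SGD on an SVM" point concrete, here is a rough sketch of a linear SVM trained with plain stochastic subgradient descent on the L2-regularized hinge loss. It is not taken from the linked page and it is not scikit-learn's implementation; the names `make_toy_data` and `train_linear_svm_sgd`, the toy data, the step-size schedule, and all default values are assumptions chosen for illustration.

```python
import random


def make_toy_data(n_per_class=50, seed=0):
    """Hypothetical helper: two well-separated Gaussian blobs with labels +1 / -1."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((1, 3.0), (-1, -3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 0.5), rng.gauss(center, 0.5)])
            y.append(label)
    return X, y


def train_linear_svm_sgd(X, y, lam=0.01, eta0=0.1, epochs=20, seed=0):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))) one sample at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            t += 1
            eta = eta0 / (1.0 + eta0 * lam * t)  # assumed decaying step size
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Gradient step on the regularizer: shrink w toward zero.
            w = [wj * (1.0 - eta * lam) for wj in w]
            # Subgradient step on the hinge term, only if the margin is violated.
            if y[i] * score < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    """Sign of the decision function, as +1 or -1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


X, y = make_toy_data()
w, b = train_linear_svm_sgd(X, y)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

Each update touches a single sample, which is what makes this approach scale to large datasets. For a small problem like this one, a dedicated SVM solver would be the more usual choice.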