| HN Mirror



	by dkshdkjshdk 1808 days ago
	Why? No one uses SVMs as a solver/optimization method (though you do need a solver/optimization method to train an SVM). Same with "modern deep learning" (whatever that may be): just because you need to optimize something doesn't make the field "optimization". Just because I'm using stochastic gradient descent (or some other optimization method) in the course of my work doesn't mean that I'm working in the field of Optimization.

1 comments

eachro 1808 days ago

To really understand it (primal, dual formulations) you need tools from convex optimization. So it doesn't really feel appropriate to teach it in a standard machine learning class (unless you just toss out the details). In optimization classes, you go through tons of different applications of the methods you learn about: SVM slots in perfectly there. It hits duality, quadratic programming, even gradient descent (Pegasos).
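As a rough illustration of the gradient descent angle, here is a minimal Pegasos-style sketch: stochastic subgradient descent on the primal objective (L2 regularizer plus hinge loss) with step size 1/(lambda * t). The function names, the toy data, and the `lam`/`n_iters` values are made up for illustration. It leaves out the optional projection step and handles the bias by appending a constant feature, so treat it as a sketch rather than a faithful implementation.

```python
import random


def pegasos_train(X, y, lam=0.01, n_iters=5000, seed=0):
    """Pegasos-style stochastic subgradient descent for a linear SVM.

    X: list of feature lists, y: list of labels in {-1, +1}.
    Minimizes lam/2 * ||w||^2 + mean(hinge loss) over the samples.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))      # pick one training example at random
        eta = 1.0 / (lam * t)          # decaying step size
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Subgradient of the regularizer: shrink w.
        w = [(1.0 - eta * lam) * wj for wj in w]
        # Subgradient of the hinge loss: only active when the margin is violated.
        if margin < 1.0:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Hypothetical toy data: two separable clusters, last feature is a constant bias term.
X = [[2.0, 2.0, 1.0], [3.0, 1.0, 1.0], [2.5, 3.0, 1.0],
     [-2.0, -1.0, 1.0], [-3.0, -2.0, 1.0], [-1.0, -3.0, 1.0]]
y = [1, 1, 1, -1, -1, -1]

w = pegasos_train(X, y)
print([predict(w, x) for x in X])
```

Note that nothing here touches the dual: the whole thing is a few lines of subgradient updates on the primal.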

Re deep learning: right, I was bringing up deep learning as a clear example of why you might not want to classify it under optimization. No one considers applications of deep learning to be optimization. However, work on the various optimizers (Adam, AdaGrad, second-order methods, etc.), which are all fundamental to doing any deep learning work, would be firmly in the field of optimization.


dkshdkjshdk 1808 days ago

> To really understand it (primal, dual formulations) you need tools from convex optimization. So it doesn't really feel appropriate to teach it in a standard machine learning class (unless you just toss out the details).

Sure. But to understand it, you probably also need to know a bit about arithmetic, algebra, geometry, etc. Still, you wouldn't say that SVMs belong to these fields, even though these fields are probably a requirement if you want to understand SVMs.

> So it doesn't really feel appropriate to teach it in a standard machine learning class (unless you just toss out the details).

If the people you are talking about have already had an optimization class (including convex optimization), then it should be appropriate to teach it using those formalisms, no?

Another example: you're not going to get far in understanding Schroedinger's equation if you don't have the necessary linear algebra background. Does that make Schroedinger's equation part of linear algebra?

> It hits duality, quadratic programming, even gradient descent (Pegasos).

Sure... then it's a machine learning topic that is good for refreshing your knowledge of optimization and linear algebra. It still feels kinda weird to introduce people to SVMs in the context of an Optimization class (other than possibly as an example of a specific optimization problem, or as an application of specific optimization methods).

> However, work on the various optimizers (Adam, AdaGrad, second-order methods, etc.), which are all fundamental to doing any deep learning work, would be firmly in the field of optimization.

Exactly. If you're doing that, then you are doing research in Optimization, and not research in "deep learning", as far as I'm concerned. But, let's face it... those types of papers are a minority in the field.


srean 1808 days ago

> … you need tools from convex optimization. So it doesn't feel appropriate to teach it in a standard machine learning class.

Those topics were prerequisites for taking ML at the grad level when I took it. You either had to have the relevant courses in your bag or convince the prof that you could handle it.


eachro 1808 days ago

Yeah, grad level absolutely (pretty much anything can fly at the grad level). Undergrad? Maybe we should teach it b/c of its historical importance to the field and to how the community developed, but I really do think most ML classes would be better off without it b/c of the extra background you'd have to spend precious time on. Kernel PCA and kernel regression are better for demonstrating the power of kernels.

I suppose the idea of a maximum-margin separating hyperplane is kind of unique to SVMs, and if you just teach SVMs through the primal and leave it at that, you don't need to spend all that much time motivating the dual.
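For reference, the primal and dual being discussed are usually written roughly as below. This is the standard soft-margin textbook form; the symbols $C$, $\xi_i$ and $\alpha_i$ are the usual notation and not something defined earlier in this thread.

```latex
% Primal (soft-margin SVM): maximize the margin, penalize violations via slack \xi_i
\min_{w,\,b,\,\xi} \quad \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0

% Dual: a quadratic program in the multipliers \alpha_i
\max_{\alpha} \quad \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0
```

The primal alone is enough to state the maximum-margin idea. The dual is where the data only appears through the inner products $x_i^\top x_j$, which is what makes the kernel substitution possible.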
