by tgeery 3289 days ago
Thank you. This was an amazing explanation. I am new to SVMs and had not made the connection that margin points (observations lying on the margin of the hyperplane) become your support vectors. This makes a lot more sense.

And if I am following correctly, it would make sense that the final step would then be:

We would maximize the dot product of a new observation with the support vectors to determine its classification (red or blue)

1 comment

During the learning phase of the SVM, you try to find a hyperplane that maximizes the margin.

The decision function of an SVM can be written as:

  f(x) = sign(sum alpha_sv y_sv k(x, sv))
where "sum" is the sum over all support vectors "sv", "y_sv" is the class of the support vector (red=1, blue=-1, for example), and "alpha_sv" is the result of the optimization in the learning phase (it is equal to zero for a point that is not a support vector, and is positive otherwise).

The decision function is a sum over all support vectors, weighted by the "k" function (which can thus be seen as a similarity function between 2 points in your kernel space). The y_sv makes each term positive or negative depending on the class of the support vector. You take the sign of this sum (1 -> red, -1 -> blue, in our example), and it gives you the predicted class of your sample.
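
To make that concrete, here is a rough sketch of the decision function in plain Python. Everything in it is an assumption for illustration: the RBF kernel and its gamma, the helper names rbf_kernel and decide, and the support vectors and alpha values, which are made up rather than taken from a real training run. Like the formula above, it leaves out the bias term that most SVM formulations add to the sum, and it arbitrarily treats a score of exactly 0 as red.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel: a similarity that decays with squared distance."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def decide(x, support_vectors, labels, alphas, kernel=rbf_kernel):
    """Return 1 (red) or -1 (blue) from sign(sum alpha_sv * y_sv * k(x, sv))."""
    score = sum(alpha * y * kernel(x, sv)
                for sv, y, alpha in zip(support_vectors, labels, alphas))
    # Assumption: a score of exactly 0 is assigned to the red class.
    return 1 if score >= 0 else -1

# Made-up values for illustration only (not learned from data).
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]      # red = 1, blue = -1
alphas = [1.0, 1.0]   # positive for every support vector

pred_near_red = decide((0.9, 1.2), support_vectors, labels, alphas)
pred_near_blue = decide((-1.1, -0.8), support_vectors, labels, alphas)
```

Note that nothing is maximized at prediction time: the new point is only compared to each support vector through k, and the signed, weighted similarities are added up.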