by jimbokun 1261 days ago
> SVM, PCA, kNN, k-means clustering

Are these still relevant in the age of Deep Neural Networks?

5 comments

Yes. There are all kinds of tasks where the appropriate solution is to use a DNN for much of the learning (either learning the correlations directly, or via transfer learning from some large-data self-supervised task) and then apply these methods to the output of that DNN inference. For example, you might apply PCA to interpret the resulting vector, or to separate out specific dimensions and expose them for adjustment in some generative task; or the best way to make the final decision might be a kNN on top of the DNN output.
It's not on your list, but decision trees still outperform DNNs on many tabular problems and can be trained faster.
Also boosting.
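
To make the "PCA on the DNN output" idea above a bit more concrete, here is a minimal sketch in plain Python. It finds the first principal component by power iteration on the covariance matrix, which is just one simple way to do it. The `vectors` are made up and stand in for real DNN outputs, and `first_principal_component` and `project` are hypothetical helper names, not library functions.

```python
import math

def first_principal_component(rows, iters=200):
    """Top eigenvector of the sample covariance matrix via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d  # arbitrary starting direction
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v, centered

def project(centered, v):
    """Coordinate of each centered row along direction v."""
    return [sum(c[j] * v[j] for j in range(len(v))) for c in centered]

# Made-up 3-d vectors standing in for DNN outputs (assumption for illustration).
# Almost all of the variation lies along the direction (1, 2, 0).
vectors = [
    (t, 2 * t + 0.1 * (-1) ** i, 0.05 * (i % 3 - 1))
    for i, t in enumerate([-3, -2, -1, 0, 1, 2, 3])
]

pc, centered = first_principal_component(vectors)
scores = project(centered, pc)  # one number per vector instead of three
```

In practice you would probably reach for a library implementation; the point is only that the reduced `scores` are something a person can plot and inspect.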

But yes, these algorithms are the basis of a lot of more modern ones.

A deep NN won't do unsupervised clustering on its own, for example, and NNs often perform worse than simpler models on small datasets.

Yes.

Different problems require different solutions.

Sometimes, an NN would be overkill.

And stakeholders in many situations would like insight into why the prediction is what it is. NNs are miles behind LogReg in terms of interpretability.
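
As a rough illustration of the interpretability point, here is a tiny logistic regression trained with plain batch gradient descent. Everything in it is made up for the sketch: the data, the feature names `signal` and `noise`, and the `train_logreg` helper. The thing to notice is that the learned weights can be read off directly, one per feature.

```python
import math

def train_logreg(rows, labels, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log loss; returns (weights, bias)."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b

# Hypothetical data: the label depends only on the sign of `signal`;
# `noise` is balanced within every value of `signal`, so it carries no information.
rows = [(signal, noise) for signal in (-2, -1, 1, 2) for noise in (-1, 1)]
labels = [1 if signal > 0 else 0 for signal, _ in rows]

weights, bias = train_logreg(rows, labels)
report = dict(zip(["signal", "noise"], weights))
```

Here `report["signal"]` comes out clearly positive and `report["noise"]` stays at essentially zero, which is the kind of statement you can put in front of a stakeholder.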

PCA is a foundational dimensionality reduction technique, and kNN can be used in conjunction with embeddings.
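
A minimal sketch of kNN over embeddings, in plain Python. The 3-d vectors and the "cat"/"dog" labels are invented for illustration and stand in for the output of whatever embedding model you use; `knn_predict` is a hypothetical helper, not a library call.

```python
import math
from collections import Counter

def knn_predict(query, embeddings, labels, k=3):
    """Majority vote among the k stored embeddings nearest to query (Euclidean)."""
    nearest = sorted(
        (math.dist(query, emb), label) for emb, label in zip(embeddings, labels)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up embeddings (assumption): two loose groups in 3-d space.
embeddings = [
    (0.9, 0.1, 0.0), (0.8, 0.2, 0.1), (1.0, 0.0, 0.1),
    (0.1, 0.9, 0.8), (0.0, 1.0, 0.9), (0.2, 0.8, 1.0),
]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

prediction = knn_predict((0.85, 0.15, 0.05), embeddings, labels)
```

There is no training step: adding a new labeled example is just appending to the two lists, which is a large part of why this pairing is popular.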

k-means is still great when you have prior/domain knowledge about the number of groups.

K-means is pretty poor when the clusters are not linearly separable, but it is the basis of a lot of more modern clustering techniques (kernel K-means if you have prior knowledge, spectral clustering...)
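
Both k-means points above can be seen in a small sketch: plain Lloyd's algorithm with k = 2, run once on two well-separated blobs and once on two concentric rings. The data, the choice of initial centroids, and the helper names (`kmeans`, `agreement`, `ring`) are all assumptions made for illustration.

```python
import math

def kmeans(points, init_centroids, iters=50):
    """Plain Lloyd's algorithm; returns (assignments, centroids)."""
    centroids = [tuple(c) for c in init_centroids]
    assign = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid alone
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assign, centroids

def agreement(assign, truth):
    """Fraction of points matching the true two-group labels, up to label swap."""
    same = sum(a == t for a, t in zip(assign, truth)) / len(truth)
    return max(same, 1 - same)

def ring(radius, n=40):
    return [
        (radius * math.cos(2 * math.pi * i / n), radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]

# Case 1: two compact, well-separated blobs, and we know k = 2 in advance.
blobs = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
blob_assign, _ = kmeans(blobs, [blobs[0], blobs[-1]])
blob_score = agreement(blob_assign, [0] * 4 + [1] * 4)

# Case 2: two concentric rings, which no straight line can separate.
rings = ring(1.0) + ring(5.0)
ring_assign, _ = kmeans(rings, [rings[0], rings[-1]])
ring_score = agreement(ring_assign, [0] * 40 + [1] * 40)
```

With two centroids the boundary between clusters is always a straight line, so the blobs are recovered exactly while the rings cannot be; that gap is what kernel k-means and spectral clustering are meant to close.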