by freeradical13 2603 days ago
Why did they need machine learning? It seems from Figures 2B-G that there's a clear cutoff.

"Moreover, to create a classifier for ME/CFS patients capable of identifying new patients, required for a robust diagnostic tool, we developed a trained kernel Support Vector Machine (SVM), a supervised machine-learning algorithm, using our experimental data. To classify new patients based on whether they fall to the right of the decision boundary, we initially selected the two features with the largest significance: change from the baseline to the plateau and change from the minimum to the plateau for the in-phase components of the impedance. Using these features, a cubic polynomial kernel SVM was able to classify the two populations, although the two features are highly correlated, as shown in Fig.2H."

https://www.pnas.org/content/pnas/early/2019/04/24/190127411...

2 comments

Nothing wrong with an SVM. How else would they create a decision boundary for classifying patients? The choice of the polynomial kernel is interesting, but I don't think it causes any issues given the data.
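
For anyone curious what a cubic polynomial kernel actually computes, here is a minimal sketch. It assumes the homogeneous form K(x, z) = (x . z)^3 on two features; the paper excerpt does not state the kernel's offset or scaling, so that choice is an assumption, and the function names and sample points are made up for illustration. It shows that the kernel value equals an ordinary dot product in a space of degree-3 terms, which is why the decision boundary can curve in the original two features.

```python
import math

def cubic_kernel(x, z):
    """Homogeneous cubic polynomial kernel: K(x, z) = (x . z)^3."""
    dot = sum(a * b for a, b in zip(x, z))
    return dot ** 3

def cubic_feature_map(x):
    """Explicit degree-3 feature map for a 2-D input.

    The dot product of two mapped points equals cubic_kernel on the
    original points (the sqrt(3) factors come from the binomial expansion).
    """
    x1, x2 = x
    r3 = math.sqrt(3.0)
    return (x1 ** 3, r3 * x1 ** 2 * x2, r3 * x1 * x2 ** 2, x2 ** 3)

# Made-up 2-D points, not taken from the paper.
x = (1.0, 2.0)
z = (3.0, 0.5)

implicit = cubic_kernel(x, z)
explicit = sum(a * b for a, b in zip(cubic_feature_map(x), cubic_feature_map(z)))
print(implicit, explicit)
```

The SVM only ever needs the kernel values, so it never has to build the four-dimensional features explicitly.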
I see, so basically instead of intuiting a simple threshold (e.g. >X% change), they apply an SVM, which is able to discover more accurate thresholds (and error ranges). Do you have any suggested resources for learning more about SVMs?

I guess my question comes from the observation that these advanced statistical techniques such as machine learning haven't been around for long and yet medicine has often created decision boundaries, presumably just looking at the data and making a reasonable cutoff. Is all the extra effort in a case like this worth the time investment?
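
To connect the "simple threshold" idea to what an SVM does, here is a rough sketch for the one-feature case. With a single feature and two groups that do not overlap, a hard-margin linear SVM places its boundary at the midpoint of the gap between the groups, so the "principled" cutoff can be computed directly. The numbers and the function names are hypothetical, invented for illustration, and not from the paper.

```python
def max_margin_threshold(low_group, high_group):
    """Midpoint of the gap between two separable 1-D groups.

    For separable 1-D data this is where a hard-margin linear SVM
    would put its decision boundary; the margin is half the gap.
    """
    top_of_low = max(low_group)
    bottom_of_high = min(high_group)
    if top_of_low >= bottom_of_high:
        raise ValueError("groups overlap; no separating threshold exists")
    threshold = (top_of_low + bottom_of_high) / 2.0
    margin = (bottom_of_high - top_of_low) / 2.0
    return threshold, margin

# Hypothetical percent-change values, invented for illustration only.
controls = [2.0, 5.0, 8.0, 11.0]
patients = [35.0, 42.0, 50.0, 61.0]

threshold, margin = max_margin_threshold(controls, patients)

def classify(value):
    """Label a new measurement by which side of the threshold it falls on."""
    return "patient" if value > threshold else "control"

print(threshold, margin, classify(30.0))
```

With one clean feature the SVM adds little over eyeballing a cutoff; the extra machinery matters once there are several features or the groups overlap.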

Learning about them: https://en.wikipedia.org/wiki/Support-vector_machine.

That will tell you SVMs are ancient (linear version dates back to 1963), and that what they do here isn’t really machine learning, but something similar to linear regression: just as linear regression finds the best (in some strict mathematical sense) line describing a set of points, this finds the best (in a similar mathematical sense) line splitting two sets of points.
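
As a rough illustration of "finding the best line splitting two sets of points", here is a minimal sketch of a linear soft-margin SVM trained by plain subgradient descent on the hinge loss. This is a teaching sketch, not how libsvm or the paper's authors solve the problem (real solvers use quadratic programming or SMO-style methods); the data, function names, and hyperparameters are all made up.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=5000):
    """Full-batch subgradient descent on
    lam/2 * ||w||^2 + mean(max(0, 1 - y * (w . x + b)))."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        # Gradient of the regularization term.
        grad_w = [lam * w[0], lam * w[1]]
        grad_b = 0.0
        for (x1, x2), y in zip(points, labels):
            # Only points inside the margin (or misclassified) contribute.
            if y * (w[0] * x1 + w[1] * x2 + b) < 1.0:
                grad_w[0] -= y * x1 / n
                grad_w[1] -= y * x2 / n
                grad_b -= y / n
        w[0] -= lr * grad_w[0]
        w[1] -= lr * grad_w[1]
        b -= lr * grad_b
    return w, b

def predict(w, b, point):
    """Return +1 or -1 depending on the side of the line w . x + b = 0."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1

# Made-up, linearly separable 2-D data.
negatives = [(1.0, 1.0), (1.5, 0.5), (0.5, 1.5), (2.0, 1.0)]
positives = [(4.0, 4.0), (4.5, 3.5), (3.5, 4.5), (3.0, 4.0)]
points = negatives + positives
labels = [-1] * len(negatives) + [1] * len(positives)

w, b = train_linear_svm(points, labels)
print(w, b, [predict(w, b, p) for p in points])
```

The analogy with linear regression is in the shape of the problem: both minimize a convex loss over the parameters of a line, just with a different loss.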

For software, take a look at https://www.csie.ntu.edu.tw/~cjlin/libsvm/. Easy to use, fairly flexible, with a Java applet you can play with.

SVMs are as old-school ML as they get. They guarantee the maximum separation (margin) at the decision boundary. However, they don't scale very well to higher-dimensional data. The standard approach used to be to preprocess with a dimensionality reduction technique like PCA before feeding the data into the SVM.
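
A minimal sketch of that PCA preprocessing step, assuming just two highly correlated features (as the paper excerpt describes) that get collapsed onto their first principal component before classification. It uses power iteration on the 2x2 covariance matrix in plain Python; the data and function names are invented for illustration, and a real pipeline would normally use a library implementation.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, direction) of the first principal component of 2-D rows."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    centered = [(r[0] - mean_x, r[1] - mean_y) for r in rows]
    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    # Power iteration: repeatedly multiply by the covariance and renormalize.
    v = (1.0, 0.0)
    for _ in range(iterations):
        vx = cxx * v[0] + cxy * v[1]
        vy = cxy * v[0] + cyy * v[1]
        norm = math.hypot(vx, vy)
        v = (vx / norm, vy / norm)
    return (mean_x, mean_y), v

def project(row, mean, direction):
    """1-D coordinate of a row along the principal direction."""
    return (row[0] - mean[0]) * direction[0] + (row[1] - mean[1]) * direction[1]

# Made-up, strongly correlated feature pairs.
rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1), (6.0, 5.9)]

mean, direction = first_principal_component(rows)
reduced = [project(r, mean, direction) for r in rows]
print(direction, reduced)
```

The one-dimensional `reduced` values would then be what gets handed to the classifier instead of the original pair of features.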

This is all before deep learning.

Exactly. Perhaps the paper could have given a clearer message if the abstract had characterized SVMs as a quadratic optimization technique instead of as machine learning?
SVMs have smooth decision boundaries unlike neural networks.