| HN Mirror


by diffeomorphism 248 days ago

94% accuracy sounds extremely bad, no?

https://www.ssph-journal.org/journals/public-health-reviews/...

> Prevalence estimated (...) 2%–3.5% in primarily non-hospitalized children.

So a fake test that always says "No" would be more accurate, at 96.5% to 98% accuracy depending on where the prevalence falls in that range.

4 comments

Zak 248 days ago

The sensitivity of such a test would be 0. This test had a sensitivity of 91% versus 61% for the glass slide count method, which is a large improvement.

The sample size is pretty small here and the control group even smaller. The paper concludes that a larger study is necessary to confirm the result.


diffeomorphism 248 days ago

That is exactly why I gave that example. Why does the headline focus on accuracy then?


porridgeraisin 248 days ago

To get you to click on it :-)


resoluteteeth 248 days ago

If you read the actual link, I don't think they're saying that using it as a covid test with some specific microclot threshold has 94% accuracy, just that the raw microclot count has 94% accuracy.

The title on HN, which implies that, seems to be inaccurate, and it's not the original title of the article.


diffeomorphism 248 days ago

No, that does not seem to be what they are saying.

> We evaluated the diagnostic power of the device in a cohort of 45 LC patients and 14 healthy pediatric donors. We estimated a 94% accuracy for the microclot count using the devices, significantly higher than the traditional counting of microclots on slides (66% accuracy).

They are comparing the predictive power and using accuracy (instead of sensitivity, recall, F1, etc.). For their method "using the devices", they report 94% accuracy for the predictive power, not for the count. For the previous method they say the accuracy is 66%.

Basic questions: Is accuracy even a good metric for this? Is 94% a good value or just the difference between bad and very bad?

It might very well be that their improvement is from bad to really good, but the point is that a raw stat of "94% accuracy" is useless without context and so is the headline.


resoluteteeth 248 days ago

OK, I looked at the actual paper. The 94% is the 0.94 area under the receiver operating characteristic (ROC) curve (the plot of the true positive rate (TPR) against the false positive rate (FPR) at each threshold setting), not the accuracy of a specific binary result (e.g. at a specific arbitrary threshold).

See https://www.sciencedirect.com/science/article/pii/S155608641...

> In general, an AUC of 0.5 suggests no discrimination (i.e., ability to diagnose patients with and without the disease or condition based on the test), 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 is considered excellent, and more than 0.9 is considered outstanding

So .94 is actually extremely good.
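For anyone unsure what an AUC measures, here is a minimal sketch. It uses the rank interpretation: the AUC equals the probability that a randomly chosen patient scores higher than a randomly chosen control, with ties counting half. The counts below are invented for illustration and are not data from the paper; `roc_auc` is a hypothetical helper name, not the paper's method.

```python
from itertools import product


def roc_auc(pos_scores, neg_scores):
    """AUC as P(random positive outscores random negative); ties count half."""
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical microclot counts, made up for this example.
patients = [12, 15, 9, 20, 7]
controls = [3, 5, 8, 2]

auc = roc_auc(patients, controls)
print(f"AUC = {auc:.2f}")
```

Note that no threshold appears anywhere in the calculation, which is why an AUC is not the same thing as the accuracy of a yes/no test.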


shawabawa3 248 days ago

Accuracy is a nonsense word in this context

Tests have a sensitivity (1 - false negative rate) and a specificity (1 - false positive rate).

"Accuracy" usually refers to sensitivity. If specificity is near 100% and the test is cheap/fast even low sensitivity can be good

On the other hand you could have sensitivity of 100% but the test could be useless if specificity is low and the condition is rare
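A rough sketch of that last case, using Bayes' rule to get the positive predictive value (the share of positive results that are true positives). The 2% prevalence is the low end of the range quoted at the top of the thread; the 80% specificity is an assumed number chosen only for illustration.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive test results that are true positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Perfect sensitivity, assumed 80% specificity, 2% prevalence.
ppv = positive_predictive_value(sensitivity=1.0, specificity=0.80, prevalence=0.02)
print(f"PPV = {ppv:.1%}")
```

Under these assumptions fewer than one in ten positive results would be a real case, even though the test never misses anyone.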


diffeomorphism 248 days ago

No, it is a well defined term in this context and does not refer to sensitivity.

https://pmc.ncbi.nlm.nih.gov/articles/PMC4614595/#:~:text=Ac...

That is exactly why I gave the trivial example of an "always No" test. It has perfect specificity (zero false positives) and an accuracy equal to 1 minus the prevalence. The sensitivity is zero, however, which is the point.
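The arithmetic for the "always No" test can be sketched from a confusion matrix. This assumes a hypothetical population of 1000 children at the 3.5% upper prevalence estimate quoted above, so 35 true cases and 965 non-cases; `metrics` is just an illustrative helper.

```python
def metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# "Always No": every one of the 35 cases is missed, every non-case is correct.
always_no = metrics(tp=0, fp=0, tn=965, fn=35)
for name, value in always_no.items():
    print(f"{name}: {value:.1%}")
```

So the three numbers tell very different stories about the same useless test.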


resoluteteeth 248 days ago

The paper explains what it actually means, so it's not nonsense. See my other comment: https://news.ycombinator.com/item?id=45558941. It's the area under the receiver operating characteristic (ROC) curve, and 94% is extremely good.


mouse_ 248 days ago

Sample size of 59 also seems worse than useless. I'm no researcher, so maybe there's something I'm missing here, but it doesn't seem very good.

Junk science?


mapontosevenths 248 days ago

It's just an early study, not junk.

The primary conclusion of this research was basically just "this looks like it would be worth doing more research on." Which is a fair conclusion for a study this small.
