by SkyPuncher 2295 days ago
With all due respect, this comment shows a lack of understanding about how health professionals assess the quality of diagnostics.

My wife is a doctor (and I've learned a lot from her). In med school, they're specifically taught to evaluate diagnostics on their specificity and sensitivity, which essentially cover false positives and false negatives. If you hear a doctor talk about the "accuracy" of a test, it's likely because they're simplifying the concepts.

"Error rate" or "accuracy" is not used at the scientific level in medicine, partly for the reason you described: it doesn't convey enough information about the outcome of the test.

A "99% accurate test" is pretty meaningless without understanding the specificity and/or sensitivity components. In fact, I've seen some headlines where they incorrectly refer to only one component as the "accuracy".
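To illustrate why a bare accuracy figure says so little, here is a minimal sketch. The population size, the 1-in-1000 prevalence, and the "test" that always answers negative are all hypothetical numbers chosen for illustration, and `confusion_rates` is just a helper name I made up.

```python
def confusion_rates(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # share of sick people the test catches
    specificity = tn / (tn + fp)  # share of healthy people the test clears
    return accuracy, sensitivity, specificity

# Hypothetical scenario: 100,000 people, 1 in 1000 is sick,
# and a useless "test" that reports negative for everyone.
population = 100_000
sick = population // 1000
healthy = population - sick

acc, sens, spec = confusion_rates(tp=0, fp=0, tn=healthy, fn=sick)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Under these assumptions the accuracy comes out at 99.9% even though the test never finds a single sick person, which is why the two components have to be reported separately.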

2 comments

The specificity (true negative rate) and sensitivity (true positive rate) do not solve the problem I am describing.

If something is rare, it has a low base rate. That means even a test with excellent specificity and sensitivity could still be wrong most of the time when it comes back positive.

Decisions on test accuracy simply cannot be made coherently when ignoring the base rate. To give an intuitive example, suppose that one in a thousand people have a disease. A test for the disease has 99% specificity and 100% sensitivity. It will always correctly give a positive result if the person has the disease, and it has a 99% chance of correctly giving a negative result if the person does not. Pretty good, much better than most tests.

Now test everyone. A person without the disease has a 1% chance of a positive result. So 1/1000 of people will get true positive results, but (999/1000 * 0.01) ~ 1% of people will get false positives.

Thus, for a given person with a positive result, it is nearly 10x more likely to be erroneous than accurate! As I said, the frequentist techniques that you describe and that are taught in medical schools do not help with this.
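The arithmetic above can be sketched with Bayes' rule. This assumes the 1% false-positive rate used in the calculation (i.e. 99% specificity), 100% sensitivity, and a 1/1000 base rate; the function name is my own, not a standard API.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive test), via Bayes' rule."""
    true_pos = prevalence * sensitivity               # sick and flagged
    false_pos = (1 - prevalence) * (1 - specificity)  # healthy but flagged
    return true_pos / (true_pos + false_pos)

ppv = positive_predictive_value(prevalence=0.001, sensitivity=1.0, specificity=0.99)
false_to_true = (1 - ppv) / ppv  # false positives per true positive

print(f"chance a positive result is real: {ppv:.1%}")
print(f"false positives per true positive: {false_to_true:.2f}")
```

With these inputs only about 9% of positives are real, i.e. roughly ten false positives for every true one. Raising the prevalence argument shows how quickly that changes, which is the whole point about base rates.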

Yet this is endemic in medicine. This sort of thing is why in a recent meta-study of 54 landmark cancer trials, only six could be replicated. That is frankly terrifying.

I get bored by more esoteric statistics terms in epidemiology, but accuracy has a simple enough mathematical formula: https://www.lexjansen.com/nesug/nesug10/hl/hl07.pdf

(True positives + True Negatives) / number of all tested

A similar concept comes up when measuring the accuracy of computerized image segmentation, where you ignore the true negatives:

true positive / (true positive + false positive + false negative)

where it is called Intersection over Union (IoU).
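Both formulas are easy to rebuild from the four counts. A minimal sketch follows; the pixel counts are made up for illustration (a small object in a 10,000-pixel mask), not taken from any real dataset.

```python
def accuracy(tp, fp, tn, fn):
    """(true positives + true negatives) / number of all tested."""
    return (tp + tn) / (tp + fp + tn + fn)

def iou(tp, fp, fn):
    """Intersection over Union: like accuracy, but ignoring true negatives."""
    return tp / (tp + fp + fn)

# Hypothetical segmentation result: most pixels are background (true negatives).
tp, fp, fn, tn = 80, 20, 20, 9880

acc_val = accuracy(tp, fp, tn, fn)
iou_val = iou(tp, fp, fn)
print(f"accuracy={acc_val:.3f} IoU={iou_val:.3f}")
```

With these made-up counts accuracy is 0.996 while IoU is about 0.667, because the huge pile of background pixels inflates the first number and is left out of the second.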

I can’t ever remember the names, and just rebuild whatever metric I care about in terms of true vs false and positive vs negative.

Applying all this to the real world is tough because of the overfitting problem. Even if you got the test to be 100% accurate in your tested population, it doesn’t mean it won’t be wrong on the next person it tests. Generalization is hard. So doctors have to guess based on their understanding of the tested and untested populations and the sensitivity and specificity of the test. You can go meta and give the doctor a sensitivity and specificity as well.