Hacker News
by gwd 2232 days ago
Let me give it a try. Suppose we have 100,000 people in a statistically representative town.

If 1% of people have had COVID-19, then that's 1000 people who have had it, and 99,000 people who haven't.

The test has a sensitivity of 100%, which means all 1000 people who've had it will test positive.

The test has a specificity of 99.9%, which means 98,901 of the 99,000 people who haven't had it will test negative; but that leaves 99 people who haven't had it, but test positive anyway.

That gives us 1099 people who look like they have immunity; but only 91% of those people are actually immune: 9% of the people are false positives.

If instead we have a specificity of 99%, then only 98,010 of the 99,000 people who haven't had it will test negative, leaving 990 people who haven't had it but test positive anyway.

That gives us 1990 people who look like they have immunity; but only 50% of them actually do -- the other 50% are false positives.
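For anyone who wants to check the arithmetic above, here is a minimal sketch in Python. The function name `screen_town` and its parameters are made up for illustration; the inputs (100,000 people, 1% prevalence, 100% sensitivity, and the two specificities) are the ones from the example.

```python
def screen_town(population, prevalence, sensitivity, specificity):
    """Return (true_positives, false_positives, ppv) for a screened population.

    ppv is the fraction of positive results that are real.
    """
    infected = population * prevalence
    uninfected = population - infected
    true_positives = infected * sensitivity
    # Uninfected people who test positive anyway
    false_positives = uninfected * (1 - specificity)
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv


for spec in (0.999, 0.99):
    tp, fp, ppv = screen_town(100_000, 0.01, 1.0, spec)
    print(f"specificity={spec:.1%}: {tp:.0f} true positives, "
          f"{fp:.0f} false positives, {ppv:.0%} of positives are real")
```

The two lines printed should match the 91% and 50% figures above.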


So if I'm understanding this correctly, with this test:

If you test negative, you are clear, guaranteed, no false negatives.

If you test positive, there is a 10% chance it's a false positive.

I guess my follow-up question is: does retesting the positive population make the false positive rate drop to 0.1%, or is the cause of a false positive specific to the individual rather than random chance?

> If you test positive, there is a 10% chance it's a false positive.

Well, don't misunderstand -- it's got nothing to do with the test per se, but with the probability that you had the disease in the first place.

The test itself has two probabilities:

1. If you've had COVID-19, the probability that it will report positive (sensitivity)

2. If you haven't had COVID-19, the probability that it will report negative (specificity)

But those probabilities give you a mapping from reality -> test_result. What you want is the reverse of that: the probability going from test_result -> reality. To get that, you have to factor in the probability that you had the disease in the first place.

With the test above (100% sensitivity, 99.9% specificity): if 50% of the population have had COVID-19, then a positive test means a 99.9% probability of having had the virus. If 1% of the population have had it, a positive test means a 91% probability that you've had it. If only 1 in a million people had COVID-19, then the number of false positives would completely overwhelm the number of true positives.
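The reversal from reality -> test_result to test_result -> reality is just Bayes' rule. A rough sketch, assuming the same test as before (100% sensitivity, 99.9% specificity); the function name `positive_predictive_value` is my own, not anything standard:

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(had the disease | positive test), by Bayes' rule."""
    # P(positive and had it)
    true_pos = prevalence * sensitivity
    # P(positive and never had it)
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Base rates mentioned above: 50%, 1%, and 1 in a million
for prevalence in (0.5, 0.01, 1e-6):
    ppv = positive_predictive_value(prevalence, sensitivity=1.0, specificity=0.999)
    print(f"base rate {prevalence:g}: P(had it | positive) = {ppv:.4f}")
```

The only thing that changes between the three runs is the base rate; the test itself is identical each time.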

This is sometimes called the "Base rate fallacy": forgetting to factor in the base rate when determining something like this.

It's important for things like, say, systems which automatically detect terrorists at airports. How many travelers at an airport are actually terrorists planning to attack a plane? It's got to be one in hundreds of millions, if not billions. With a base rate that low, even if you had a system that was 99.999% accurate, the vast majority of people it flagged up would be innocent.
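To put a rough number on the airport case, here is a sketch that counts innocent people flagged per real attacker. Two assumptions of mine, not from any real system: the base rate is exactly 1 in 100 million, and "99.999% accurate" means both sensitivity and specificity are 99.999%.

```python
def innocents_flagged_per_real_hit(base_rate, sensitivity, specificity):
    """Expected number of false positives for every true positive."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * (1 - specificity)
    return false_pos / true_pos


# Assumed figures: 1 attacker per 100 million travelers, 99.999% on both measures
ratio = innocents_flagged_per_real_hit(
    base_rate=1e-8, sensitivity=0.99999, specificity=0.99999
)
print(f"about {ratio:.0f} innocent travelers flagged per real attacker")
```

Under those assumptions the ratio comes out on the order of a thousand to one, and it gets worse if the base rate is lower.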

I had the same question about retesting. Here’s a quote from Scott Gottlieb (former FDA commissioner):

“While all of these tests can still generate false positives—a finding that you have the antibodies when you don’t—that risk can be sharply reduced by repeating the test if it comes back positive. The predictive value of two consecutive positive tests is high enough that you can be confident antibodies are present.”
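One way to see why a second positive helps: treat the probability after the first positive as the new base rate and apply the same update again. This sketch assumes the two test results are independent given your true status, which is exactly the thing the retesting question above is unsure about. If a false positive comes from something specific to the person (say, cross-reacting antibodies), the second test would add much less. The function names are hypothetical.

```python
def update_on_positive(prior, sensitivity, specificity):
    """Probability of having had the disease after one more positive result."""
    true_pos = prior * sensitivity
    false_pos = (1 - prior) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


def ppv_after_repeated_positives(prevalence, sensitivity, specificity, n_tests):
    """Chain the update n_tests times. ASSUMES errors are independent across tests."""
    belief = prevalence
    for _ in range(n_tests):
        belief = update_on_positive(belief, sensitivity, specificity)
    return belief


# 1% prevalence and 100% sensitivity, as in the town example
for spec in (0.999, 0.99):
    once = ppv_after_repeated_positives(0.01, 1.0, spec, n_tests=1)
    twice = ppv_after_repeated_positives(0.01, 1.0, spec, n_tests=2)
    print(f"specificity={spec:.1%}: one positive -> {once:.4f}, two positives -> {twice:.4f}")
```

With the 99% specificity test, the independence assumption takes you from roughly a coin flip after one positive to about 99% after two.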

https://www.wsj.com/articles/antibody-knowledge-can-be-power...

> If you test positive, there is a __% chance it's a false positive.

This percentage is based on both the test and the real infection rate.