Hacker News
by jstandard 2808 days ago
Thanks, this is fascinating.

How is error rate being defined in this case?

I took your explanation to mean: cancer is common, so the genetic blueprint for cancer risk factors should also be common. We have a larger number of samples to verify against, which increases the detection accuracy.

For less common diseases, we have a smaller pool to verify against, so our blueprint is less accurate. Lower detection accuracy.


No, that's not what it is. Let me give an example:

Disease X is very bad and strikes around age 30, so it has been selected against and is very rare. Only 1/10,000 people have the gene that causes it. Disease Y is bad, but it strikes when you're 70 (Alzheimer's, let's say), so it is not strongly selected against and its gene is more common, with 1/100 people having it.

The technology 23andMe uses is wrong about 1/10,000 times, just randomly, by chance. This is true for every place in the genome it tests.

Of 10,000 people, only 2 will test positive for disease X while 101 will test positive for disease Y. The 2 people who test positive for disease X will be the 1 person who actually has it, and 1 false positive, so 50% of the people who tested positive are actually at risk. Of the 101 people who test positive for disease Y, 100 will be at risk and 1 will be a false positive, so 99% of people who test positive are actually at risk.
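
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the example above. The function names are my own (hypothetical), and it makes a simplifying assumption: every wrong call on a non-carrier shows up as a false positive, and the tiny number of wrong calls on actual carriers (false negatives) is ignored.

```python
from fractions import Fraction


def expected_positives(population, carrier_rate, error_rate):
    """Return (true_positives, false_positives) expected in a tested population.

    Assumption: a wrong call on a non-carrier is a false positive; wrong
    calls on carriers (false negatives) are ignored as negligible here.
    """
    carriers = population * carrier_rate
    false_positives = (population - carriers) * error_rate
    return carriers, false_positives


def share_truly_at_risk(population, carrier_rate, error_rate):
    """Fraction of positive results that belong to actual carriers."""
    true_pos, false_pos = expected_positives(population, carrier_rate, error_rate)
    return true_pos / (true_pos + false_pos)


ERROR_RATE = Fraction(1, 10_000)  # "wrong about 1/10,000 times"

for name, carrier_rate in [("X", Fraction(1, 10_000)), ("Y", Fraction(1, 100))]:
    share = share_truly_at_risk(10_000, carrier_rate, ERROR_RATE)
    print(f"Disease {name}: {float(share):.1%} of positives are truly at risk")
```

The error rate is identical in both runs; only the carrier rate changes, and that alone moves the result from roughly 50% to roughly 99%.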

Incidentally, this is why testing populations at very low risk for disease is generally counterproductive. The false positive rate stays the same, but as the disease gets rarer there are fewer and fewer true positives, so false positives make up most of the positive results and one tends to cause more harm than good.

Thanks, the overall explanation makes sense now.

> The technology 23andMe uses is wrong about 1/10,000 times

Is 23andMe's tech considered state-of-the-art? I'm assuming there are ways to reduce both the false positive and false negative rates, and I'm curious how other labs/tech stack up against these rates.