by lbeltrame 2005 days ago
Reading the latest document out, I can't rule out sampling bias, and the methodology used to gather the data is also vulnerable to bias: they take samples which are PCR-negative for the S gene but positive for other genes, which is, by their own admission, a poor proxy.

The confidence intervals shown by PHE on potential increased transmissibility are also very wide (not the ones from the NERVTAG minutes, but the new analyses by PHE).

It needs larger sampling (which they're already doing, I'm sure) and some biological evidence.

1 comment

How would sampling error lead to the relative prevalence growing from 0% to 60% in a smooth exponential curve over 1-2 months?

My understanding of the data gathering is that there have been two data sources: sequencing a 10% sample of the positive results, and the fortuitous fact that the three-target PCR tests show one of the targets as negative for all samples of this variant. Having two data sources is useful since the results from the sequencing are delayed by weeks.

But up to the point where they have both sets of data, the relative prevalence lines up very neatly. In particular, it cannot be that these results are coming from some other variant with the same 69-70 deletion.

(I don't think it's fair to suggest they implied it was a poor proxy in general. They said it was a poorer proxy the further back in time you go.)

Re: confidence intervals, the data they have from the relative prevalence of the new variant has pretty tight confidence intervals (95% CI: 1.34-1.59 R). That makes sense, because the modeling for that is really simple.
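To illustrate why that modeling is simple, here is a rough sketch (my own toy model, not PHE's actual method). It assumes the variant has a constant growth-rate advantage `s` per day over the other lineages, so the odds of a case being the variant grow exponentially and the fraction follows a logistic curve. Fitting then reduces to a straight-line fit on the log-odds. The function names and the numbers `f0 = 0.01` and `s = 0.1` are made up for illustration.

```python
import math

def variant_fraction(f0, s, t):
    """Fraction of cases that are the variant after t days.

    f0: starting fraction (hypothetical), s: assumed constant
    growth-rate advantage per day, t: days elapsed.
    """
    odds = f0 / (1.0 - f0) * math.exp(s * t)
    return odds / (1.0 + odds)

def fit_log_odds_slope(days, fractions):
    """Ordinary least-squares slope of log-odds against time.

    Under the constant-advantage assumption this slope estimates s.
    """
    ys = [math.log(f / (1.0 - f)) for f in fractions]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, ys))
    return sxy / sxx

# Illustrative run with made-up parameters, not real surveillance data.
days = list(range(0, 60, 5))
fractions = [variant_fraction(0.01, 0.1, t) for t in days]
print([round(f, 3) for f in fractions])
print(fit_log_odds_slope(days, fractions))
```

With only one slope and one intercept to estimate from many data points, a tight interval on the relative growth is what you would expect.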

The confidence interval for the correlation between prevalence of the variant and growth rates is indeed quite wide. But that makes sense, because that data is noisy.

Yes, more data and more evidence will always be great. How many weeks are you willing to wait for it, before putting in new measures? How will that delay affect the epidemic curve if the findings so far are correct?
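As a back-of-the-envelope sketch of what waiting costs (assuming plain exponential growth with a fixed doubling time, which is a simplification, and using a hypothetical 7-day doubling time rather than any figure from the PHE documents):

```python
def growth_factor(delay_days, doubling_time_days):
    """Multiplier on case numbers after waiting delay_days,
    assuming unchecked exponential growth with a fixed doubling time."""
    return 2.0 ** (delay_days / doubling_time_days)

# Hypothetical doubling time of 7 days, for illustration only.
for weeks in (1, 2, 4):
    print(weeks, "weeks ->", growth_factor(7 * weeks, 7), "x")
```

Under those assumptions a two-week wait means starting from four times as many cases, and a four-week wait from sixteen times as many.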