by jsnell 2006 days ago
Look at the prevalence graph in the article: you can see that the new variant has been gradually taking over turf from the other ones. Unlike most other cases, this cannot be explained by the founder effect alone, since the prevalence was high to start with. Nor can it be explained away by a single super-spreader event, since a single event would cause just a single step change. This has been a continuous process.

It could be random chance or a selective advantage, but then it comes down to a modeling exercise: how likely is it that this could happen by chance? It appears quite unlikely. The best way to explain the data is a significant increase in transmission.
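To make the "how likely by chance" question concrete, here is a minimal toy sketch. It is not the model used in the official analyses. It is a simple Wright-Fisher style simulation in which every number (population size, starting share, the 0.5 per-generation advantage, the 60% threshold) is a made-up assumption for illustration. It counts how often a variant starting at a 2% share reaches 60% within 12 generations, with and without a transmission advantage.

```python
import random


def simulate(p0, advantage, pop_size, generations, rng):
    """Return the variant's share after each generation of a toy Wright-Fisher model."""
    p = p0
    path = []
    for _ in range(generations):
        # Selection: the variant's odds are multiplied by (1 + advantage).
        expected = p * (1 + advantage) / (1 + p * advantage)
        # Drift: resample a finite population of infections at random.
        count = sum(rng.random() < expected for _ in range(pop_size))
        p = count / pop_size
        path.append(p)
    return path


def takeover_rate(advantage, trials=200, p0=0.02, pop_size=500,
                  generations=12, threshold=0.6, seed=0):
    """Fraction of simulated runs in which the variant's share ever reaches threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        path = simulate(p0, advantage, pop_size, generations, rng)
        if max(path) >= threshold:
            hits += 1
    return hits / trials


# All parameters here are hypothetical, chosen only to illustrate the argument.
neutral = takeover_rate(advantage=0.0)     # chance alone
advantaged = takeover_rate(advantage=0.5)  # assumed transmission advantage
print(f"neutral: {neutral:.3f}, advantaged: {advantaged:.3f}")
```

In this toy setup, drift alone essentially never produces the takeover, while an assumed advantage does so in a meaningful share of runs. The real analyses are more involved, but the shape of the argument is the same.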


Is this also taking potential confounders out of the equation? I believe the currently available data (as opposed to the SAGE minutes, which have the conclusions) is not sufficient to rule that out.
What confounders would you suggest? Elsewhere in the thread you've been suggesting it's a founder effect. It should be plainly obvious why it's not that, nor something you could attribute to a single super-spreader event.
Reading the latest document out, I can't rule out sampling bias, and the methodology used to gather data is also vulnerable to bias (taking samples which are PCR-negative for the S gene but positive for other genes, which is, by their own admission, a poor proxy).

The confidence intervals shown by PHE on potential increased transmissibility are also very wide (not the ones from the NERVTAG minutes, but the new analyses by PHE).

It needs larger sampling (which I'm sure is already under way) and some biological evidence.

How would sampling error lead to the relative prevalence growing from 0% to 60% in a smooth exponential curve over 1-2 months?

My understanding of the data gathering is that there have been two data sources: sequencing a 10% sample of the positive results, and exploiting the fortuitous fact that the three-target PCR tests show one of the targets as negative for this variant. Having two data sources is useful since the results from the sequencing are delayed by weeks.

But up to the point where they have both sets of data, the relative prevalence lines up very neatly. In particular, it cannot be that these results are coming from some other variant with the same 69-70 deletion.

(I don't think it's fair to suggest they implied it was a poor proxy in general. They said it was a poorer proxy the further back in time you go.)

Re: confidence intervals, the estimate they get from the relative prevalence of the new variant has a pretty tight confidence interval (95% CI: 1.34-1.59 R). That makes sense, because the modeling for that is really simple.
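As a rough illustration of why that kind of modeling is simple: if one variant has a constant growth advantage over the rest, the log-odds of its share rise linearly over time, so a straight-line fit recovers the rate. The sketch below uses made-up, noise-free weekly fractions (not the PHE data), and the slope and intercept are hypothetical values chosen only so the share goes from under 1% to well over 60% in ten weeks.

```python
import math


def logit(p):
    """Log-odds of a proportion p, for 0 < p < 1."""
    return math.log(p / (1 - p))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical inputs: a constant weekly advantage in log-odds.
true_slope = 0.7
true_intercept = -5.0
weeks = list(range(10))
fractions = [1 / (1 + math.exp(-(true_intercept + true_slope * w))) for w in weeks]

# Fit a straight line to the log-odds of the variant's weekly share.
slope, intercept = fit_line(weeks, [logit(f) for f in fractions])
odds_multiplier_per_week = math.exp(slope)
print(f"slope: {slope:.3f} per week, odds multiplier: {odds_multiplier_per_week:.2f}")
```

With real data the points are noisy and the fit comes with a confidence interval, and turning a weekly growth advantage into a statement about R needs an extra assumption about the generation time, which this sketch leaves out.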

The confidence interval for trying to correlate prevalence of the variant with growth rates is indeed quite wide. But that makes sense, because that's noisy data.

Yes, more data and more evidence will always be great. How many weeks are you willing to wait for it, before putting in new measures? How will that delay affect the epidemic curve if the findings so far are correct?