by tripletao 1374 days ago
To be clear, I don't fault them for studying robustness against doubling time; I fault them for not studying robustness against the connectivity of the infection network, which seems likely to matter more than any of the parameters they did study. My intuition is that when spread is highly deterministic (e.g. if R0 = 2 and each patient infects exactly two others), it's easy to make inferences about past spread from the present. For example, in that case it really would be near-impossible for a later lineage to outcompete an earlier one.

But we know the spread of SARS-CoV-2 is actually stochastic, with most lineages dying out but a few exploding due to super-spreader events. In that case it's much harder to judge whether a clade is big because it had more generations to grow, or just big because of a few (un)lucky founder effects. In Pekar's epi simulation, that stochasticity is modeled by their connectivity network. I expect that a more overdispersed network (i.e. greater variance in the number of edges at each vertex, keeping the same average) would make non-modal outcomes--like the real pandemic's phylogeny, if it arose from a single introduction--more likely.
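
As a rough illustration of the contrast above, here is a toy branching process in Python. It is not Pekar's simulation, which uses a contact network; it assumes each patient's secondary cases follow a negative binomial distribution (a common way to model superspreading) with mean R0 and dispersion parameter k, where smaller k means more overdispersion. The function names, parameter values, and the choice of six generations are all hypothetical.

```python
import random

def poisson(lam, rng):
    """Sample Poisson(lam) by counting rate-1 exponential arrivals before time lam."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count

def offspring(r0, k, rng):
    """Secondary cases caused by one patient.

    k=None is the deterministic case (exactly r0 each). Otherwise draw from a
    gamma-Poisson mixture, i.e. a negative binomial with mean r0 and dispersion k.
    """
    if k is None:
        return int(r0)
    return poisson(rng.gammavariate(k, r0 / k), rng)

def final_size(r0, k, generations, rng):
    """Cases in the last generation descended from a single founder."""
    cases = 1
    for _ in range(generations):
        cases = sum(offspring(r0, k, rng) for _ in range(cases))
        if cases == 0:
            break  # lineage died out
    return cases

def summarize(r0, k, generations=6, trials=2000, seed=0):
    """Return (fraction of lineages extinct, smallest size, largest size)."""
    rng = random.Random(seed)
    sizes = [final_size(r0, k, generations, rng) for _ in range(trials)]
    extinct = sum(s == 0 for s in sizes) / trials
    return extinct, min(sizes), max(sizes)

if __name__ == "__main__":
    for label, k in [("deterministic", None), ("k = 1.0", 1.0), ("k = 0.1", 0.1)]:
        print(label, summarize(2, k))
```

In the deterministic case every lineage has exactly 2^6 = 64 cases in the sixth generation, so size maps directly onto age. With the same mean but small k, most lineages go extinct and the survivors vary widely in size, which is the situation where clade size says much less about how many generations a clade has had to grow.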


The results of their simulations are stochastic. They discuss this in depth, as it complicates their analysis.

I don't understand what you're trying to say. Everyone agrees that the spread is stochastic. Why are you starting with a hypothetical misinterpretation of an R value to make a deterministic strawman? You think that their simulations were too deterministic because of their connectivity network?

> -like the real pandemic's phylogeny, if it arose from a single introduction-

Propose a phylogeny already. Root this thing.

> You think that their simulations were too deterministic because of their connectivity network?

Yeah, pretty much; and it's what other critics, including well-credentialed mathematical biologists, are saying too. There's a continuum of dispersion, with my perfectly deterministic strawman at the zero-dispersion extreme and no upper limit at the other end. Their power-law network adds some dispersion, but how do we know it's enough? I believe they chose that distribution because it's been shown to fit some real data (including the spread of HIV) reasonably well; but how do we know it fits the early spread of SARS-CoV-2, in the earliest lineages of the virus with unknown biology, in an unknown group of people with unknown behaviors?
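
One way to make the "is it enough dispersion?" question concrete is a sensitivity check: sweep the dispersion and see how often a lineage founded later ends up larger than one founded earlier. The sketch below is a toy under loud assumptions: two independent founders rather than lineages within one tree, a negative binomial offspring distribution instead of Pekar's contact network, and hypothetical names and parameter values throughout.

```python
import random

def poisson(lam, rng):
    """Sample Poisson(lam) by counting rate-1 exponential arrivals before time lam."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count

def offspring(r0, k, rng):
    """Secondary cases per patient; k=None means exactly r0 (no dispersion)."""
    if k is None:
        return int(r0)
    # Gamma-Poisson mixture: negative binomial with mean r0, dispersion k.
    return poisson(rng.gammavariate(k, r0 / k), rng)

def final_size(r0, k, generations, rng):
    """Cases in the last generation descended from a single founder."""
    cases = 1
    for _ in range(generations):
        cases = sum(offspring(r0, k, rng) for _ in range(cases))
        if cases == 0:
            break
    return cases

def later_wins_fraction(r0, k, head_start, generations=6, trials=2000, seed=0):
    """Among trials where at least one lineage survives, the fraction in which
    the lineage founded `head_start` generations later is the larger one."""
    rng = random.Random(seed)
    later_wins = informative = 0
    for _ in range(trials):
        early = final_size(r0, k, generations, rng)
        late = final_size(r0, k, generations - head_start, rng)
        if early == 0 and late == 0:
            continue  # both died out; nothing to compare
        informative += 1
        later_wins += late > early
    return later_wins / informative if informative else 0.0

if __name__ == "__main__":
    for label, k in [("deterministic", None), ("k = 1.0", 1.0), ("k = 0.1", 0.1)]:
        print(label, later_wins_fraction(2, k, head_start=2))
```

With no dispersion the later lineage never wins (16 cases against 64 in this setup), matching the strawman. As k shrinks, the later lineage wins a non-trivial share of the time. None of this says what the right k or network is for early SARS-CoV-2; it only shows that the conclusion can move with the dispersion assumption, which is why that assumption seems worth a robustness check.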

I don't know how to root the phylogeny, and I'm mistrustful of anyone who claims they can based on the limited information available. Anyone who's built and attempted to validate mathematical models knows that sometimes, there's simply not enough information to confidently reach any useful conclusions. Absent validation of the approaches used here (e.g. evidence that they've successfully made predictions in the past in similar situations), I believe that's our situation here.