by gwern 3010 days ago
Strictly speaking, it looks like a big improvement because you're hearing about a phase 1 trial at all, and predictably so - not surprising at all.

Phase 1 trials typically get little publicity unless they either kill someone or turn in a statistically-significant improvement. (A phase 1 trial in which the controls do exactly as well as the experimentals and no side-effects are observed, which is highly likely for even excellent treatments because of the tiny sample size, will very rarely be written up because it's boring.) And because their sample sizes are always tiny, any statistically-significant improvement must come with an extremely large estimated effect size. If the effect hadn't been improbably large, you would almost certainly not be reading about it now on HN.

And because of this selection bias, the effects you hear about tend to be massively overestimated. This is one of Andrew Gelman's points: the 'statistical significance filter' massively inflates effect sizes (a type M error), and the estimates then regress toward the smaller true effect in followups. This is why you're not supposed to take phase 1 trials as meaningful estimates of the effect size, why statisticians emphasize they're supposed to be about safety, and why you don't do power analysis based on the observed results (either post hoc or for designing the next big ones). This is also part of why you hear about so many amazing pilot experiments in animals or humans, but then the big followup trials turn in much more modest results or nulls.
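
If you want to see the filter in action, here is a rough simulation sketch. Everything in it is an assumption picked for illustration, not taken from any real trial: a true standardized effect of 0.2, 10 patients per arm, a known SD of 1, and a two-sided z-test at 1.96. The function name `simulate_filter` is likewise made up.

```python
import math
import random
import statistics

def simulate_filter(true_effect=0.2, n_per_arm=10, n_trials=5000, seed=0):
    """Simulate many tiny two-arm trials; keep the 'significant' estimates."""
    rng = random.Random(seed)
    # Standard error of a difference in means, assuming known SD = 1 per arm.
    se = math.sqrt(2.0 / n_per_arm)
    all_estimates, significant = [], []
    for _ in range(n_trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_arm)]
        estimate = statistics.fmean(treated) - statistics.fmean(control)
        all_estimates.append(estimate)
        # The filter: only trials clearing |z| > 1.96 get written up.
        if abs(estimate / se) > 1.96:
            significant.append(estimate)
    return all_estimates, significant

TRUE_EFFECT = 0.2  # assumed for illustration
all_estimates, significant = simulate_filter(true_effect=TRUE_EFFECT)

# Type M ("magnitude") exaggeration: average |estimate| among the
# significant trials, relative to the true effect.
exaggeration = statistics.fmean([abs(e) for e in significant]) / TRUE_EFFECT

print("mean estimate, all trials:        ", statistics.fmean(all_estimates))
print("mean |estimate|, significant only:",
      statistics.fmean([abs(e) for e in significant]))
print("share of trials significant:      ", len(significant) / len(all_estimates))
print("exaggeration ratio:               ", exaggeration)
```

Under these assumptions the standard error is about 0.45, so an estimate has to exceed roughly 0.88 in absolute value to clear the threshold. That is more than 4 times the assumed true effect before you have looked at a single data point, which is the whole problem: averaged over all trials the estimate is unbiased, but averaged over the ones you hear about it can't be.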

1 comment

Interesting point. I was thinking about it in terms of dose: phase 1 trials usually seem to use significantly lower doses, to feel out any issues with safety. But perhaps that doesn't apply in this case, and you are left with the issues you mention about statistical significance, selection bias and regression to the mean.