by timr 1348 days ago
> That is incorrect. The title is completely wrong....Look at the confidence interval. This study was both consistent with a 36% reduction in deaths or a 16% increase in deaths. It's just a really wide range. All we can say about this study is that it didn't gather enough data to identify the size of the effect, not that there wasn't an effect.

In a randomized controlled trial you either find a significant difference in your metrics, or you don't. There's no other option. In all cases where you don't find a significant difference, the problem is that the confidence interval is too wide for whatever difference seems to exist. Your argument here is a fallacy (i.e. "you just didn't do a big enough sample!") which is a variant of my personal favorite: "it would have worked if you'd done X, Y or Z!"

There's always another X, Y, or Z. The negative study is always too small for the people who believe in the thing it's testing. As a supporter of some intervention, the onus is therefore on you to prove your claim in a demonstrated scenario, not on everyone else to disprove it in all scenarios. Could it be true that colonoscopies have some significant benefit to mortality smaller than detectable by an 80,000-person RCT? Sure. But that doesn't make the headline wrong.

This study didn't find a mortality benefit. Arguing that there's some theoretical other study that might find a benefit isn't relevant.

1 comments

> In a randomized controlled trial you either find a significant difference in your metrics, or you don't. There's no other option.

This is a poor way of thinking about statistics. Whether or not you reject a sharp null hypothesis doesn't give you much information (see, for example, https://www.gwern.net/Everything). Failing to reject, in particular, can be compatible with a wide range of effects.

> In all cases where you don't find a significant difference, the problem is that the confidence interval is too wide for whatever difference seems to exist.

With enough data, there could totally have been a tight range around no effect or a small effect. This is not what we got here though.
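
To make the "with enough data" point concrete, here is a minimal sketch of how a risk-ratio interval tightens as the sample grows. The event counts are hypothetical and are not the trial's data; the interval is the standard Wald interval on the log scale, and the function name is just for illustration.

```python
import math

def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of arm A vs. arm B with a Wald interval on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial arms.
    se = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

# Hypothetical counts (NOT from the study): the event rates stay fixed,
# only the number of participants per arm is scaled up.
for scale in (1, 4, 16):
    rr, lo, hi = risk_ratio_ci(90 * scale, 30_000 * scale,
                               100 * scale, 30_000 * scale)
    print(f"n per arm = {30_000 * scale:>7}: RR = {rr:.2f}, "
          f"95% CI = ({lo:.2f}, {hi:.2f})")
```

The point estimate does not move, but the width of the interval on the log scale shrinks roughly with the square root of the sample size, so the same observed effect can sit inside a wide interval that includes 1 or a tight one that does not.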

Also note that other variables such as cancer risk came out significant, so while this study doesn't provide much inductive evidence around cancer death, we do get some deductive evidence based on the known link between cancer and death. Not to mention that cancer and cancer treatments are not fun even when they don't kill you.

> With enough data, there could totally have been a tight range around no effect or a small effect. This is not what we got here though.

What the trial showed was a small effect with a wide uncertainty on a big sample. We cannot distinguish this from zero.

Again, could the observed effect be significant with a larger trial? Sure. But that's always true for a negative result. The objection carries no information.

> could the observed effect be significant with a larger trial? Sure. But that's always true for a negative result.

Sure, this is true, it's one of the reasons why results being significant or not is not very relevant. At some point you want to move towards whether the effect size is in a clinically relevant range or not.

> The objection carries no information.

Inasmuch as something like a confidence interval provides an idea of the range of the effect size, more data does carry more information. I know it's complicated to do this analysis properly with prediction intervals and such, but you have no choice if you want to be able to make good decisions with your data. A wide range estimate that doesn't allow you to make good clinical decisions is not useful.

For clinical purposes, I would even have been more comfortable treating a significant but small effect as support for the "let's not test" scenario than this wide range, where the effect could be large and positive, or negative on the other side, and we just don't know. Significant doesn't automatically mean "do the test" and vice versa. Effect size matters! A non-significant result caused by a wide interval just doesn't tell you much that is useful.
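
A rough sketch of what I mean by reading the interval against a clinically relevant range rather than against zero. The bounds 0.64 and 1.16 are the "36% reduction or 16% increase" quoted at the top of the thread, expressed as risk ratios; the 0.90 threshold for a benefit worth acting on is purely an assumption for illustration, as is the function name.

```python
def interpret_interval(lo, hi, null=1.0, min_benefit=0.90):
    """Classify a risk-ratio interval (lo, hi).

    min_benefit is a hypothetical threshold: a risk ratio at or below it
    counts as a benefit large enough to matter clinically.
    """
    if lo > null:
        return "harm"                   # whole interval above 1
    if hi <= min_benefit:
        return "meaningful benefit"     # even the worst case clears the bar
    if lo > min_benefit:
        return "no meaningful benefit"  # even the best case falls short
    return "inconclusive"               # interval straddles the threshold

print(interpret_interval(0.64, 1.16))  # bounds quoted in this thread
print(interpret_interval(0.93, 0.99))  # hypothetical: significant but small
print(interpret_interval(0.95, 1.05))  # hypothetical: tight around no effect
```

Under these assumptions the quoted interval lands in "inconclusive", while the two hypothetical tight intervals both land in "no meaningful benefit", even though one of them is statistically significant and the other is not.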

> Sure, this is true, it's one of the reasons why results being significant or not is not very relevant.

No. Significance is the only thing that matters here. If you don't have a significant result, you don't have a result. Making up stories about how the results coulda-woulda-shoulda been significant if only the study was different somehow is fine for bedtime or planning the next study, but absolutely irrelevant to interpreting the clinical trial in front of you.

The CI here is not actually that wide; I was being colloquial. It's an 80,000-person trial, with 40,000 per arm. The absolute observed difference in colorectal mortality between the two arms was 0.03 percentage points. The per-protocol analysis (just those people who got tested) was a difference of 0.15 percentage points.

That latter figure is the best possible argument for colonoscopy, and no matter how you look at it, it's just not a big difference. Even if you ran a huge trial to get a significant result at these effect sizes, you're still talking about a difference of 15 people per 10,000 (at best) screened. That's a lot of pain and expense for very little gain.
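
For anyone who wants to check the arithmetic, a quick sketch converting the two absolute differences quoted above (0.03% and 0.15%) into deaths averted per 10,000 and the number needed to screen to avert one death. The function names are mine, and this takes the point estimates at face value, ignoring their uncertainty.

```python
def deaths_averted_per_10k(abs_diff_percent):
    """Absolute risk difference (in percentage points) as events per 10,000."""
    return abs_diff_percent / 100 * 10_000

def number_needed_to_screen(abs_diff_percent):
    """Reciprocal of the absolute risk difference expressed as a proportion."""
    return 1 / (abs_diff_percent / 100)

for label, diff in (("intention-to-screen", 0.03), ("per-protocol", 0.15)):
    print(f"{label}: {deaths_averted_per_10k(diff):.0f} per 10,000, "
          f"about {number_needed_to_screen(diff):.0f} screened per death averted")
```

That works out to roughly 3 per 10,000 for the headline comparison and 15 per 10,000 for the per-protocol one, or on the order of 3,300 and 670 people screened per colorectal cancer death averted.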