
by apathy 3440 days ago

"Wrong" isn't the word we're looking for here, I don't think. But your above example is bullshit -- nobody puts 1000 patients at risk in a phase I (safety) trial, and if the dose isn't reasonably well calibrated by the phase III study you're describing above, someone's going to jail. In Phase II we will often have stopping rules for exactly this reason, just in case the sampling was biased in the small Phase I sample.

Above there are a number of things to notice:

1) The phasing approximates Thompson sampling to a degree, in that large late-phase trials MUST follow smaller early-phase trials. Nobody is going to waste patients on SuperMab (look it up).

2) The endpoints are hard, fast, and pre-specified:

IFF we have N adverse events in M patients, we shut down the trial for toxicity.

IFF we have X or more complete responses in Y patients, we shut down the trial because it would be unethical to deprive the control arm.

IFF we have Z or fewer responses in the treatment arm, given our ultimate accrual goal (total sample size), it will be impossible to conclude (using the test we have selected and preregistered) that the new drug isn't WORSE than the standard, so we'll shut it down for futility. Those patients will be better served by another trial.
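The three rules above can be sketched as a single interim check. This is only an illustration under stated assumptions: the names `StoppingRules` and `interim_decision` are hypothetical, the thresholds stand in for N, X, and Z from the rules, and the check is assumed to run at a look whose patient count (M or Y) was fixed in advance. Checking toxicity first is also an assumption, not something the rules above specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoppingRules:
    """Pre-registered thresholds for one interim look (all values hypothetical)."""
    max_adverse_events: int   # N: stop for toxicity at this many adverse events
    efficacy_responses: int   # X: stop for efficacy at this many responses
    futility_responses: int   # Z: stop for futility at this many responses or fewer


def interim_decision(rules: StoppingRules, adverse_events: int, responses: int) -> str:
    """Apply the pre-specified rules at a scheduled look; never invent new ones mid-trial."""
    # Assumed ordering: safety is checked before anything else.
    if adverse_events >= rules.max_adverse_events:
        return "stop: toxicity"
    if responses >= rules.efficacy_responses:
        return "stop: efficacy"
    if responses <= rules.futility_responses:
        return "stop: futility"
    return "continue"


# Example thresholds, chosen only for illustration.
rules = StoppingRules(max_adverse_events=3, efficacy_responses=8, futility_responses=1)
print(interim_decision(rules, adverse_events=1, responses=4))
```

The point of writing it this way is that the thresholds are frozen before any data arrive, so the decision at each look is mechanical rather than a reaction to whatever the data happen to show.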

You are massively oversimplifying a well-understood problem. Decision theory is a thing, and it's been a thing for 100 years. Instead of lighting your strawman on fire, how about reframing it?

Stopping isn't "always" wrong, but stopping because you've managed to hit some extremal value is pretty much always biased. The "Winner's curse", regression to the mean, all of these things happen because people forget about sampling variability. It's also why point estimates (even test statistics) rather than posterior distributions are misleading. If you're going to stop at an uncertain time or for unspecified reasons, you need to include the "slop" in your estimates.
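A small simulation makes the bias concrete. This is a rough sketch, not a trial design: it assumes normally distributed observations with known unit variance, a true effect of 0.2, a peek every 10 observations, and a naive rule of stopping the first time the z statistic exceeds 1.96. The function name `simulate` and all parameter values are made up for illustration.

```python
import random
from statistics import mean


def simulate(true_effect=0.2, max_n=200, look_every=10, z_crit=1.96,
             runs=2000, seed=0):
    """Compare the average estimate under naive early stopping vs. a fixed sample size."""
    rng = random.Random(seed)
    stopped_estimates, fixed_estimates = [], []
    for _ in range(runs):
        total = 0.0
        stop_estimate = None
        for n in range(1, max_n + 1):
            total += rng.gauss(true_effect, 1.0)
            # Peek at scheduled looks; record the estimate the first time z > z_crit.
            if stop_estimate is None and n % look_every == 0:
                z = (total / n) * n ** 0.5   # known variance of 1 is assumed
                if z > z_crit:
                    stop_estimate = total / n
        full_estimate = total / max_n
        fixed_estimates.append(full_estimate)
        # Runs that never cross the threshold report the full-sample estimate.
        stopped_estimates.append(full_estimate if stop_estimate is None else stop_estimate)
    return mean(stopped_estimates), mean(fixed_estimates)


stopped_mean, fixed_mean = simulate()
print(f"mean estimate with early stopping: {stopped_mean:.3f}")
print(f"mean estimate with fixed n:        {fixed_mean:.3f}")
```

Both columns come from the same simulated data. The only difference is when the estimate is read off, and the early-stopping estimates average out above the true effect because a run can only stop at a moment when it looks unusually good.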

"We estimate that the new page is 2x (95% CI, 1.0001x-10x) more likely to result in a conversion"... hey, you stopped early and at least you're being honest about it... but if we leave out the uncertainty then it's just misleading.

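For reference, here is one way an interval like that could be computed: a sketch using the standard log-scale (Katz) interval for a ratio of two proportions. The function name `relative_risk_ci` and the counts in the example are hypothetical. Note that this is a plain fixed-sample interval; it does not correct for early stopping, so after a data-dependent stop it would still be too optimistic.

```python
import math


def relative_risk_ci(conv_new, n_new, conv_old, n_old, z=1.96):
    """Ratio of conversion rates with an approximate 95% CI on the log scale.

    Requires at least one conversion in each arm (the formula divides by both counts).
    """
    rr = (conv_new / n_new) / (conv_old / n_old)
    # Standard error of log(rr) for two independent binomial samples.
    se_log = math.sqrt(1 / conv_new - 1 / n_new + 1 / conv_old - 1 / n_old)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: 10/100 conversions on the new page, 5/100 on the old one.
rr, lower, upper = relative_risk_ci(10, 100, 5, 100)
print(f"{rr:.2f}x (95% CI, {lower:.2f}x-{upper:.2f}x)")
```

With counts this small the interval is wide, which is exactly the information a bare "2x" leaves out.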
All of the above is taken into account when designing trials because not only do we not like killing people, we don't like going to jail for stupid avoidable mistakes.


Angostura 3440 days ago

> But your above example is bullshit — nobody puts 1000 patients at risk in a phase I (safety) trial

It isn't bullshit - at least, not for that reason. It was a thought experiment, using an extreme example to test your assertions.


michaelmrose 3440 days ago

It is an example that tends to bring in lots of irrelevant detail and ethics that distract from the actual point.


tedsanders 3440 days ago

My point is that you can still extract useful information when your stop is dynamic rather than static. One typical scenario is when your effect size ends up being larger than you originally guessed. There's little reason to continue if the difference becomes obvious.

In the future, I would appreciate it if you steelmanned my comments or asked for clarification instead of insulting me. It hurt my feelings. I wish I had written a better comment that hadn't incited such a reaction from you. Best wishes.


apathy 3439 days ago

You are right, I shot from the hip. Sorry about that.

I also have noprocrast set in my profile so I couldn't go back and edit it (something I thought about doing). I probably would have toned it down if I hadn't requested that Hacker News kick me off after 15 minutes.

Your line of discussion is productive. It's just important that people understand the difference between degrees of belief and degrees of evidence from a specific study and never confuse the two. Trouble is, lots of folks confuse them, and lots of other folks prey on that confusion.

Anyways, sorry for being a jerk.


tedsanders 3439 days ago

No worries and thanks for the apology. I apologize for my own comments in this thread, which were lower than the quality I aspire to. I had pulled an all-nighter for work and was sitting grumpily with my phone at an airport.


ted_dunning 3440 days ago

Your intuition about effects bigger than you expected is right on target.

But it applies at all scales of effect. Stop when you have a big enough effect or have high confidence that you won't care.
