by daveguy 3440 days ago
Obviously it depends on how strong the effect is. The point is that you identify the effect and the power. If you get to 1600 people and you are seeing a > 10% effect, then sure, you can stop, as long as you have sufficient power. You must know what the statistical power is, and know where your break points are. You absolutely cannot stop just at seeing a 10% effect, which could happen by chance if you get one lucky result in the first 10 samples. That is not dogmatic, it is good statistics.

Take your example. 100 patients, all dead or 100 patients all alive: you have demonstrated an infinite effect (edit: ok no need to be hyperbolic, 99+% effect), and have probably covered statistical rigor. If your drug is that effective you are probably criminally liable a lot sooner than 100 dead patients. Unfortunately, actual medical (and A/B) studies do not mimic make-believe scenarios.


Totally agree. Stopping at 10% and then claiming the effect size is 10% would be silly. But seeing a giant effect and stopping is totally cool in my book. The bigger the effect difference, the fewer samples you need to judge it. So I think it can be fine to peek and halt. Nothing forces us to use a static number of samples other than an old statistics formula.
The point is, you don't know what that number is unless you do the math. It's not a matter of "judging it". It is a matter of calculating it.

If you "peek and halt" without doing the math, you might as well get a random good result in the first 10 and say "look! Positive results!". You recognize that is ridiculous. So when is peeking and stopping not ridiculous?
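To make the peeking problem concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration (the 10% conversion rate, 20 looks of 50 users per arm, the pooled two-proportion z-test, and the function names); none of it comes from the thread. Both arms are identical, so every "significant" result is a false positive.

```python
import random
from statistics import NormalDist

def two_sided_p(successes_a, successes_b, n_per_arm):
    """Two-sided p-value from a pooled two-proportion z-test (equal arm sizes)."""
    pooled = (successes_a + successes_b) / (2 * n_per_arm)
    se = (2 * pooled * (1 - pooled) / n_per_arm) ** 0.5
    if se == 0:
        return 1.0  # no variation observed yet, so no evidence either way
    z = (successes_a - successes_b) / n_per_arm / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rates(trials=1000, batches=20, batch_size=50,
                         rate=0.1, alpha=0.05, seed=0):
    """Run A/A tests (no real effect) and compare two stopping rules.

    Returns (rate when stopping at the first significant peek,
             rate when testing once at the planned end).
    """
    rng = random.Random(seed)
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(trials):
        a = b = n = 0
        crossed = False
        for _ in range(batches):
            a += sum(rng.random() < rate for _ in range(batch_size))
            b += sum(rng.random() < rate for _ in range(batch_size))
            n += batch_size
            # A peeker would have stopped and declared a win here.
            if two_sided_p(a, b, n) < alpha:
                crossed = True
        peeking_hits += crossed
        # The fixed-sample analyst only looks at the final data.
        fixed_hits += two_sided_p(a, b, n) < alpha
    return peeking_hits / trials, fixed_hits / trials

if __name__ == "__main__":
    peeking, fixed = false_positive_rates()
    print(f"false positives with peeking: {peeking:.3f}")
    print(f"false positives at fixed n:   {fixed:.3f}")
```

Under these assumptions the fixed-sample rate should land near the nominal 5%, while the peeking rate comes out several times higher, even though nothing about the data changed.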

A: when the statistical power is sufficient for the observed effect. In the examples, that is 1600 or 6000 samples for a 10% or 5% effect, respectively, and far fewer for a 20% or 40% effect! But you don't know the number required unless you do the math.
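"Doing the math" could look something like the sketch below, which uses the standard normal-approximation formula for comparing two proportions. The 50% baseline, two-sided alpha of 0.05, and 80% power are my assumptions, not figures stated in this thread, and `sample_size_per_arm` is just an illustrative name.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate samples needed per arm for a two-sided two-proportion z-test.

    Normal approximation:
    n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_power) ** 2 * variance
                / (p_control - p_treatment) ** 2)

if __name__ == "__main__":
    baseline = 0.50  # assumed baseline conversion rate
    for relative_lift in (0.05, 0.10, 0.20, 0.40):
        n = sample_size_per_arm(baseline, baseline * (1 + relative_lift))
        print(f"{relative_lift:.0%} relative lift: about {n} per arm")
```

With that assumed baseline, the 10% and 5% cases come out in the same ballpark as the 1600 and 6000 quoted above, and halving the effect roughly quadruples the sample you need. A different baseline or power target would shift all of these numbers.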

Again, this limitation doesn't apply to good bandit approaches. You see big effects quickly and smaller effects more slowly and don't need to do any pre-computation about power at all.

You can even get an economic estimate of the expected amount of value you are leaving on the table by stopping at any point.
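For anyone curious what that might look like, here is a minimal sketch of one common bandit approach, Thompson sampling with Beta posteriors, plus a Monte Carlo estimate of the value left on the table if you stopped now. This is my own illustration, not necessarily what the commenter had in mind; the uniform Beta(1, 1) priors, the true rates, and the function names are all assumptions.

```python
import random

def thompson_sampling(true_rates, pulls=5000, seed=0):
    """Allocate traffic by sampling each arm's Beta posterior and playing the max.

    true_rates are hidden from the algorithm and only used to simulate users.
    Returns per-arm (wins, losses) counts.
    """
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    for _ in range(pulls):
        # Beta(1 + wins, 1 + losses) is the posterior under a uniform prior.
        draws = [rng.betavariate(1 + w, 1 + l) for w, l in zip(wins, losses)]
        arm = draws.index(max(draws))
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses

def expected_loss(wins, losses, samples=20000, seed=1):
    """Estimate the conversion rate forgone by committing now to the arm
    with the highest posterior mean, averaged over posterior uncertainty."""
    rng = random.Random(seed)
    means = [(1 + w) / (2 + w + l) for w, l in zip(wins, losses)]
    chosen = means.index(max(means))
    total = 0.0
    for _ in range(samples):
        draws = [rng.betavariate(1 + w, 1 + l) for w, l in zip(wins, losses)]
        total += max(draws) - draws[chosen]
    return total / samples

if __name__ == "__main__":
    wins, losses = thompson_sampling([0.10, 0.20])
    print("pulls per arm:", [w + l for w, l in zip(wins, losses)])
    print("expected loss if we stop now:", expected_loss(wins, losses))
```

The loss is in conversion-rate units, so multiplying it by traffic and value per conversion gives a rough money figure. A big gap between arms shifts traffic to the winner quickly, and a small gap keeps the expected loss high for longer, which matches the "big effects quickly, smaller effects more slowly" behavior described above.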