Hacker News
by btilly 5803 days ago
Blindly stopping at 0.05 is doubly worrying because people tend to stop an A/B test as soon as it shows significance, which gives them multiple chances to be wrong. Furthermore, if you are approaching significance very fast, then strong significance is close behind, so why not wait?
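
To put a rough number on the "multiple chances to be wrong" point, here is a minimal simulation sketch. It is an illustration, not something from the tutorial: the function names, the 10% conversion rate, the 50-observation peeking interval and the pooled two-proportion z-test are all assumptions. Both arms are identical, so every "significant" result is a false positive.

```python
import math
import random

def z_stat(conv_a, n_a, conv_b, n_b):
    """Two-proportion z statistic with a pooled standard error."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return (conv_a / n_a - conv_b / n_b) / se

def false_positive_rates(n_tests=1000, n_per_arm=1000, check_every=50,
                         rate=0.1, z_crit=1.96, seed=0):
    """Simulate A/A tests (no real difference between the arms).

    Returns (rate of tests that crossed z_crit at ANY peek,
             rate of tests that crossed z_crit at the final look only).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    peeking_hits = 0
    final_hits = 0
    for _ in range(n_tests):
        conv_a = conv_b = 0
        crossed = False
        for i in range(1, n_per_arm + 1):
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            # Peek every `check_every` observations per arm.
            if i % check_every == 0 and not crossed:
                if abs(z_stat(conv_a, i, conv_b, i)) > z_crit:
                    crossed = True
        peeking_hits += crossed
        final_hits += abs(z_stat(conv_a, n_per_arm, conv_b, n_per_arm)) > z_crit
    return peeking_hits / n_tests, final_hits / n_tests

peeking, final = false_positive_rates()
print(f"stop at first significant peek: {peeking:.3f}")
print(f"single look at the end:         {final:.3f}")
```

The single final look should land near the nominal 0.05, while stopping at the first significant peek should come out well above it.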

That said, if a test has been running for a while and you don't have an answer, it can run for a looong time before it finishes. In my A/B testing tutorial I explored that starting at http://www.elem.com/~btilly/effective-ab-testing/#slide59 (just use the arrow keys to move forwards and backwards through those slides). I found that, depending on whether random fluctuations took you in the same direction as the underlying bias or the opposite, there tends to be an order of magnitude difference in how long the test takes to run. Furthermore, whichever version is leading after many observations usually really is better, and in the worst case is overwhelmingly likely to not be much worse. Therefore there are times when it really is better to declare an answer and move on.
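
A minimal sketch of how one might look at that spread in run times. This is not the code behind the slides; the conversion rates (10% vs 12%), the 0.01-level cutoff (z of 2.58), the check interval and the cap are all assumptions chosen so it runs quickly.

```python
import math
import random

def z_stat(conv_a, n_a, conv_b, n_b):
    """Two-proportion z statistic with a pooled standard error."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return (conv_a / n_a - conv_b / n_b) / se

def stopping_times(n_runs=100, rate_a=0.10, rate_b=0.12, check_every=100,
                   max_per_arm=20000, z_crit=2.58, seed=1):
    """Observations per arm until |z| first exceeds z_crit, for each run.

    Runs that never cross the cutoff are recorded as max_per_arm (censored).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    times = []
    for _ in range(n_runs):
        conv_a = conv_b = 0
        n = 0
        while n < max_per_arm:
            n += 1
            conv_a += rng.random() < rate_a
            conv_b += rng.random() < rate_b
            if n % check_every == 0 and abs(z_stat(conv_a, n, conv_b, n)) > z_crit:
                break
        times.append(n)
    return sorted(times)

times = stopping_times()
p10 = times[len(times) // 10]
p90 = times[(9 * len(times)) // 10]
print(f"10th percentile: {p10} per arm, 90th percentile: {p90} per arm")
```

Comparing the fast runs with the slow ones gives a feel for how much luck decides the length of a test, even when the underlying difference is fixed.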

If you wish to formalize this, you could use the strategy used by some medical trials, where they decide in advance what confidence levels will cause them to cut off early after 100 trials, 1,000 trials, or 10,000 trials, or else to go to (say) 50,000 trials. They then arrange that the sum of the odds of making a mistake at each of those checkpoints is below some acceptable threshold.
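
A minimal sketch of that idea, assuming the simplest possible scheme: split the total error budget across the planned looks and turn each share into a two-sided z cutoff. The equal split, the function name `stopping_thresholds` and the 0.05 budget are my assumptions here; real trials typically use more refined spending rules, but the union bound that makes this work is the same.

```python
from statistics import NormalDist

def stopping_thresholds(looks, total_alpha=0.05, weights=None):
    """Split total_alpha across the planned looks.

    Returns a list of (n, alpha_at_this_look, z_cutoff) tuples. Because the
    per-look alphas sum to total_alpha, the chance of a false positive at
    ANY look is at most total_alpha (union bound). The equal default
    weighting is an assumption, not a recommendation.
    """
    if weights is None:
        weights = [1.0] * len(looks)
    total_weight = sum(weights)
    plan = []
    for n, w in zip(looks, weights):
        alpha_n = total_alpha * w / total_weight
        # Two-sided cutoff: alpha_n / 2 in each tail.
        z_cut = NormalDist().inv_cdf(1 - alpha_n / 2)
        plan.append((n, alpha_n, z_cut))
    return plan

plan = stopping_thresholds([100, 1000, 10000, 50000])
for n, alpha_n, z_cut in plan:
    print(f"after {n:>6} trials: stop if |z| > {z_cut:.3f} (alpha {alpha_n:.4f})")
```

Passing unequal `weights` lets you spend less of the budget on the early looks, where a fluke is most likely, and keep more for the final one.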