	by leo_pekelis 4210 days ago
	I do agree with you that sequential testing can produce much slower results. This is actually similar to using a sample size calculator for a classical t-test: if you set your minimum detectable effect (MDE) much smaller than the actual effect size of your A/B test, you will end up waiting for many more visitors than were needed to detect significance. We have looked at many historical A/B tests at Optimizely to determine the range of effect sizes that gives the most efficiency for our customers. In fact, we didn’t put out Stats Engine sooner because we wanted to be confident that its speed was comparable to the usual fixed-horizon t-test. This tuning will be part of an ongoing process to customize results at the customer level. Second, I want to point out that Stats Engine is not a Bayesian test. We do not recompute a posterior from past information after every visitor and use it directly to get significance. Instead, such calculations are used as inputs to determine how much information we have compared to a situation of zero effect size. There’s still only ‘one lens’ because we use all this to make and guarantee the usual frequentist hypothesis testing statements, but now factoring in that an experimenter can look at results at any time.
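To make the MDE point concrete, here is a rough sketch of a fixed-horizon sample size calculation. It uses the textbook normal approximation for a two-sided two-proportion test, not anything specific to Stats Engine. The function name `visitors_per_arm`, the 10% baseline conversion rate, and the 2% and 10% relative lifts are all made-up values for illustration.

```python
from statistics import NormalDist
import math


def visitors_per_arm(baseline_rate, relative_mde, alpha=0.05, power=0.8):
    """Approximate visitors needed per arm for a two-sided two-proportion z-test.

    Hypothetical helper: standard normal-approximation formula,
    not Optimizely's calculator.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the per-arm Bernoulli variances
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Assumed scenario: the true lift is 10% relative, but the MDE was set to 2%.
n_small_mde = visitors_per_arm(0.10, 0.02)
n_true_effect = visitors_per_arm(0.10, 0.10)

print("per-arm visitors with MDE = 2%: ", n_small_mde)
print("per-arm visitors with MDE = 10%:", n_true_effect)
print("ratio:", round(n_small_mde / n_true_effect, 1))
```

Required sample size scales roughly with the inverse square of the effect you plan for, so an MDE five times smaller than the real effect costs on the order of 25 times the visitors.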

1 comment

I'd be pretty curious to see how the posteriors of the p-values shake out animated by time.