Hacker News
by RyanZAG 4716 days ago
I think the big issues people see in A/B testing come down to a fairly tricky cause: the underlying distribution of the data. The usual ways of estimating how big your sample size needs to be have one huge giraffe of a problem hiding in them: they assume the underlying distribution is normal.

The correct way to estimate your sample size is to use the cumulative distribution function of your underlying distribution. See a brief explanation from Wikipedia here: http://en.wikipedia.org/wiki/Sample_size_determination#By_cu...

Now what's the problem with A/B testing? Most of the stuff we A/B test is incredibly non-normal. Often 99% of visits do not convert. We're looking at extremely skewed data here. Generally, the more skewed the distribution, the more samples we need.

For a very basic understanding of why: consider a very simple distribution where 99.99% of the time you get $0 and 0.01% of the time you get $29, which is fairly similar to what we A/B test. Do you think a sample of 1000 or 10000 is going to be anywhere near enough here? Of course not.
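
To put rough numbers on that, here is a quick Python sketch. It assumes each visit is an independent draw with conversion probability p = 0.0001, and the function names are mine, just for illustration. It computes the chance that a sample contains no conversions at all, and the standard error of the sample mean revenue relative to the true mean (the $29 payout cancels out of that ratio).

```python
import math

def prob_no_conversions(n, p):
    """Chance that n independent visits contain zero conversions."""
    return (1.0 - p) ** n

def relative_standard_error(n, p):
    """Standard error of the sample mean divided by the true mean.

    For a payout of c with probability p (else 0), the mean is c*p and the
    standard error is c*sqrt(p*(1-p)/n), so c cancels in the ratio.
    """
    return math.sqrt((1.0 - p) / (n * p))

P = 0.0001  # 0.01% conversion rate, as in the example above

for n in (1_000, 10_000, 1_000_000):
    print(n,
          round(prob_no_conversions(n, P), 3),
          round(relative_standard_error(n, P), 3))
```

With n = 1000 you would see no conversions at all about 90% of the time, and even at n = 10000 the standard error is about as large as the mean you are trying to estimate.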

1 comment

In statistics there is a "golden rule" (really a rule of thumb): when np > 5 and n(1-p) > 5, the normal distribution is a good approximation for the binomial distribution. Here n is the number of trials (visits) and p is the conversion rate.

Our A/B testing data results from repeated Bernoulli trials, so the number of conversions is binomially distributed. So indeed, if we use tests that assume a normal distribution and p is 0.0001, then np > 5 requires n to be more than roughly 50k.
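
The arithmetic is easy to check for other conversion rates. A small sketch, with hypothetical function names, and with the threshold of 5 taken from the rule as stated above (some textbooks use 10 instead):

```python
def min_n_for_normal_approx(p, threshold=5):
    """Value that n must exceed so that np and n(1-p) are both above threshold."""
    return threshold / min(p, 1.0 - p)

def rule_holds(n, p, threshold=5):
    """True if both np > threshold and n(1-p) > threshold."""
    return n * p > threshold and n * (1.0 - p) > threshold

print(min_n_for_normal_approx(0.0001))  # about 50,000 visits
print(min_n_for_normal_approx(0.01))    # about 500 visits
print(rule_holds(10_000, 0.0001))       # np is only 1 here
print(rule_holds(100_000, 0.0001))      # np is 10 here
```

Note this only says when the normal approximation is reasonable. It says nothing about how many samples you need to detect a given difference between A and B.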