by ltjohnson 5597 days ago
122,000 seemed absurdly large to me so I just looked up the paper you reference [1]. It looks like there is a calculation error in the paper.

The formula they use is n = 16 σ^2 / Δ^2, where σ is the standard deviation and Δ is the size of the difference. In this problem Δ = 0.05, so their formula gives n = 16 * (0.05 * (1-0.05)) / 0.05^2 = 304. That is much more in line with what you get from a two-sample proportion test (with H_a: p_1 ≠ p_2): ~440 in each group [2].

But maybe I misunderstand their formula.

[1] http://exp-platform.com/hippo_long.aspx

[2] http://statpages.org/proppowr.html

edit: fixed Greek letters and added final comment.


I think you misunderstand what they're doing with the formula. Δ is the size of the difference you want to detect. In this case, they're saying they want to detect a 5% (relative) change in a 5% (absolute) conversion rate. So σ^2 = 0.05 * (1-0.05), but Δ is 5% of 5%, or 0.0025, and the denominator needs to be 0.0025^2. That gives n = 16 * 0.0475 / 0.0025^2 = 121,600, i.e. about 122k.
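For anyone who wants to check both readings of the formula, here is a small Python sketch. The function name `rule_of_thumb_n` is made up for illustration; it only evaluates n = 16 σ^2 / Δ^2 with σ^2 = p(1-p), as quoted in this thread, and is not code from the paper.

```python
def rule_of_thumb_n(p, delta):
    """Sample size from n = 16 * sigma^2 / delta^2, with sigma^2 = p * (1 - p)."""
    sigma_sq = p * (1 - p)
    return 16 * sigma_sq / delta ** 2

p = 0.05  # baseline conversion rate

# Reading 1: delta is an absolute change of 5 percentage points.
n_absolute = rule_of_thumb_n(p, 0.05)

# Reading 2: delta is a relative change of 5%, i.e. 5% of 5% = 0.0025.
n_relative = rule_of_thumb_n(p, 0.05 * p)

print(round(n_absolute))  # about 304
print(round(n_relative))  # about 121,600
```

The only difference between the two calls is Δ; because it enters squared, shrinking it by a factor of 20 multiplies n by 400.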

If you go to your reference [2] and enter the numbers 0.05, 80, 0.05, 0.0525, 1.0, you'll see that they come up with a sample size of about 122k in each group (so 244k in both together).

(The figure of 304 or 440 is what you would get if you wanted to detect an absolute change of 5% in the conversion rate: going from 5% to 0% or to 10%.)
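The two-sample figures can be roughly reproduced with the textbook normal-approximation formula. This is a sketch under assumptions: two-sided alpha = 0.05, 80% power, equal group sizes, no continuity correction. The calculator at [2] may use a slightly different method, so the numbers should only agree approximately. `two_proportion_n` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_n(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided two-sample proportion test
    (normal approximation, equal group sizes, no continuity correction)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return numerator / (p2 - p1) ** 2

n_relative = two_proportion_n(0.05, 0.0525)  # 5% -> 5.25% (relative change of 5%)
n_absolute = two_proportion_n(0.05, 0.10)    # 5% -> 10% (absolute change of 5%)
print(round(n_relative), round(n_absolute))
```

Under these assumptions the first value lands near 122k per group and the second in the low-to-mid 400s, which matches the two figures being discussed.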

You're right. Reading the paper that way, they are trying to detect a change in conversion rate from 5% to 5.25%; I was confused by the use of % in two separate contexts in the same sentence. That being said, I think this is not a good argument against A/B testing.

Fair enough, it would take a very large sample (122K is close enough) to detect a change from 5% to 5.25%. Being concerned about a change that small seems really silly unless 0.0025 * N visitors * revenue per user is a big enough number to be concerned with. I contend it won't be unless either

(1) N visitors is very large or

(2) revenue per user is very large.

If (1) is true, then testing on 122K users is not a big deal. If (2) is true you probably want to have a much more targeted approach, like someone doing sales.

Fourteen successive relative changes of 5% will roughly double your conversion rate. Or roughly halve it. The cost of an A/B test is small enough that if it took, say, a sample size of 1000 then it would be well worth A/B testing changes that might make a 5% relative difference. On the other hand, if it takes a sample of 122k then indeed you might well decide not to bother -- e.g., because it might be impossible. Which is why "it takes 122k rather than 1k to tell with any confidence" is interesting.
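The "fourteen changes" claim is just compounding, and is easy to check. A quick sketch, assuming each 5% change applies multiplicatively to the result of the previous one:

```python
from math import log

up = 1.05 ** 14    # fourteen successive 5% relative improvements
down = 0.95 ** 14  # fourteen successive 5% relative declines
steps_to_double = log(2) / log(1.05)  # number of 5% gains needed to reach exactly 2x

print(f"{up:.2f}x, {down:.2f}x, {steps_to_double:.1f} steps")
# prints: 1.98x, 0.49x, 14.2 steps
```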

(Rough numbers: suppose you get 1000 visitors per day and convert at 5%, and suppose each conversion is worth $10 to you. Then you're bringing in about $180k/year from them, and a relative change of 5% in that is about $9k. Seems worth doing a modestly-sized A/B test for, but if it takes 4 months then you might reasonably decide to spend your effort elsewhere. (Or, of course, not: the actual cost of doing the test is rather small. But a lot can change over 4 months.))
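The rough numbers can be spelled out as follows. This is only a back-of-the-envelope sketch: it assumes a 365-day year, that every visitor enters the test, and it takes the 122k-per-group figure from earlier in the thread. All variable names are illustrative.

```python
visitors_per_day = 1000
conversion_rate = 0.05
value_per_conversion = 10   # dollars
n_per_group = 122_000       # sample size per group, from the thread

annual_revenue = visitors_per_day * conversion_rate * value_per_conversion * 365
uplift = 0.05 * annual_revenue  # yearly value of a 5% relative improvement

# Time to collect the sample at this traffic level.
days_one_group = n_per_group / visitors_per_day
days_both_groups = 2 * n_per_group / visitors_per_day

print(f"annual revenue ~${annual_revenue:,.0f}, 5% uplift ~${uplift:,.0f}")
print(f"{days_one_group:.0f} days for 122k visitors, {days_both_groups:.0f} days for 244k")
```

The 4-month figure corresponds to 122k visitors at 1000 per day; if both groups each need 122k drawn from the same traffic, the same arithmetic gives roughly twice that.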

Interesting -- I'll check it out. (My working day has almost finished, so this thread will probably be dead before I get back to it.)