by sl8r 2899 days ago
Late to the party, but:

> Both the historical and the control bucket used version A of the website, and they are consistent in their 2.0% conversion rate. Version B is different, and it appears to have a different conversion rate of 2.5%. So why should it not have a future conversion rate close to 2.5%?

It's all a matter of degree. You'd model B's rate as closer to 2.5%, but probably not centered exactly on 2.5%. As you observe more data, the prior becomes less important. E.g., with 10k samples (and 250 conversions) as in the original example, if you used Beta(2+1, 100-2+1) as your prior, your posterior would be Beta(252+1, 10100-252+1), whose mode is 2.495%. But if you only had 1000 samples (and 25 conversions), you'd get a distro with its mode at 2.45%. And if you only had 200 samples (and 5 conversions), you'd get a distro with its mode at 2.33%. Etc.
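A quick sketch to check those numbers, assuming "centered" means the mode of the Beta posterior. The helper names (`beta_mode`, `posterior`, `posterior_mode_pct`) are made up for illustration; the prior is the Beta(2+1, 100-2+1) from above.

```python
def beta_mode(alpha, beta):
    """Mode of Beta(alpha, beta); valid when alpha > 1 and beta > 1."""
    return (alpha - 1) / (alpha + beta - 2)


def posterior(prior_alpha, prior_beta, conversions, samples):
    """Conjugate update: conversions add to alpha, non-conversions to beta."""
    return prior_alpha + conversions, prior_beta + (samples - conversions)


# Prior built from 2 conversions in 100 historical visitors: Beta(3, 99).
PRIOR = (2 + 1, 100 - 2 + 1)


def posterior_mode_pct(conversions, samples):
    """Posterior mode for variant B, as a percentage."""
    a, b = posterior(*PRIOR, conversions, samples)
    return 100 * beta_mode(a, b)


# Same observed rate (2.5%) at three sample sizes.
for conversions, samples in [(250, 10_000), (25, 1_000), (5, 200)]:
    mode = posterior_mode_pct(conversions, samples)
    print(f"{samples:>6} samples: posterior mode = {mode:.3f}%")
```

The fewer samples you have, the further the mode gets pulled from the observed 2.5% toward the prior's 2.0%.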

> Let's replace the website with a 6-sided die. Historically, the probability of throwing a 3 was 1/6. Now you replace your die with a different die and throw it 10,000 times; the 3 comes up 2560 times. If I had to guess how many times the 3 comes up the next 10,000 throws, I certainly would bet that it's closer to 2560 times than to 1667 times.

In the case of a die where you believe any weighting of the faces is equally likely, this would be true. So this may be an appropriate model in this case. But in the case of the website, I don't think all conversion rates are equally likely, even for a new, un-tested site. If the historical conversion rate is 2.0%, and I'm forced to bet on the most likely conversion rate for a new (never before seen) variant B, I'd much rather bet on a number near 2.0% than a number like 99%.

> Case B: The historical version A of the online shop did not have any influence on the conversion rate during the testing of version B (compare the dice example above). Then both ranges are equally plausible.

This is exactly what I'm claiming is not true. It's not that A influences B; it's that A tells you something about the likely range of both A and B (in this specific case of an e-commerce site). (The reason I chose the ranges [2.0%, 2.5%] vs [2.5%, 3.0%] is that if you model B independently, you'd be roughly indifferent between these ranges; but if you use A to inform a prior, you'd prefer [2.0%, 2.5%].)
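To make the range comparison concrete, here's a rough sketch that integrates the posterior over each range for 250 conversions in 10,000 samples. How strongly you end up preferring the lower range depends on how much weight the prior carries. The Beta(21, 981) prior here is a hypothetical one I picked for illustration (as if A had shown 20 conversions in 1,000 visitors), and `interval_prob` and `compare` are made-up helper names. It uses plain Simpson's rule so it runs on the standard library alone.

```python
import math


def beta_log_pdf(x, a, b):
    """Log density of Beta(a, b) at x, computed in log space for stability."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log1p(-x)


def interval_prob(lo, hi, a, b, steps=2000):
    """P(lo <= rate <= hi) under Beta(a, b), via composite Simpson's rule.

    `steps` must be even.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.exp(beta_log_pdf(x, a, b))
    return total * h / 3


CONVERSIONS, SAMPLES = 250, 10_000


def compare(prior):
    """Posterior mass in [2.0%, 2.5%] and in [2.5%, 3.0%] for a Beta prior."""
    a = prior[0] + CONVERSIONS
    b = prior[1] + SAMPLES - CONVERSIONS
    return interval_prob(0.020, 0.025, a, b), interval_prob(0.025, 0.030, a, b)


priors = {
    "flat Beta(1, 1), B modeled independently": (1, 1),
    "Beta(21, 981), hypothetical prior informed by A": (21, 981),
}
for label, prior in priors.items():
    lower, upper = compare(prior)
    print(f"{label}: P[2.0%, 2.5%] = {lower:.3f}, P[2.5%, 3.0%] = {upper:.3f}")
```

With the flat prior the two ranges come out close to even; with the informed prior the lower range gets clearly more mass.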