by jules 4273 days ago
Suppose you have Beta(a1,b1) and Beta(a2,b2) at the current step. The expected conversion rates are:

    M(a,b) = a/(a+b)

    E1 = M(a1,b1)
    E2 = M(a2,b2)
If we stop now, the expected conversion rate is E = max(E1,E2).

If we continue for another timestep with option 1, the question is whether that can make us switch from 1 to 2 or from 2 to 1. If it can't, the expected conversion rate is the same whether or not we execute one more step. Let's assume without loss of generality that option 2 is currently winning, but that option 1 would be winning if it got another conversion. Then with probability r we switch to option 1 (whose rate is r), and otherwise we stay with option 2, so the new expected conversion rate is:

    E' = int(p_1(r)*(r*r + (1-r)*E2), r=0..1)
where p_1 is the probability density of Beta(a1,b1). All the moments of the beta distribution have a closed form, so E' also has a closed form.
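
To make the closed form concrete: the integral splits into E[r^2] + (1 - E1)*E2, and the second moment of Beta(a,b) is a(a+1)/((a+b)(a+b+1)). Here is a minimal sketch of that calculation. The function names are my own, and I'm assuming integer parameters so that exact fractions can be used; it also assumes the scenario above (option 2 currently winning, one conversion on option 1 flips the winner).

```python
from fractions import Fraction

def mean(a, b):
    """M(a, b): mean of Beta(a, b)."""
    return Fraction(a, a + b)

def second_moment(a, b):
    """E[r^2] for r ~ Beta(a, b)."""
    return Fraction(a * (a + 1), (a + b) * (a + b + 1))

def value_after_one_more_step(a1, b1, a2, b2):
    """E' = E[r^2] + (1 - E1) * E2.

    Assumes option 2 is currently winning, a conversion on option 1
    makes option 1 the winner, and a non-conversion changes nothing.
    """
    e1 = mean(a1, b1)
    e2 = mean(a2, b2)
    return second_moment(a1, b1) + (1 - e1) * e2

# Example: Beta(2,3) vs Beta(3,4). Stopping now gives max(2/5, 3/7) = 3/7.
stop_now = max(mean(2, 3), mean(3, 4))
one_more = value_after_one_more_step(2, 3, 3, 4)
print(stop_now, one_more)  # 3/7 16/35
```

In this example one more step is worth 16/35 against 15/35 for stopping now, so the extra observation has positive expected value.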

You could generalize this to running it for n more steps instead of one more step. You'd get an expression of the form:

   E = int(p_1(r1)*p_2(r2)*polynomial(r1,r2))
I suspect that also has a closed form but I'm not sure at first glance.
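
One observation in support of that suspicion, as a sketch rather than a proof: if the integrand really is a polynomial in r1 and r2 with known coefficients, and the two options are independent, then each monomial r1^i * r2^j integrates to a product of two beta moments, and E[r^k] for Beta(a,b) is the product of (a+i)/(a+b+i) for i = 0..k-1. The helper names and the dict representation of the polynomial below are my own assumptions, and I use integer parameters to keep the arithmetic exact. Working out what the polynomial is for a given n is the part this does not address.

```python
from fractions import Fraction

def beta_moment(a, b, k):
    """E[r^k] for r ~ Beta(a, b), via the rising-factorial formula."""
    m = Fraction(1)
    for i in range(k):
        m *= Fraction(a + i, a + b + i)
    return m

def expect_polynomial(coeffs, a1, b1, a2, b2):
    """Integrate a polynomial against independent Beta(a1,b1) and Beta(a2,b2).

    coeffs maps (i, j) to the coefficient of r1**i * r2**j.
    """
    return sum(
        c * beta_moment(a1, b1, i) * beta_moment(a2, b2, j)
        for (i, j), c in coeffs.items()
    )

# The one-step case as a polynomial: r1^2 + (1 - r1)*r2
one_step = {(2, 0): 1, (0, 1): 1, (1, 1): -1}
print(expect_polynomial(one_step, 2, 3, 3, 4))  # 16/35
```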