
Right, that is what I meant when I said it should continue to sample from fair coins. I don't know that I've seen experiments measuring how long that takes, though.
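
For what it's worth, that experiment is cheap to simulate. Here is a minimal sketch, assuming a Beta-Bernoulli Thompson sampler (the thread doesn't pin down the algorithm, so that choice is an assumption) and arms that are all the same fair coin. The function name, parameters, and defaults are made up for illustration.

```python
import random

def thompson_fair_coins(n_rounds=10_000, n_arms=2, p=0.5, seed=0):
    """Run Thompson sampling on identical Bernoulli(p) arms; return pulls per arm."""
    rng = random.Random(seed)
    # Beta(1, 1) prior on each arm: uniform belief about its success rate.
    wins = [1] * n_arms
    losses = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its posterior.
        samples = [rng.betavariate(wins[a], losses[a]) for a in range(n_arms)]
        # Play the arm whose draw is highest.
        arm = max(range(n_arms), key=samples.__getitem__)
        pulls[arm] += 1
        # Every arm is the same coin, so the outcome ignores which arm was played.
        if rng.random() < p:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls

if __name__ == "__main__":
    for seed in range(5):
        print(seed, thompson_fair_coins(seed=seed))
```

Since the arms are identical, neither posterior can ever dominate, so both keep getting played. How lopsided the split gets seems to vary a lot from seed to seed, which is part of why "how long that takes" is hard to answer with a single number.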

There is also the question of how long you'd leave multiple treatments out there. Presumably, even if there is no difference in outcomes, there can be benefits to having fewer deployed behaviors.

I'm now also curious whether there are non-transitive situations. For example, three treatments that all look fair when deployed together, but where, for whatever reason, any two of them deployed alone will show a preference. Ideally, of course, treatments should be designed so that this can't happen, but mistakes are often made.

Edit: I fully concede that this is likely chasing edge cases. The motivation for having fewer deployed arms is far more compelling than the edge cases.