| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by evolutioner 3011 days ago

They can actually work together.

Thompson Sampling / Multi-armed bandit algorithms are traffic allocation strategies to use the least amount of traffic to find the optimal variant.

So say you have version A/B/../N variants of a website, you can split the traffic equally or use a a bandit algorithm and find the best performing variant of it.

But that only helps you test N variants. If you want to test for 4 titles and 3 color buttons, 3 layouts and 5 images you already have 4x3x3x5=180 different variations and you can't test them all. Evolutionary algorithms can help you search through a much bigger search space:

First you try 10 variations, and allocate traffic equally or through a bandit algorithm. Then you combine the top performing candidates multiple times to create a new generation and start over again.

Evolution helps you find which candidates to try in a large search space and then a bandits algorithm can help you allocate traffic optimally.

You have to be careful with A/B testing with Bandit algorithms though. If the conversion rate changes over time or if you have visitors that don't always convert instantly you have to take that into account: https://www.chrisstucchio.com/blog/2015/dont_use_bandits.htm...

1 comments

nrub 3011 days ago

The paper linked above does directly address the case of multiple experiments occurring in the same context. They address this with hill-climbing over those 180 different variations. The use of a bayesian linear regression takes place of the exploration found with Thompson sampling.

link

evolutioner 3011 days ago

You're right, the paper linked above is a different way of solving the same problem. In their case they use a model to decide which website variants to show. Their model accounts for independent effects and pairwise dependencies. Evolution allows you to optimize without needing an explicit model.

I don't think they account for potentially changing conversion rates over time or delayed conversions.

Aside from that, I'd be curious to see how these two approaches compare in a real-life situation.

link