Hacker News new | ask | show | jobs
by avilay 3008 days ago
What is the advantage of using evolutionary algos in this case over using something like Thompson Sampling or Contextual Multi-Armed Bandit? (http://www.kdd.org/kdd2017/papers/view/an-efficient-bandit-a...)
4 comments

They can actually work together.

Thompson Sampling / Multi-armed bandit algorithms are traffic allocation strategies to use the least amount of traffic to find the optimal variant.

So say you have version A/B/../N variants of a website, you can split the traffic equally or use a a bandit algorithm and find the best performing variant of it.

But that only helps you test N variants. If you want to test for 4 titles and 3 color buttons, 3 layouts and 5 images you already have 4x3x3x5=180 different variations and you can't test them all. Evolutionary algorithms can help you search through a much bigger search space:

First you try 10 variations, and allocate traffic equally or through a bandit algorithm. Then you combine the top performing candidates multiple times to create a new generation and start over again.

Evolution helps you find which candidates to try in a large search space and then a bandits algorithm can help you allocate traffic optimally.

You have to be careful with A/B testing with Bandit algorithms though. If the conversion rate changes over time or if you have visitors that don't always convert instantly you have to take that into account: https://www.chrisstucchio.com/blog/2015/dont_use_bandits.htm...

The paper linked above does directly address the case of multiple experiments occurring in the same context. They address this with hill-climbing over those 180 different variations. The use of a bayesian linear regression takes place of the exploration found with Thompson sampling.
You're right, the paper linked above is a different way of solving the same problem. In their case they use a model to decide which website variants to show. Their model accounts for independent effects and pairwise dependencies. Evolution allows you to optimize without needing an explicit model.

I don't think they account for potentially changing conversion rates over time or delayed conversions.

Aside from that, I'd be curious to see how these two approaches compare in a real-life situation.

They can work in conjunction - In the domain of website optimization, where visitor attributes are often greater predictors of value than website content, a system driven by search space optimization can more easily take into account changes in those variables - eg; time of day, traffic source, device type - and incorporate those inputs to climb multiple 'hills' simultaneously.

The allocation of traffic based on the evolving optimal search space (blue button for visitors from Facebook) can then be driven through an MAB or something similar.

I think, not much. This is an optimization problem to which you can apply any algorithm that "fits" --- no closed-form functional form, interactive feedback --- so bayesian optimization (with GPs), some form of RL, evolutionary algorithms et al.
Thanks everybody for your responses. Food for thought :-)