by Normal_gaussian 1709 days ago
This is the classic multi-armed bandit problem: https://en.m.wikipedia.org/wiki/Multi-armed_bandit

I like the graphs and the explanation leads the reader deeper, but it takes the naive approach to exploration without discussing trade-offs.
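To make the exploration trade-off concrete, here is a minimal sketch of epsilon-greedy, a common baseline for this problem. It is an assumption that this resembles the post's approach; the function name `run_epsilon_greedy`, the conversion rates, and the parameter values are all made up for illustration.

```python
import random


def run_epsilon_greedy(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over Bernoulli arms.

    true_rates: hypothetical per-arm success probabilities (unknown to the agent).
    Returns (pulls, estimates): pull counts and estimated success rate per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    estimates = [0.0] * n_arms

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated reward: 1 on success, 0 otherwise.
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0

        # Incremental update of the running mean for the chosen arm.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]

    return pulls, estimates


if __name__ == "__main__":
    pulls, estimates = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls:", pulls)
    print("estimates:", [round(e, 3) for e in estimates])
```

The trade-off sits in `epsilon`. With a fixed value, the loop keeps sending that fraction of traffic to random arms forever, even once the best arm is obvious. With a value that is too small, it can sit on a mediocre arm for a long time before finding a better one. Decaying epsilon, UCB, and Thompson sampling are the usual ways to manage this.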

Tangentially, nearly every self-optimising A/B test I have code-reviewed has been significantly flawed.

1 comment

Thanks for pointing this out! I updated the post to note that this is a multi-armed bandit problem and linked to this comment in the updates section.