| HN Mirror


by paulddraper 718 days ago

To expand, the p-value tells you significance (more precisely the likelihood of the effect if there were no underlying difference). But if you check it over and over again and act on whichever value happens to look significant, you've subverted the measure.
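
A rough way to see the repeated-checking problem is to simulate an A/A test, where both arms share the same true conversion rate, and count how often any interim look dips below the threshold. This is only a sketch under assumptions that are not from the thread: a two-proportion z-test with the normal approximation, 10 evenly spaced looks, a 10% conversion rate, and the hypothetical function names below.

```python
import math
import random


def two_sided_p(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a two-proportion z-test (normal approximation)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions): nothing to compare
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rates(n_sims=500, n_per_arm=1000, looks=10,
                         rate=0.1, alpha=0.05, seed=0):
    """Return (rate with peeking, rate with a single final check) for A/A tests."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    peek_hits = 0
    final_hits = 0
    for _sim in range(n_sims):
        conv_a = conv_b = 0
        significant_at_any_look = False
        p = 1.0
        for look in range(1, looks + 1):
            # Both arms draw from the same rate, so the null is true by construction.
            for _visitor in range(step):
                conv_a += rng.random() < rate
                conv_b += rng.random() < rate
            n = look * step
            p = two_sided_p(conv_a, n, conv_b, n)
            if p < alpha:
                significant_at_any_look = True
        peek_hits += significant_at_any_look
        final_hits += p < alpha  # p from the last look only
    return peek_hits / n_sims, final_hits / n_sims


if __name__ == "__main__":
    peeking, single_check = false_positive_rates()
    print(f"false positives with peeking:    {peeking:.3f}")
    print(f"false positives, one final look: {single_check:.3f}")
```

Because the final look is also one of the interim looks, the peeking rate can never come out lower than the single-check rate; the simulation shows how much higher it gets.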

Thompson/multi-armed bandit optimizes for outcome over the duration of the test, by progressively altering the treatment %. The test runs longer, but yields better outcomes while doing it.

It's objectively a better way to optimize, unless there is time-based overhead to the existence of the A/B test itself. (E.g. maintaining two code paths.)
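
For anyone who has not seen it, here is a minimal sketch of Thompson sampling for conversion-style (yes/no) outcomes. It assumes a Beta-Bernoulli model with a uniform Beta(1, 1) prior per arm; the true rates, visitor count, and function name are made up for illustration and are not from the thread.

```python
import random


def thompson_sampling(true_rates, n_visitors=5000, seed=0):
    """Simulate Thompson sampling; return (pulls per arm, successes per arm)."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_visitors):
        # Draw one plausible conversion rate per arm from its Beta posterior.
        draws = [rng.betavariate(s + 1, f + 1)
                 for s, f in zip(successes, failures)]
        # Show the visitor the arm whose draw is highest.
        arm = max(range(len(true_rates)), key=draws.__getitem__)
        # Simulated outcome; in a real test this is the observed conversion.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    pulls = [s + f for s, f in zip(successes, failures)]
    return pulls, successes


if __name__ == "__main__":
    pulls, successes = thompson_sampling([0.05, 0.10])
    print("visitors per arm:   ", pulls)
    print("conversions per arm:", successes)
```

The share of traffic each arm receives shifts as evidence accumulates, which is the "progressively altering the treatment %" part: the weaker arm keeps getting occasional visitors, but fewer and fewer of them.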

2 comments

youainti 718 days ago

I just wanted to affirm what you are doing here.

A key point here is that p-values optimize for detection of effects if you do everything right, which is not common, as you point out.

> Thompson/multi-armed bandit optimizes for outcome over the duration of the test.

Exactly.


kqr 717 days ago

The p-value is the risk of getting an effect specifically due to sampling error, under the assumption of perfectly random sampling with no real effect. It says very little.

In particular, if you aren't doing perfectly random sampling, it is meaningless. If you are concerned about types of error other than sampling error, it is meaningless.

A significant p-value is nowhere near proof of effect. All it does is suggestively wiggle its eyebrows in the direction of further research.


paulddraper 717 days ago

> likelihood of the effect if there were no underlying difference

By "effect" I mean "observed effect"; i.e. how likely are those results, assuming the null hypothesis.
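
As a small worked example of "how likely are those results, assuming the null hypothesis": the usual convention counts results at least as extreme as the one observed, not just the exact result. The sketch below assumes a one-sided exact binomial test against a fixed null rate, which is a simpler setup than a two-arm A/B test; the function name and numbers are illustrative only.

```python
from math import comb


def binomial_p_value(successes, trials, null_rate=0.5):
    """One-sided p-value: P(X >= successes) for X ~ Binomial(trials, null_rate)."""
    return sum(
        comb(trials, k) * null_rate**k * (1 - null_rate) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # 15 successes out of 20 when the null says each trial is a 50/50 coin flip.
    print(binomial_p_value(15, 20))
```

For 15 successes in 20 trials under a 50% null, this sums to 21700 / 2^20, or about 0.0207.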
