by akiselev 1075 days ago
I've worked at several startups now where the people in charge of A/B testing couldn't design a single A/B test that returned statistically significant results, out of dozens of attempts each. Not. A. Single. One. Even the so-called experts. And that's using their own calculations, which were already poorly designed attempts at p-hacking.

The worst part is that some of the tests were still used to justify decisions to and by management. SMH


Is that the fault of the experiment? Or was it a weak manipulation and no-result was correct (e.g., changing the purchase button to a slightly different shade of green and expecting a higher conversion rate)?
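
To put rough numbers on the "weak manipulation" case, here is a minimal sketch of the usual normal-approximation sample size for comparing two conversion rates. The function name, the 5% baseline, and the lifts are made up for illustration; the 5% significance level and 80% power are common defaults, not anything stated in this thread.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_base, p_variant, alpha=0.05, power=0.8):
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    Hypothetical helper: uses the standard normal-approximation formula,
    which is only a rough guide for planning.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    effect = p_variant - p_base
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Illustrative numbers only: a 5% baseline conversion rate.
big_change = sample_size_per_arm(0.05, 0.06)     # full percentage point lift
tiny_change = sample_size_per_arm(0.05, 0.051)   # a "shade of green" sized lift
print(big_change, tiny_change)
```

Under these assumptions the tiny lift needs far more traffic per arm than the larger one, so a test of a weak manipulation on a small site can easily be underpowered rather than "correctly null".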
> Is that the fault of the experiment?

No. It's the fault of middle management.

If you are unable to provide your manager reasons for new development, he's going to find someone else to do your job, someone who will give him a report with charts and numbers that he can use to justify and expand his team's operational and headcount budget.

Giving a manager a report that says "there are little to no modifications that we can make at this time to improve UX" is a CLM for him.

IMO, this is still a failure to understand split testing. The point is not to discover whether changing a button color matters, but to explore the universe of all possible treatments and how they impact your most important business metrics.

It’s a global optimization problem, not a scientific way to understand how a specific change impacts users. Testers that have this mindset tend to be locally constrained and less likely to have bigger wins.

A no-difference result in a well-designed study is very valuable and can be used to justify decisions. It doesn't matter whether we pick A or B, so choose based on cost, the CEO's favorite color, or a coin flip.
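
One way to make "no difference" concrete is to report a confidence interval for the difference in conversion rates instead of just "not significant". A rough sketch, using a plain Wald interval and invented counts (the function name and all numbers are hypothetical):

```python
import math
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Wald confidence interval for (rate_b - rate_a).

    Hypothetical helper: a simple normal approximation, reasonable only
    when both arms have plenty of conversions.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_b - p_a
    return diff - z * se, diff + z * se


# Invented data: 100,000 users per arm, 5,000 vs 5,020 conversions.
low, high = diff_confidence_interval(5000, 100_000, 5020, 100_000)
print(f"difference in conversion rate: [{low:.4f}, {high:.4f}]")
```

With counts like these the interval straddles zero and is narrow, which is the useful kind of null: any real difference is small enough that cost or a coin flip is a fine tiebreaker. A wide interval straddling zero would mean the test simply could not tell.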