| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by mattj 4360 days ago

These are both good points. I think it's worth calling out here that this post is really about the infrastructure to perform experiments, not necessarily the means of analysis (although the ui you see in the screenshots performs that analysis).

In terms of these issues, we handle them in a few ways. In particular, experiment review catches many of these issues. Think of it like code review for your experiments.

In order to run an experiment, we require you to have an "experiment helper" sign off on your change. This involves reviewing your group sizes, verifying that you have the statistical power you need to test the magnitude of change your hypothesis expects, verifying you interact with the framework correctly etc.. Training to become an experiment helper is generally not very easy, and involves a combination of shadowing existing reviewers, performing enough reviews across the stack, and taking a test to verify you understand potential errors (the test itself being composed of many experiments where we have made mistakes).

Changes to experiments (increasing group sizes, terminating an experiment, modifying an experiment etc.) all require this review.