by gwern 2671 days ago
Yes, but it won't help with other problems like measuring the wrong metric.

For example, the YouTube latency example linked at the bottom was a randomized A/B test ("launched an opt-in to a fraction of our traffic"), but it measured per-user latency while the population of 'users' had changed radically thanks to the improvements. To catch that, he would've needed to monitor some more global long-term effect like user retention or total traffic. Then he would've seen a result like 'latency got a lot worse, but we're getting a ton more users and they're coming back much more frequently, so that's good overall, but why is latency up and who are all these new users...? aha!'. You have a Simpson's paradox on the level of metrics here, instead of individuals.
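
To make the mechanism concrete, here's a minimal sketch with made-up numbers (the group names, user counts, and latencies are all hypothetical, not taken from the YouTube post). Every group's latency improves, yet the average per-user latency gets worse, because the improvement brings in many more users from the slow group:

```python
# Illustrative only: all figures below are invented assumptions.
def mean_latency(groups):
    """User-weighted mean latency over {name: (users, seconds_per_load)}."""
    total_users = sum(users for users, _ in groups.values())
    weighted = sum(users * latency for users, latency in groups.values())
    return weighted / total_users


def total_users(groups):
    """Total number of users across all groups."""
    return sum(users for users, _ in groups.values())


# Before the change: the page is so heavy that few slow-connection users bother.
before = {
    "fast_connection": (1000, 2.0),
    "slow_connection": (100, 40.0),
}

# After the change: both groups load faster, and slow-connection users show up.
after = {
    "fast_connection": (1000, 1.0),
    "slow_connection": (1000, 20.0),
}

mean_before = mean_latency(before)
mean_after = mean_latency(after)

print(f"mean latency before: {mean_before:.2f}s, users: {total_users(before)}")
print(f"mean latency after:  {mean_after:.2f}s, users: {total_users(after)}")
```

Looking only at mean latency, the change looks like a regression; looking at total users alongside it is what shows the mix of users shifted.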