by hinkley 2843 days ago
I run into this with much larger p and have to explain it to coworkers.

Flip the scenario around: the code is fine but the test has a concurrency bug in it. It fails about 20% of the time. You ask someone to fix it. After six green test runs they declare the bug fixed. But then build 8 fails and so does build 10.

Why? Because probabilities aren’t additive, they’re multiplicative. You didn’t test 6p > 1, you tested (1 - p)^6 < 0, which will never be true.

In this case a still-broken test would pass 6 times in a row about 26% of the time (0.8^6 ≈ 0.26), so six green runs prove very little. You need 11 runs just to get that probability into single digits.
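
A quick way to check those numbers, as a rough sketch: it assumes the broken test fails independently with probability 0.2 on every run, and the function names are made up for illustration.

```python
def chance_all_green(p_fail: float, runs: int) -> float:
    """Probability that a still-broken test passes `runs` times in a row."""
    return (1.0 - p_fail) ** runs


def runs_needed(p_fail: float, threshold: float) -> int:
    """Smallest number of consecutive green runs for which the chance of
    seeing them from a still-broken test drops below `threshold`."""
    runs = 0
    while chance_all_green(p_fail, runs) >= threshold:
        runs += 1
    return runs


print(round(chance_all_green(0.2, 6), 3))  # 0.262
print(runs_needed(0.2, 0.10))              # 11
```

Note that the first function never returns 0 for any finite number of runs, which is the point about (1 - p)^n above.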

(Another consequence of this is that if you have 5 flaky tests the probability of a green build is one in four, and the likelihood of getting multiple consecutive red builds is high enough that it happens every week or two).
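
The same arithmetic extends to several flaky tests. This is only a sketch: it assumes the tests fail independently and all at the same rate, and since the per-test rate isn't stated above, the 0.2 here is a placeholder carried over from the earlier example.

```python
def green_build_chance(p_fail: float, flaky_tests: int) -> float:
    """Probability that every flaky test passes in a single build."""
    return (1.0 - p_fail) ** flaky_tests


def consecutive_red_chance(p_green: float, builds: int) -> float:
    """Probability that `builds` builds in a row are all red."""
    return (1.0 - p_green) ** builds


g = green_build_chance(0.2, 5)
print(round(g, 3))                             # 0.328
print(round(consecutive_red_chance(g, 2), 3))  # 0.452
```

At a 20% per-test rate this comes out closer to one in three; one in four corresponds to a per-test failure rate of about 24%. Either way, two red builds in a row are close to a coin flip.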