by hinkley
2843 days ago
I run into this with much larger p and have to explain it to coworkers. Flip the scenario around: the code is fine but the test has a concurrency bug in it. It fails about 20% of the time. You ask someone to fix it. After six green test runs they declare the bug fixed. But then build 8 fails and so does build 10. Why? Because probabilities aren’t additive, they’re multiplicative. Six runs at p = 0.2 don't add up to 6p > 1, a guaranteed failure. The chance that a still-broken test passes six times in a row is (1 - p)^6, which shrinks with every run but never reaches 0. In this case 0.8^6 ≈ 0.26, so a still-broken test had a 26% chance of producing 6 green runs. You need 11 runs just to get that probability into single digits. (Another consequence of this is that if you have 5 flaky tests the probability of a green build is one in four, and the likelihood of getting multiple consecutive red builds is high enough that it happens every week or two.)
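To check the arithmetic, here is a minimal Python sketch. It assumes every run (and every flaky test) fails independently with the same probability; the function names are made up for illustration and do not come from any test framework.

```python
def prob_all_green(p_fail: float, runs: int) -> float:
    """Chance that a test failing with probability p_fail passes `runs` times in a row."""
    return (1.0 - p_fail) ** runs


def runs_needed(p_fail: float, threshold: float) -> int:
    """Smallest number of consecutive green runs for which the chance of a
    still-broken test passing them all drops below `threshold`."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    runs = 1
    while prob_all_green(p_fail, runs) >= threshold:
        runs += 1
    return runs


def prob_green_build(p_fail: float, flaky_tests: int) -> float:
    """Chance that a build is green when it contains `flaky_tests` independent
    tests, each failing with probability p_fail."""
    return (1.0 - p_fail) ** flaky_tests


if __name__ == "__main__":
    print(f"{prob_all_green(0.2, 6):.3f}")    # about 0.262
    print(runs_needed(0.2, 0.10))             # 11
    print(f"{prob_green_build(0.2, 5):.3f}")  # about 0.328
```

With five tests at exactly 20% each, the green-build chance comes out closer to one in three; one in four corresponds to a per-test failure rate of roughly 24%, or to six tests at 20%.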