|
|
|
|
|
by joosters
2580 days ago
|
|
In an old job, we had a frustrating test that passed well over 99 times in 100. It was shrugged off for a very long time until a developer eventually tracked it down to code that was generating a random SSL key pair. If the first byte of the key was 0, faulty code elsewhere would mishandle the key and the test failed. Keeping the randomness in the test was the key factor in tracking down this obscure bug. If the test had been made completely deterministic, the test harness would never have discovered the problem. So although repeatable tests are in most cases a good thing, non-determinism can unearth problems. The trick is how to do this without sucking up huge amounts of bug-tracking time... (Much effort was spent in making the test repeatable during debugging, but of course the crypto code elsewhere was deliberately trying to get as much randomness as it could source...) |
|
There was the logic that generates the SSL key pair, and there is the faulty logic that consumes it. Based on the description, it seems it's an indication of missing test coverage around the faulty code. If, when the faulty code was written, more time were spent on understanding the assumptions the code has made, then maybe the test wouldn't appear in the first place.
This anecdote, however, does bring up a good point: Don't shrug off intermittently failed tests - Dig in and understand the root cause of it.