by jonthepirate 2620 days ago
What you really want to do is first disable a test you know is unhealthy, to unblock everybody. Then you fix it. Once it's healthy again, you can turn it back on.
2 comments

I was talking to someone from Google who works on Bazel things, and he brought up an interesting point: flaky tests are asymmetric in that they don't provide much value when they fail (since you don't know if the failure was due to flakiness), but they do provide a lot of value when they pass (because they presumably test something non-trivial).

With this in mind, what Bazel does when a test is marked flaky is run it several times. This is a simple way of minimizing the effect of flakiness while still getting confidence from green tests.
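To make the "run it several times" idea concrete, here is a minimal sketch of a retry wrapper. It is not Bazel's implementation; the name `run_flaky` and the default of three attempts are assumptions chosen for illustration. The test counts as green if any attempt passes and red only if every attempt fails.

```python
from typing import Callable


def run_flaky(test: Callable[[], None], max_attempts: int = 3) -> bool:
    """Return True if the test passes on any attempt, False if all attempts fail.

    Hypothetical helper: the name and the default of 3 attempts are
    illustrative assumptions, not taken from any real tool.
    """
    for _ in range(max_attempts):
        try:
            test()
            return True  # one green run is enough
        except AssertionError:
            continue  # treat as possible flakiness and retry
    return False  # failed every attempt, so report a real failure


# Usage: a deterministic stand-in for a flaky test (fails twice, then passes).
calls = {"count": 0}


def fails_twice_then_passes() -> None:
    calls["count"] += 1
    assert calls["count"] >= 3, "simulated intermittent failure"


def always_fails() -> None:
    assert False, "genuine failure"


print(run_flaky(fails_twice_then_passes))  # True: the third attempt passed
print(run_flaky(always_fails))             # False: all three attempts failed
```

Note the trade-off: a test that always fails is still reported red, but a genuine bug that only shows up intermittently would be hidden by the retries.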

I dislike rerunning flaky tests. It too often masks genuine failures.
If marking, disabling, commenting out, or removing a known-unhealthy test takes more than a few seconds longer than navigating through a tool like the one you describe, I think the problem is likely in the change control/source control processes being employed. That seems like it should be so easy as to not need an additional tool (unless tests are flaking out so often that even the <1min of overhead to disable them is adding up, in which case I suspect that people are misinterpreting something fundamental about the role of tests in their development processes).
What I've seen in most companies is that when a test goes bad (imagine 10k unit tests, and 1 hits Stripe's API sandbox, which just went down), the bad test affects everybody who's busy working on their respective feature branches. Everybody wonders how their feature branch broke the Stripe integration, and you have hundreds of developers trying to diagnose and fix the same broken test.

Our solution lets someone know a test failed because it's flaking out as soon as it flakes, and provides a one-click option to instantly disable that test across all feature branches so that everybody else can continue working undisturbed.
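One way to picture "disable across all feature branches" is a quarantine list that lives outside the repository, so changing it needs no commit or rebase. The sketch below is an assumption about how such a mechanism could work, not a description of the tool mentioned above; `load_quarantine`, `partition_tests`, and the test IDs are all hypothetical.

```python
from __future__ import annotations

import json
from typing import Iterable


def load_quarantine(raw_json: str) -> set[str]:
    """Parse a JSON array of quarantined test IDs.

    Assumption: the list is stored centrally (outside any branch), and CI
    fetches it as a JSON string before each run.
    """
    return set(json.loads(raw_json))


def partition_tests(
    test_ids: Iterable[str], quarantined: set[str]
) -> tuple[list[str], list[str]]:
    """Split test IDs into (to_run, skipped), preserving input order."""
    to_run: list[str] = []
    skipped: list[str] = []
    for test_id in test_ids:
        if test_id in quarantined:
            skipped.append(test_id)  # known-unhealthy: do not block the build
        else:
            to_run.append(test_id)
    return to_run, skipped


# Usage with made-up test IDs.
quarantined = load_quarantine('["tests/test_billing.py::test_stripe_sandbox"]')
to_run, skipped = partition_tests(
    [
        "tests/test_cart.py::test_add_item",
        "tests/test_billing.py::test_stripe_sandbox",
    ],
    quarantined,
)
print(to_run)   # ['tests/test_cart.py::test_add_item']
print(skipped)  # ['tests/test_billing.py::test_stripe_sandbox']
```

Because the list is read at run time, adding one entry takes effect on every branch's next CI run.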

Without something like this, you have to:
1) Create a new feature branch
2) Comment out the broken test
3) Wait for it to pass CI
4) Gain approvals as needed
5) Merge the PR back to the master line
6) Message everybody to let them know the test was removed and they should rebase

The process above is more or less the industry standard. It means a giant loss in productivity for everybody on your team, and it is especially painful for monolith codebases.

Companies where I've worked easily hemorrhage $1m per year in developer productivity on this problem, if you count the hours wasted.

Your first example is an integration test, not a unit test, which should be changed.

Integration tests are nice, but best if run separately...