by matfil 2017 days ago
One of the most annoying things to deal with in a CI workflow is flaky tests

I guess another controversial opinion: this is a problem with the idea of CI (or at least how we work with it and what we expect to get out of it), rather than the idea of randomized tests.


In what way? My personal expectation of CI is predictable, repeatable builds, that give me some assurance the software is working as designed. I also like that it forces everything to be scripted: no "only Bob knows how to build the release file."

Flaky tests are an indicator of poor code: maybe it's your actual code, maybe it's a bug in the test code, or maybe it's an external dependency plus a lack of error handling in the test code. Whichever it is, there's a problem somewhere.
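For the randomized-test case specifically, here is a minimal sketch of one way to keep a failure repeatable: pick a seed, print it, and let it be overridden on a re-run. The names `TEST_SEED`, `make_rng`, and `check_sort_is_idempotent` are made up for illustration and don't come from any particular test framework.

```python
import os
import random


def make_rng(env_var="TEST_SEED"):
    """Return (seed, rng). Reuse the seed from the environment if it is set,
    otherwise pick a fresh one so each CI run explores different inputs."""
    raw = os.environ.get(env_var)
    if raw is not None:
        seed = int(raw)
    else:
        seed = random.SystemRandom().randrange(2**32)
    return seed, random.Random(seed)


def check_sort_is_idempotent(rng, trials=100):
    """Toy randomized check: sorting an already sorted list changes nothing."""
    for _ in range(trials):
        size = rng.randint(0, 50)
        data = [rng.randint(-1000, 1000) for _ in range(size)]
        once = sorted(data)
        assert sorted(once) == once, f"failed on input {data!r}"


def run():
    seed, rng = make_rng()
    # Print the seed so a failing CI run can be replayed locally.
    print(f"TEST_SEED={seed}")
    check_sort_is_idempotent(rng)
    return seed


if __name__ == "__main__":
    run()
```

If a run fails, setting `TEST_SEED` to the printed value before re-running should replay the same inputs, which turns "flaky" into an ordinary reproducible bug report.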

The presence of bugs doesn't necessarily indicate useless software. If tests are failing (or flaky...), that's probably something to look at at some point, but that doesn't necessarily mean it's the highest priority to look at. In most places where CI gets deployed -- at least in commercial environments -- there seems to be a goal of making test failures a non-maskable interrupt.

I did admit this was controversial! But it fits in with a more general view that there are a lot of tools which make good servants but poor masters. Auto-builders are a good thing, partly because (as you say) they can help to clarify what is required to make a build, and partly because (especially for dependency-heavy software, which seems to be the norm nowadays) they can help catch things quickly when the dependencies shift beneath you. Making them a hard gate on releases seems a little too close to making the tooling your master, though.

(Somewhat separately, I also worry that CI acts as a hiding place for complexity. Sometimes it reaches the point where nobody knows how to make a build without the CI tool any more. Then local testing and debugging become difficult.)

You have a good point. I think the danger comes from the potential for abuse. Once you ship software that has failing tests, you've established a precedent and are likely to face pressure to do so again, even when the failing tests are more critical. That's why I'd resist that, at least. My experience is that if there's a real problem and later everything in production is on fire, no one cares about the caveats or risks you pointed out.

I do also agree that CI can hide complexity, but that's true of any tool as well as non-CI builds. Compared to a human running things, it's at least somewhat self-documenting due to being scripted.