by domk 1722 days ago
In practice, doesn't increasing the coverage correlate highly with increasing the test suite size, and therefore prove its effectiveness?

Conversely, I struggle to think how coverage could be increased significantly without increasing the test suite size in reality.

6 comments

It depends. When you tell developers "you must have 100% code coverage" they usually write tests that don't actually validate any functionality and instead get into every if block. Tests are useful when they test edge cases and assert behavior.
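To make the difference concrete, here is a minimal sketch in JavaScript. The `shippingCost` function, the `expectEqual` helper, and both test functions are hypothetical, invented purely for illustration. The first test reaches every branch without checking anything; the second exercises the same branches but asserts behavior at the boundary.

```javascript
// Hypothetical function under test, invented for this example.
function shippingCost(weightKg, express) {
  if (weightKg <= 0) {
    throw new RangeError("weight must be positive");
  }
  let cost = weightKg <= 5 ? 4 : 9;
  if (express) {
    cost *= 2;
  }
  return cost;
}

// Tiny assertion helper so the sketch has no dependencies.
function expectEqual(actual, expected) {
  if (actual !== expected) {
    throw new Error(`expected ${expected}, got ${actual}`);
  }
}

// Coverage-only "test": every branch runs, nothing is verified.
function coverageOnlyTest() {
  shippingCost(1, false);
  shippingCost(10, true);
  try { shippingCost(0, false); } catch (_) { /* result ignored on purpose */ }
}

// Behavioral test: same branches, but results and edge cases are checked.
function behaviorTest() {
  expectEqual(shippingCost(5, false), 4);   // boundary: exactly 5 kg is still the cheap tier
  expectEqual(shippingCost(5.1, false), 9); // just over the boundary
  expectEqual(shippingCost(5, true), 8);    // express doubles the cost
  let threw = false;
  try { shippingCost(0, false); } catch (e) { threw = e instanceof RangeError; }
  expectEqual(threw, true);                 // invalid weight is rejected
}

coverageOnlyTest();
behaviorTest();
```

Both functions would show up as full coverage of `shippingCost`, but only `behaviorTest` would fail if someone changed `<= 5` to `< 5`.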

I've told this story many times before, but at a previous job a senior engineer told me "100% code coverage is useless and you shouldn't go for it." Because he was being dogmatic and not actually thinking about what he was saying, he was arguing against something very sensible. I was testing an expert system where everything was large if/else trees encoded in types + configuration. I wanted to make sure I tested all edge cases and activated all of the blocks when they made sense.

I had to fight for that extra coverage and it was, in the end, a massive help.

100% code coverage is still not a good use of everyone's time in most projects and languages. There's a bunch of trivial code that even feels wrong to test. Spend more time cooking up edge case tests that execute some of the branches way more than once.

Think of your code like a heat map. Higher heat on lines that get exercised more often by your tests. It's fine that _some_ code has no color at all, while you want some other paths to be bright red in the end, instead of always just going for a uniform orange for everything.
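As a rough sketch of the heat map idea: the snippet below counts how often each branch runs instead of only whether it ran. The `hit` helper, the hand-placed probes, and the `classify` function are all hypothetical; a real coverage tool would collect this kind of data without manual instrumentation.

```javascript
// Hypothetical hit counter; stands in for what a coverage tool records.
const hits = new Map();
function hit(label) {
  hits.set(label, (hits.get(label) || 0) + 1);
}

// Hypothetical function under test with hand-placed probes.
function classify(n) {
  if (!Number.isFinite(n)) { hit("not-finite"); return "invalid"; }
  if (n < 0) { hit("negative"); return "negative"; }
  if (n === 0) { hit("zero"); return "zero"; }
  hit("positive");
  return "positive";
}

// Inputs deliberately weighted toward edge cases rather than spread evenly.
for (const n of [-1, -0.0001, -Number.MAX_VALUE, 0, -0, NaN, Infinity, 1]) {
  classify(n);
}

// Print a crude text "heat map": more # means a hotter branch.
for (const [label, count] of hits) {
  console.log(label.padEnd(12), "#".repeat(count));
}
```

A plain covered/not-covered report would show all four branches as equally "done"; the counts make it visible which paths got extra attention.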

> 100% code coverage is still not a good use of everyone's time in most projects and languages

The key here is most. Think about your use case before applying anything you read online.

> It's fine that _some_ code has no color at all, while you want some other paths to be bright red in the end, instead of always just going for a uniform orange for everything.

This was the logic tree of a device used for health care work. Cost of failure was high. I had very good coverage of all edge cases that other systems could produce and tested a lot of extremes + a DSL for describing the input state.

The important thing here: an expert system is not most programs and the cost of a failure should really drive what your testing methodology is.

I did write most on purpose, instead of all.

Looks like your case is one where 100% coverage is a good thing and the cost paid for it is absolutely acceptable - nay needed to be paid. And you didn't just go for 100% and stop there because you hit some metric but you actually did the thing that's more important too ("data coverage"). Kudos!

The correlation obviously exists for low levels of coverage.

Above 80% or 90% it becomes a poor measure.

Coverage only indicates which parts of the codebase were touched by the test suite. A big test suite size doesn't mean high coverage.

Coverage can be increased without increasing the test suite by reducing the code base size (within practical limits, obviously).
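A quick illustration of that last point, using made-up numbers: deleting 200 untested lines from a 1,000-line code base raises the percentage without adding a single test.

```javascript
// All numbers are hypothetical, chosen only to show the arithmetic.
function coveragePercent(coveredLines, totalLines) {
  return (coveredLines * 100) / totalLines;
}

const before = coveragePercent(600, 1000);      // 600 of 1000 lines covered
const after = coveragePercent(600, 1000 - 200); // same tests, 200 uncovered lines deleted
console.log(`before: ${before}%, after: ${after}%`);
```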

Personally, I find coverage useful only as an indicator of which code still needs to be tested, such as forgotten edge cases or conditional branches.

If it takes you 3 days to write a test that covers a once-in-a-million condition and your service gets 2 requests per day, it will take you about 500,000 days to hit that condition once.

You've increased test coverage, but was it effective? Eh, you could probably do something more useful with your time.

(yes this is a contrived example, adjust numbers for your situation)
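For anyone adjusting the numbers, the arithmetic is simple enough to script. This sketch uses the figures from the comment above; the variable names are my own, and it assumes each request hits the condition independently with the same probability, so the expected wait is one over that probability.

```javascript
// Figures from the contrived example; swap in your own.
const requestsPerHit = 1e6; // "once-in-a-million": expect one hit per 1,000,000 requests
const requestsPerDay = 2;   // traffic of the service
const daysToWriteTest = 3;  // cost of writing the test

// Expected number of days until the condition occurs once in production.
const expectedDays = requestsPerHit / requestsPerDay;
const expectedYears = expectedDays / 365;

console.log(`expected wait: ${expectedDays} days (~${Math.round(expectedYears)} years)`);
console.log(`test cost: ${daysToWriteTest} days`);
```

The script only captures how rare the condition is; it says nothing about how bad the failure would be when it does happen.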

Depends on what the consequence of that failure might be. It could be anything from completely unnoticed to ending somebody’s life; rarity is only one dimension of risk.

    test("when a metric becomes a target, it stops being a good metric", () => {
      runApp(); // look ma, lots of "coverage"!
      assert(true, 'No errors!');
    }); // unfortunately paraphrased from real code

This test will inflate the test coverage, but it is a valid smoke test (assuming that any unhandled exception will cause the test to fail).

You shouldn't write tests like this; there's a high likelihood that the test will be flaky or not representative enough of production, and if the test fails, you often get completely non-actionable error messages.

If you just want to know that your app is broken, it's far better to monitor your live app (or staging environment or deployment pipeline or whatever) since that monitoring infrastructure can then be leveraged to collect other runtime health data in a more granular fashion.

Quality of tests is important; if you’re chasing 100%, you’re liable to write inconsequential or incorrect tests just to achieve a number.