Hacker News new | ask | show | jobs
by bluefirebrand 748 days ago
> This doesn't sound so much as too much coverage but rather like having your automated tests be coupled to implementation details

Depending on how high coverage you are aiming for, I find it hard to imagine a way to achieve it without inevitably tying the tests to implementation details

1 comments

Decoupling tests from what they test does require a concerted effort, and is a skill that requires practice, but in general not as much as you'd think. Most devs are quite comfortable getting rid of coupling between two non-test components (functions, classes, services, whatever) of their system. The main mental hurdle seems to be treating your automated test code as any other code, capable and worthy of being decoupled.

There will always be some coupling between test and testee (for example, if I change my `double` function from doing `x -> 2 * x` to `x -> (x, x)`, my tests for `double` better fail), but there is a lot of coupling which is unnecessary, and this can often be removed. Like decoupling any two pieces of code, there is no one size fits all solution. There are some common sources of unnecessary coupling, and some rules of thumb to avoid them.

For example, let's say we have a (public) sorting routine `sort` which consists of two (private) phases: `prep` and `finish`. The implementation of `sort` would look like

  function sort(xs):
   prepped_xs = prep(xs)
   return finish(prepped_xs)
Let's look at two ways to write a test suite for `sort`.

The first is quite simple, just chuck in a bunch of unsorted arrays of varying sizes and degrees of unsortedness, and assert that what comes out is sorted:

  assert sort([]) == []
  assert sort([3, 2, 1]) == [1, 2, 3]
  etc
The other way is to make the tests more specific by testing `sort`, `prep`, and `finish` separately. For example, we might mock `prep` and do

  x = [1, 2]
  sort(x)
  mocked_prep.assert_called_with(x)
  // and something similar for finish
and have individual tests for `prep` and `finish`:

  assert is_prepped(prep([1, 2]))
  etc
  assert is_sorted(finish(prepped_xs))
  etc
Now suppose you decide that the code would be a lot more readable if some functionality of `prep` would move to `finish`. Nothing changes in the functionality of `sort` as this is purely a refactoring. When you make this change, the first test suite will pass, but the second will fail and require changes (i.e., it is coupled to implementation details of `sort`), because you've changed what `prep` and `finish` do.

So here, by increasing the scope of your test suite from `prep` and `finish` to `sort`, you've achieved looser coupling between the test suite and what is being tested. This can be applied much more generally: by testing at the boundary of your system (at a higher scope), you achieve looser coupling between your tests and your code. That is, you'll have fewer false failures.

> So here, by increasing the scope of your test suite from `prep` and `finish` to `sort`, you've achieved looser coupling between the test suite and what is being tested

But you won't have very good coverage of "prep" or "finish"

You can of course test those functions by carefully constructing test cases for "sort", but that essentially just re-introduces the coupling to implementation details at a higher level

That is indeed a drawback of this approach, and is problematic if and only if `prep` and `finish` are part of the public interface. If they're not, it's a worthwhile exercise to ask yourself what you want from your test suite, and whether or not individual tests for prep and finish support that goal. For me personally, my ultimate goal for a test suite is to get as close to the equivalence "the tests do not pass <-> pushing this to production would break the product" as humanly possible, as I really don't like spending time "fixing" tests. This leads me to test at the system boundaries, i.e., to leave the lower level tests out when I can. What value does testing stuff that will not affect production in any way bring me?

> but that essentially just re-introduces the coupling to implementation details at a higher level

The definition of coupling that I find useful is the following. A thing f is coupled to a thing g w.r.t. a change d in g if changing g by d requires you to change f. I find that it captures exactly the phenomenon which makes maintenance and extension of brownfield projects so costly.

So, for example, both test suites I gave as an example are coupled to sort with respect to the change "sort sorts stuff in descending order instead of ascending". Making this change requires you change pretty much every assert but the trivial ones, in both test suites.

With respect to the change "move some code from prep to finish" or "get rid of prep and finish entirely and move everything to the body of sort", only the second suite is coupled to sort.

This may not be the definition of coupling that you like to use. If it is, I don't see how the test suite of the first kind including edge cases for prep and finish at the level of sort is still coupled to implementation details.

I strongly do agree with your approach, for the record

I suppose it really comes down to what kind of coverage is important

In your case you seem to take the approach of having coverage on the high level surface area

Mostly I see coverage referring to the actual lines of code and branches

Having high coverage of lines and branches does require having tests that even dig into your private interfaces