by chriswarbo 102 days ago
I agree. The same can be said for tests too: their main purpose is to find mistakes (with secondary benefits of documenting, etc.). Whenever I see my tests fail, I'm happy that they caught a problem in my understanding (manifested either as a bug in my implementation, or a bug in my test statement).

This ultimately is what shapes my view of what a good test is vs a bad test.

An issue I have with a lot of unit tests is they are too strongly coupled to the implementation. What that means is any change to the implementation ultimately means you have to change tests.

IMO, good tests are relatively immutable. You should be able to have multiple valid implementations. You should add new tests to describe the new functionality of that implementation, however, the old tests should remain relatively untouched.

If it turns out that a single change to an implementation requires you to change and update 20 tests, those are bad tests.

What I want as a dev is to immediately think "I must have broken something" when a test fails, not "I need to go fix 20 tests".

For example, let's say you have a method which sorts data.

A bad test will check "did you call this `swap` function 5 times". A good test will say "I gave the method this unsorted data set, is the data set sorted?". Heck, a good test can even say something like "was this large data set sorted in under x time". That's more tricky to do well, but still a better test than the "did you call swap the right number of times" or even worse "Did you invoke this sequence of swap calls".
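A rough sketch of that contrast in Python. The names `bubble_sort`, `merge_sort`, and `check_sorts_correctly` are made up for illustration, as is the `swap_log` parameter. The point is only that one behavioural check passes for both implementations, while a swap-counting assertion fits just one of them.

```python
def bubble_sort(items, swap_log=None):
    """Hypothetical implementation A: bubble sort that can log its swaps."""
    data = list(items)
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                if swap_log is not None:
                    swap_log.append((i, i + 1))
    return data


def merge_sort(items):
    """Hypothetical implementation B: merge sort, which never swaps."""
    data = list(items)
    if len(data) <= 1:
        return data
    mid = len(data) // 2
    left, right = merge_sort(data[:mid]), merge_sort(data[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def check_sorts_correctly(sort):
    """Good test: only looks at the input and the output."""
    assert sort([5, 3, 8, 1, 9, 2]) == [1, 2, 3, 5, 8, 9]
    assert sort([]) == []
    assert sort([7, 7, 1]) == [1, 7, 7]


# The same check holds for both implementations, untouched.
check_sorts_correctly(bubble_sort)
check_sorts_correctly(merge_sort)

# Bad test: tied to bubble sort's internals. It says nothing about whether
# the data ended up sorted, and it cannot even be written for merge_sort.
swaps = []
bubble_sort([3, 2, 1], swap_log=swaps)
assert len(swaps) == 3
```

Swapping the implementation leaves `check_sorts_correctly` alone; the swap-count assertion is the only thing that would need rewriting.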

> IMO, good tests are relatively immutable. You should be able to have multiple valid implementations. You should add new tests to describe the new functionality of that implementation, however, the old tests should remain relatively untouched.

Taken to the extreme this would mean getting rid of unit tests altogether in favor of functional and/or end-to-end testing. Which is... a strategy. I don't know if it is a good or bad strategy, but I can see it being viable for some projects.

If you can't tell, I actually think functional tests have a lot more value than most unit tests :)

Kent C. Dodds agrees with me. [1]

This isn't to say I see no value in unit tests, just that they should tend towards describing the function of the code under test, not the implementation.

[1] https://kentcdodds.com/blog/the-testing-trophy-and-testing-c...

The goal of unit tests is to circumvent problems with performance or specificity from functional tests.

If you haven't seen those problems with yours, unit tests would be useless.

> Taken to the extreme this would mean getting rid of unit tests altogether in favor of functional and/or end-to-end testing.

The dirty little secret in CS is that unit, functional, and end-to-end tests are all the exact same thing. Watch next time someone tries to come up with definitions to separate them and you'll soon notice that they didn't actually find a difference or they invent some kind of imagined way of testing that serves no purpose and nobody would ever do.

Regardless, even if you want to believe there is a difference, the advice above isn't invalidated by any of them. It is only saying test the visible, public interface. In fact, the good testing frameworks out there even enforce that — producing compiler errors if you try to violate it.

Yep, the 'unit' is whatever size one chooses to use. The exact same thing happens when trying to discuss microservices vs. monolith.

Really it all comes down to agreeing on what terms mean within the context of a conversation. Unit, functional, and end-to-end are all weasel words unless defined concretely, and should raise an eyebrow when someone uses them.

> The dirty little secret in CS is that unit, functional, and end-to-end tests are all the exact same thing.

I agree that the boundaries may be blurred in practice, but I still think that there is a distinction.

> visible, public interface

Visible to whom? A class can have public methods available to other classes, a module can have public members available to other modules, a service can have a public API that other services can call over the network, etc.

I think that the difference is the level of abstraction we operate on:

unit -> functional -> integration -> e2e

Unit is the lowest level of abstraction and e2e is the highest.

> Visible to whom?

The user. Your tests are your contract with the user. Any time there is a user, you need to establish the contract with the user so that it is clear to all parties what is provided and what will not randomly change in the future. This is what testing is for.

Yes, that does mean classes, network services, graphical user interfaces, etc. All of those things can have users.

> Unit is the lowest level of abstraction and e2e is the highest.

There is only one 'abstraction' that I can see: Feed inputs and evaluate outputs. How does that turn into higher or lower levels?

It took me a bit of time (and two or three different views) to finally get this. That is mostly why I hardcode my values in the tests. It makes them simpler. If something fails, either the values are wrong or the algorithm of the implementation is wrong.

Comparing actual outputs against expected ones is the ideal situation, IMHO. My own preference is for property-checking; but hard-coding a few well-chosen values is also fine.

That's made easier when writing (mostly) pure code, since the output is all we have (we're not mutating anything, or triggering other processes, etc. that would need extra checking).
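To make the two styles concrete, here is a small sketch for a pure function. `dedupe` is a hypothetical function invented for this example, not something from the thread. The property check is hand-rolled with the standard library's `random` module; a dedicated property-testing library would generate and shrink inputs far better.

```python
import random


def dedupe(items):
    """Hypothetical pure function under test: drop duplicates, keep first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def test_hard_coded():
    # A few well-chosen inputs with the expected output written out literally.
    assert dedupe([]) == []
    assert dedupe([1, 1, 2, 1, 3, 2]) == [1, 2, 3]


def test_properties(trials=200, seed=0):
    # Properties that should hold for any list of integers.
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(trials):
        xs = [rng.randint(0, 9) for _ in range(rng.randint(0, 20))]
        out = dedupe(xs)
        assert len(out) == len(set(out))  # no duplicates remain
        assert set(out) == set(xs)        # nothing lost, nothing invented
        assert dedupe(out) == out         # running it twice changes nothing


test_hard_coded()
test_properties()
```

Because `dedupe` neither mutates its argument nor touches anything else, the return value is the only thing either test has to inspect.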

I also think it's important to make sure we're checking the values we actually care about; since those might not be the literal return value of the "function under test". For example, if we're testing that some function correctly populates a table cell, I would avoid comparing the function's result against a hard-coded table, since that's prone to change over time in ways that are irrelevant. Instead, I would compare that cell of the result against a hard-coded value. (Rather than thinking about the individual values, I like to think of such assertions as relating one piece of code to another, e.g. that the "get_total" function is related to the "populate_total" function, in this way...).
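A minimal sketch of that table example. The names `get_total` and `populate_total` come from the comment above, but their signatures, the dictionary layout of the table, and the sample values are all assumptions made for illustration.

```python
def get_total(prices):
    """Sum of the line-item prices (assumed meaning of "total")."""
    return sum(prices)


def populate_total(table, prices):
    """Return a copy of `table` with its "total" cell filled in."""
    filled = dict(table)
    filled["total"] = get_total(prices)
    return filled


table = {"title": "Invoice", "currency": "EUR", "total": None}
prices = [5, 10, 20]
result = populate_total(table, prices)

# Brittle: breaks whenever an unrelated cell (title, currency, ...) changes.
# assert result == {"title": "Invoice", "currency": "EUR", "total": 35}

# Focused: only the cell we care about, against a hard-coded value.
assert result["total"] == 35

# The same check, read as a relation between two pieces of code.
assert result["total"] == get_total(prices)
```

If someone later renames the title or adds a column, the focused assertions keep passing, so a failure still means the total itself is wrong.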

The reason I find this important is that a breaking test requires us to figure out what it's actually trying to test, and hence whether it should have broken or not; i.e. is it a useful signal that requires us to change our approach (the table should look like that!), or is it noise that needs its incidental details updated (all those other bits don't matter!). That can be hard to work out many years after the test was written!

Also agree. There are also diminishing returns with test cases, which is why I focus mainly on what I do not want to fail. The goal is not really to prove that my code works (formal verification is the tool for that), but to verify that certain failure cases will not happen. If one does, the code is not merged in.

The purpose of a car's crumple zone is to crumple.