Ultimately, the only tests that matter are system tests (feature tests). These are the tests that ensure you're delivering the correct result to the customer. In practice I find unit level testing a pointless exercise.
To be blunt, that means you've not been using it right.
I totally agree that feature or behavioural tests validate that customer requirements are met. But you shouldn't (and probably can't) test more in-depth behaviour in these.
For example, I wrote a shopping basket system for a site last year. There were feature tests in there - "When I click the 'add to basket' button then I should see the item in the basket" sort of thing. Those are great. But I also wrote a whole bunch of unit tests for this - checking that calculations were correctly performed, and the adding and removing items worked correctly, and that tax was applied according to the correct rules, and so on. These tests are super-quick to run and provide a lot of confidence that the API contract is being adhered to. We could have completely switched out the back-end storage for a third-party API or something, and the tests would still be applicable.
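A minimal sketch of what unit tests like these might look like. The `Basket` class, its method names, and the flat 20% tax rule are all hypothetical, not the actual code from that site:

```python
from decimal import Decimal


class Basket:
    """Hypothetical basket; the names and the flat tax rule are assumptions."""

    def __init__(self, tax_rate=Decimal("0.20")):
        self.tax_rate = tax_rate
        self.items = {}  # sku -> (unit_price, quantity)

    def add(self, sku, unit_price, quantity=1):
        # Adding an existing sku increases its quantity.
        _, current = self.items.get(sku, (unit_price, 0))
        self.items[sku] = (unit_price, current + quantity)

    def remove(self, sku):
        # Removing an unknown sku is a no-op.
        self.items.pop(sku, None)

    def subtotal(self):
        return sum((price * qty for price, qty in self.items.values()), Decimal("0"))

    def total(self):
        # Subtotal plus tax, rounded to two decimal places.
        return (self.subtotal() * (1 + self.tax_rate)).quantize(Decimal("0.01"))


def test_add_remove_and_tax():
    basket = Basket()
    basket.add("book", Decimal("10.00"), quantity=2)
    basket.add("pen", Decimal("1.50"))
    assert basket.subtotal() == Decimal("21.50")
    assert basket.total() == Decimal("25.80")
    basket.remove("pen")
    assert basket.total() == Decimal("24.00")


test_add_remove_and_tax()
```

Nothing here touches a UI or a database, which is why tests of this kind run in milliseconds and would survive a change of storage back end.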
There are loads of reasons to test behaviour in layers - I agree that you can easily over-invest in effectively pointless tests, and I've seen that everywhere. But don't discard all unit tests as worthless.
Based on my very limited understanding of your problem from your brief description, it seems like thorough and properly designed feature tests would have achieved the same thing. In other words, if there is a button in the UI for removing items from the basket, that could have been heavily tested with feature tests instead of unit tests. And those feature tests would serve well if you need to completely rewrite the code at any layer that implements the "remove item from basket" button.
There are several issues with relying solely on feature tests.
Feature tests often need to span large parts of an application, so there is often a significant amount of overhead (both code and test time) in repeating tests that are identical except for one value.
Feature tests often can't test the corner cases of internal code. For example, one hallmark of quality software is that it degrades gracefully in the presence of unexpected inputs. So, while the UI might prevent out-of-range values, programmers often choose to also check value ranges at, for example, the top of a stored procedure. This means that you can't test that code with an app-level feature test because a correct UI won't let you enter values to trigger the stored proc's failure case.
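As a rough illustration of that point, here is a sketch in which a plain Python function stands in for the stored procedure. The function `apply_discount` and its range rule are assumptions made up for the example:

```python
def apply_discount(price_cents, percent):
    """Hypothetical internal routine standing in for the stored procedure."""
    # Defensive check: a correct UI never sends these values,
    # so only a direct (unit-level) call can exercise this branch.
    if not 0 <= percent <= 100:
        raise ValueError(f"percent out of range: {percent}")
    return price_cents * (100 - percent) // 100


def test_rejects_out_of_range_percent():
    for bad in (-1, 101):
        try:
            apply_discount(1000, bad)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad}")


test_rejects_out_of_range_percent()
assert apply_discount(1000, 25) == 750
```

A feature test driven through a well-behaved UI would never reach the `raise` line, whereas the unit test gets there by calling the function directly.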
Another big issue is the combinatorial explosion. If you have a processing pipeline, like filters in a sound or image processing app or validation and authorization checks in a line-of-business app, the numbers of configuration and data values that need to be tested for each stage multiply together if you only feature test. Unit testing allows you to make sure each of the stages works "well enough"[1], then you can use far fewer integration and feature tests to make sure that the stages cooperate properly and that system requirements are met (two overlapping, but different, concerns).
[1] However the engineer, team or industry defines "well enough".
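The arithmetic behind the combinatorial explosion can be sketched as follows. The stage count and the number of interesting cases per stage are made-up figures for illustration only:

```python
from math import prod

# Hypothetical pipeline: three stages with 5, 4 and 6 interesting cases each.
cases_per_stage = [5, 4, 6]

# Testing only end to end means covering every combination of stage cases.
feature_only = prod(cases_per_stage)

# Testing each stage in isolation means covering each stage's cases once
# (plus a handful of integration tests, not counted here).
unit_plus_integration = sum(cases_per_stage)

print(feature_only, unit_plus_integration)  # 120 15
```

The gap widens quickly as stages are added, since one number grows as a product and the other as a sum.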
The internal code argument is weak in my opinion. What does it matter if a stored procedure works 100% if, like you said, that code path will never be executed by the user? There will always be bugs in code, so the goal in my view should be to make the application as bug-free as possible from the user's point of view, not literally bug-free, which is an unattainable goal.
The time-consuming nature of feature tests can be an issue, but it is often mitigated by automated testing on commit, merge, etc. But not always, of course.
I agree about the combinatorial explosion; however, it can sometimes be mitigated by procedurally generating tests.
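One way procedural generation might look, as a sketch: the functions `clamp` and `normalize` and the chosen inputs are hypothetical, and the loop checks a property rather than hand-written expected values:

```python
from itertools import product


def clamp(value, low, high):
    """Hypothetical first stage: force value into [low, high]."""
    return max(low, min(value, high))


def normalize(value, low, high):
    """Hypothetical second stage: map the clamped value onto [0, 1]."""
    return (clamp(value, low, high) - low) / (high - low)


values = [-5, 0, 3, 10, 99]
ranges = [(0, 10), (-10, 10)]

generated = 0
for value, (low, high) in product(values, ranges):
    result = normalize(value, low, high)
    # Property that must hold for every generated combination.
    assert 0.0 <= result <= 1.0, (value, low, high, result)
    generated += 1

print(generated)  # 10
```

Generation cuts the cost of writing the cases, but every combination still has to run, so it helps most when each case is cheap to execute.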
In a complex app, there are many code paths that are executed by users that are easier to test with unit tests than feature tests. In general, unit tests are easier and faster to write than feature tests if (1) you're experienced writing unit tests (there is a learning curve) and (2) your application design supports good unit tests.
Feature tests are fine for simple or small apps, but I wouldn't rely on them for the bulk of my testing in any significant app.
I thought the exact same thing, until I was working on a project where we implemented separation of duties (everyone works in a random pair for 2 weeks on a few tickets) and rotation of duties (no one on the project owns any piece of code). I was handed another developer's code and it was a scary experience. My pairing partner and I were scared to modify anything because the developer had made arbitrary design decisions and never refactored them out of the code. There were tons of unused lines for things that "used to be" there. No one knew what was doing what, so it slowed our velocity, and everyone always had to ask the original dev what was going on. Eventually everyone except that dev quit the company, so I'm curious what code quality looks like over there now.
The unit level testing in our other projects helped me figure out very quickly what was going on. I could read the test method names and it was like the CliffsNotes to a book. And then, based on those tests passing, I would assume the notes were correct. I'm sure the experience isn't the same for everyone and I'm still very new to TDD, but after having one really good experience, I'm all for it. My tests have tests.
If you work on a team and/or on a large codebase with many moving parts, unit tests on isolated parts are pretty much a necessity (by which I mean the hands-on, observed cost of not having them vs. having them becomes so great and apparent that they are considered a necessity).
Secondly, system tests won't identify the specific "faulty part". Unit tests will. That can save quite a bit of time, assuming you have enough of both types of tests.
Thirdly, unit tests inform good component design. If your unit test is hard to write, the component under test is typically badly structured, highly coupled to other code, or suffering from some other deficiency that will result in more bugs over time, and all of this becomes readily apparent when you try to unit test it. You will end up refactoring the code under test and it will often "feel better", for lack of a better description.
Fourthly, if all you rely on is integration/acceptance/"comprehensive" tests, you will have tests that run some parts of the code hundreds of times over in a test suite run, which is incredibly wasteful. For example, a workflow integration test which requires someone to log in and try various things, will have to run the login/authentication code dozens of times, when you already know it works.
System-level tests are important, but they're not much help with refactoring code. For some applications, particularly in my domain of analytics, it's very important that we can do that effectively.
When you're refactoring code to reduce coupling, you absolutely HAVE to do it with system level tests.
Doing it with unit tests simply means that you'll end up writing the tests, refactoring, and then completely REwriting the tests AGAIN to get them all to pass, because you're changing the method contracts and the objects being mocked.
Tests that fail every time you refactor are totally meaningless and a waste of time. They don't detect bugs. They just detect changed code.
> Tests that fail every time you refactor are totally meaningless and a waste of time. They don't detect bugs. They just detect changed code.
I agree. This is where I think the distinction between classical and mockist testing [1] is useful. These days, most TDD involves mocking or stubbing every single dependency, effectively turning your units into a white box - "isolated TDD." When one has code whose implementation is known by and manipulated by client code, refactoring will almost certainly break stuff.
Why would you want to liberally refactor code when you know you will break 10 of your tests and have to rewrite them?
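To make the classical/mockist distinction concrete, here is a small sketch using the standard library's `unittest.mock`. The `TaxTable` class and `price_with_tax` function are hypothetical names invented for the example:

```python
from unittest.mock import Mock


class TaxTable:
    """Hypothetical collaborator: tax rate in whole percent per region."""

    def rate_for(self, region):
        return {"UK": 20}.get(region, 0)


def price_with_tax(net_cents, region, taxes):
    return net_cents * (100 + taxes.rate_for(region)) // 100


# Classical style: use the real collaborator and assert only on the result.
assert price_with_tax(10000, "UK", TaxTable()) == 12000

# Mockist style: stub the collaborator and also assert on the interaction.
taxes = Mock()
taxes.rate_for.return_value = 20
assert price_with_tax(10000, "UK", taxes) == 12000
taxes.rate_for.assert_called_once_with("UK")
```

If `price_with_tax` were refactored to look rates up through a differently named method, the classical assertion would still pass once `TaxTable` was updated, while the mockist test would fail on its interaction check even though the result is unchanged.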
One frequently sees hardcore TDD advocates patting themselves on the back for isolating everything, because... now they can swap the database for a third-party API, in-memory store, remote service, or whatever. Really? You're going to replace the database with something that has wildly different reliability constraints? And why would you ever need to replace your database with a remote third-party service? I'm sure it can be useful, but for most people, YAGNI. Perhaps I've merely not worked on enough Web Scale™ or Big Data™ projects.
>> When you're refactoring code to reduce coupling, you absolutely HAVE to do it with system level tests.
I think this comes about because tests are written against every class in your system. I find unit tests are far more useful if you focus on testing abstractions rather than every single class. For example, if you have a reporting abstraction in your code, instead of testing every class used within that abstraction you only test the public API that you want to expose. This allows you to do black box testing, which is far better when it comes to refactoring: you should be able to restructure the internals of that particular API without having to change your unit tests at all.
My feeling is that a lot of the frustration with TDD at the moment is that people are writing tests for every public method in their system. If you focus more on the behaviour of your abstractions you gain a lot more freedom when refactoring and can greatly reduce the number of tests you write without reducing coverage.
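A sketch of what testing the abstraction rather than its internals could look like. Both report functions and the contract check are hypothetical; the point is that one black-box test covers two different internal implementations unchanged:

```python
from collections import defaultdict
from itertools import groupby


def report_v1(sales):
    """Hypothetical original implementation: accumulate in a dict."""
    totals = defaultdict(int)
    for region, amount in sales:
        totals[region] += amount
    return sorted(totals.items())


def report_v2(sales):
    """Hypothetical refactored implementation: sort, then group."""
    ordered = sorted(sales, key=lambda sale: sale[0])
    return [
        (region, sum(amount for _, amount in group))
        for region, group in groupby(ordered, key=lambda sale: sale[0])
    ]


def check_report_contract(report):
    # Only the public behaviour is asserted: input rows in, totals out.
    sales = [("north", 10), ("south", 5), ("north", 7)]
    assert report(sales) == [("north", 17), ("south", 5)]
    assert report([]) == []


for implementation in (report_v1, report_v2):
    check_report_contract(implementation)
```

Because the check never looks at how the totals are computed, swapping one implementation for the other requires no test changes.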
>I find unit tests are far more useful if you focus on testing abstractions rather than every single class e.g. you have a reporting abstraction in your code, instead of testing every class used within that abstraction you only test the public API that you want to expose.
Yes, this is exactly what they're useful for. Unfortunately, if you have a big ball of tightly coupled muddy code and you're working on prying it apart and creating useful abstractions, you can't use unit tests to get there.
The only way you can do test-driven refactoring in that case is to create system level functional tests and then rework the code underneath them. Once you've got decent abstractions and a solid set of APIs, then and only then can you start writing unit tests against them.
The definition of a refactoring (at least the one I give juniors) is the modification of a system of code such that an external contract remains valid. An external contract is typically validated by tests (though it doesn't have to be). If your refactoring crosses a unit boundary, you're certainly going to have to test at a higher level (assuming you want the safety of a contract validation through tests). Otherwise you're modifying the contract of the system under test and you've crossed over from refactoring to redesign and reimplementation.