by GrinningFool 3506 days ago
How does TDD prove its correctness? TDD suffers from the same limitations as the code - it generally only covers what you could think of.

It's a powerful tool, but I think any belief that sufficient test coverage (in most common cases) actively proves correctness is misguided. In the general case, even full test coverage proves only that you've tested for the conditions you expect - but does nothing to verify the correctness of behavior in conditions you didn't expect.[1]
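To make that concrete, here is a minimal sketch (the `average` function and its test are hypothetical, made up for illustration). The single test executes every line of the function, so a line-coverage tool would report 100%, yet an input nobody thought of still blows up.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    total = 0
    for v in values:
        total += v
    return total / len(values)


def test_average():
    # This one test executes every line of average(),
    # so line coverage for the function is 100%.
    assert average([2, 4, 6]) == 4


def empty_input_raises():
    """Return True if the untested empty-list case raises ZeroDivisionError."""
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False
```

The test passes and coverage is complete, but `empty_input_raises()` returns `True`: the empty list was never part of what the author expected, so nothing checked it.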

To me the benefits of TDD are three-fold:

1. It makes you think of what you're building in more detail before you build it.

2. The methodology puts heavy emphasis on short test-code cycles.

3. (Applies to any methodology that emphasizes coverage) You end up with an acceptable-to-great regression suite[2], and anecdotally it seems people do a better job of at least ensuring tests exist when the methodology requires them.

None of these things strictly require TDD. Short iterative cycles and additional forethought are perfectly possible without it, but they do require more discipline - it is harder to remember to stop after completing a small set of changes without a forcing mechanism.

[1] Rust is a possible exception here; I'm still wrapping my head around it.

[2] The value of this regression suite varies greatly from project to project. A hint that tests are of low value (and potentially high cost) is that minor internal changes either break a large number of tests or reduce coverage to a noticeable degree, particularly in the absence of functional changes.

3 comments

Correctness is by definition: you say what your code should do in various situations. You can then automatically verify that this is the case (using whatever method). You need to capture "correctness" in some form. For things you couldn't think of and then learn about, you add them to your definition of correctness. I'm not arguing for TDD (or against it); I'm asking: if they don't use TDD, what do they do to capture correctness? I'm interested to know. TDD certainly doesn't try to cover correctness at all levels of software deployment, but it does try to capture fine-grained correctness.

Sometimes correctness isn't that valuable compared to other criteria as faults can be quickly corrected and have minimal impact. But I think you need clarity about the tradeoffs you make.

EDIT: in some situations things can be corrected quickly.

"How does TDD prove its correctness? TDD suffers from the same limitations as the code - it generally only covers what you could think of."

Another limitation is that because the tests are (usually?) written by humans, the tests could also be wrong.

So you think your application works, but it turns out that both your application code and your test code were wrong and you didn't catch a bug at all.
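A small sketch of that failure mode, using a hypothetical `is_leap_year` function (not from any real project). The code and its test share the same mistaken mental model, so the test passes while the bug survives. The standard library's `calendar.isleap` is used here only as an independent reference.

```python
import calendar


def is_leap_year(year):
    # Hypothetical buggy implementation: it forgets the century rule
    # (years divisible by 100 are leap years only if divisible by 400).
    return year % 4 == 0


def test_is_leap_year():
    # Written with the same incomplete rule in mind, so it only
    # checks cases where the buggy rule happens to be right.
    assert is_leap_year(2024)
    assert not is_leap_year(2023)


def disagrees_with_reference(year):
    """Return True if our function differs from calendar.isleap for this year."""
    return is_leap_year(year) != calendar.isleap(year)
```

`test_is_leap_year()` passes, but `disagrees_with_reference(1900)` is `True`. Nothing in the test suite could have caught that, because the person who wrote the test did not know the century rule either.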

That's the biggest drawback of any kind of test - there is always the risk of testing the wrong thing, or not testing enough of the possible right things. And you're absolutely right - because the test is also code, it can and often will have bugs too.

When you write tests for what you're about to code, or what you just finished coding, it's a challenge to write a test that is not flawed in the same way the code is - because you don't know the flaw is there in order to test for it.

By thinking things out first you can decrease the number of these flaws (and enforcing that discipline is a major plus for TDD), but for a typical non-trivial application without a lot of control over its inputs, it's close to impossible to avoid them entirely.

Tests are often reactive as well - because the changes that require them are reactive (bugs, new reqs, arch changes, etc). That doesn't detract from the value of them, but it's a limitation that explains well why even 100% coverage never stops the bugs from showing up.

I think I'd be happier with TDD if fewer people presented it as if it solved all the problems. TDD is a powerful tool, but tools are only as omniscient as the people who use them.

Testing validates that the program is correct according to some base of assumptions. You run into this a whole lot in embedded systems, where mocking hardware is difficult. You can't feasibly unit test against real hardware (most of the time), so instead you unit test against your assumption of what the hardware does and verify you respond according to requirements.

This has the benefit of proving your code correct under your assumptions, which makes it easier to debug once you insert it into the system and things inevitably are not 100% right. It gives you a way to reason about what your code does and what might be different, and then allows you to revise your assumptions and get your new solution in place and tested without the often long wait times of manual testing on deeply embedded hardware.
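As a rough sketch of what testing against an assumption of the hardware can look like: everything below is hypothetical (the `FakeSensor` class, the 10-bit ADC range, the 3.3 V reference and the 3.0 V limit are illustrative assumptions, not taken from any real board or driver).

```python
ADC_MAX = 1023  # ASSUMPTION: 10-bit ADC, so raw readings are 0..1023
V_REF = 3.3     # ASSUMPTION: 3.3 V reference voltage


class FakeSensor:
    """Stand-in for the real hardware; it encodes our assumptions about it."""

    def __init__(self, raw):
        self.raw = raw

    def read_raw(self):
        return self.raw


def read_voltage(sensor):
    """Convert a raw ADC count to volts under the assumptions above."""
    return sensor.read_raw() / ADC_MAX * V_REF


def over_voltage(sensor, limit=3.0):
    """Return True if the measured voltage exceeds the limit."""
    return read_voltage(sensor) > limit


def test_over_voltage():
    assert over_voltage(FakeSensor(1023))     # full scale is above the limit
    assert not over_voltage(FakeSensor(512))  # about half scale is below it
```

If the real part turns out to have a 12-bit ADC, these tests still pass, because they only verify the logic against the assumed model. The upside is that the assumption now lives in one obvious place (`ADC_MAX`), so it can be revised and re-tested without going back to the bench.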

Sometimes traditional TDD is the answer, sometimes simulations are the answer, and sometimes you need to just get out of your chair and test it out. It is a tool!