TDD Doesn't Work (blog.cleancoder.com)
153 points by narfz 3506 days ago
23 comments

Commenting on TDD stories here is historically a bad practice, but i'll add my input here.

I have never let my teams go full TDD. The reason is that in all my experience, TDD sacrifices a lot of velocity for the sake of automated tests. When i hear about the reduction in total bugs injected, it is a "duh" moment. The fastest way to make a team inject 30% fewer bugs is to have them write 30% less code. That isn't snarky, it's true.

Automated testing is one of the many tools available to software engineers. And it is a valuable one. Unfortunately, TDD is too much of a good thing. It relies so heavily on automated testing that it ventures far into the realm of diminishing returns.

Once, in an argument about TDD, i said it was akin to having someone build a shed. But upon checking in on them, you saw they were using a hammer to smash screws into boards. When you ask them what they are doing, they tell you it is Hammer Drive Construction. It is perhaps overly harsh, but it reinforces the point: tools have a place. Automated tests really shine on mission critical logic that does not get rewritten often. Use it where it makes sense. I wouldn't recommend using it ubiquitously.

Then again, i also recommend having fun coding. So i suppose the actual message here is: do what makes you successful, not what comments or studies say.

I find that automated tests shine pretty much all of the time, provided they're relatively cheap to build, cheap to maintain and not buggy.

Where they fall down is when they're more expensive to build than the code under test and they produce false positives/negatives.

I agree. The problem is that almost inevitably they end up getting more and more expensive to build and maintain while at the same time becoming buggy ;) It's a difficult battle to win.

I think you need self discipline to keep limiting yourself to an ever evolving subset that is the optimal ROI. This means over time removing tests that don't add as much value any more. Rewriting some other tests. etc. Human nature though is that these keep growing endlessly and become hard to manage just like any other part of software.

Hmm, I think this arises from a mindset that doesn't treat tests as part of the code that should be maintained and refactored.

You always have to be removing and refactoring tests if you are changing your production code. Changed a sorting algorithm? Good, go and have a look at your tests to see whether there's anything that doesn't need to be tested anymore, or edge cases that need to be tested now that the algorithm has changed.
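A minimal sketch of what that review can look like, assuming a hypothetical `sort_records` helper (not taken from anyone's code in this thread). The first two checks describe behaviour and should survive any change of algorithm; the third pins stability, which is the kind of edge case worth revisiting when the algorithm changes.

```python
from collections import Counter

def sort_records(records):
    """Hypothetical code under test: sort (name, score) pairs by score."""
    # sorted() is stable, so records with equal scores keep their input order.
    return sorted(records, key=lambda record: record[1])

def is_ordered(records):
    """True if scores never decrease from one record to the next."""
    scores = [score for _, score in records]
    return all(a <= b for a, b in zip(scores, scores[1:]))

def check_sort_records():
    data = [("a", 2), ("b", 1), ("c", 2), ("d", 0)]
    result = sort_records(data)
    # Algorithm-independent: output is ordered and is a permutation of the input.
    assert is_ordered(result)
    assert Counter(result) == Counter(data)
    # Algorithm-dependent: ties keep input order only if the sort is stable.
    assert [name for name, score in result if score == 2] == ["a", "c"]
    return result
```

If the implementation moved to an unstable algorithm, only the last assertion would need a decision: drop it, or keep it as a requirement.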

Red -> Green -> Refactor

Unit tests tend to get more expensive with time (though not always). Other kinds of tests behave differently.
You say "do what makes you successful" after "I have never let my teams go full TDD".

If someone on your team is most successful with TDD, do you still not allow it?

re: writing 30% less code - I've found TDD can reduce my percentage of lines of code, as you suggested. Adherence to the "refactoring" part encourages that you reduce duplication, which in my experience has been easier to do with good test coverage.

You are correct. My use of the singular "you" was more directed toward people in their own projects. For the purposes of a team at a company, you can think of it as a collective "us". We do not use TDD.

I would say that the most successful teams i have been a part of focus not on automated testing but instead on other collective practices: informal code reviews, diff analysis of every commit, group discussion of database changes and collective manual testing of each other's code. Many people point to the refactoring (or initial code organization) as a benefit of TDD. I find these other practices tend to inspire a more collective ownership of the system. Additionally, and more importantly, they spur a lot of conversation around how and why to organize code certain ways. These learning opportunities are probably the most valuable among young and growing teams.

Sure, these practices can help build a healthy engineering culture, and I agree with them all, especially collective manual testing.

How do you keep people from doing TDD though? How do you even know they are doing it?

That's been my experience. I would add that TDD is the antithesis of "agile", since any changes you make to your product will require changes to the tests. Sometimes large changes.
That depends on the change and the tests. Ideally the changes to individual tests should be zero (for irrelevant tests) and changes to the test fixture should be minimal.

Having said that, unit tests in the wild have a tendency to be abominably written, so I'm not surprised a lot of people get frustrated changing the tests.

In my experience, the large changes to my tests were a result of having to make large changes in product behavior.
Only if you design your tests that way. Tests are just software. If a change to one part of a software system requires massive changes to another part of that same system, then the system is poorly designed. Indeed, that may be the very definition of poor design.

So if a change to your production code causes large changes to your test code, then one, or the other, or both are poorly designed. You have neglected the design. You have allowed couplings to proliferate.

so what's your approach to ensuring the software you deploy is correct?
How does TDD prove its correctness? TDD suffers from the same limitations as the code - it generally only covers what you could think of.

It's a powerful tool, but I think any belief that sufficient test coverage (in most common cases) actively proves correctness is misguided. In the general case, even full test coverage proves only that you've tested for the conditions you expect - but does nothing to verify the correctness of behavior in conditions you didn't expect.[1]

To me the benefits of TDD are three-fold:

1. It makes you think of what you're building in more detail before you build it.

2. The methodology puts heavy emphasis on short test-code cycles.

3. (Applies to any methodology that emphasizes coverage) You end up with an acceptable-to-great regression suite[2], and anecdotally it seems people do a better job of at least ensuring tests exist when required to by the methodology.

All of these things are equally possible without TDD. Short iterative cycles and additional forethought are perfectly possible without TDD, but they do require more discipline - it is harder to remember to stop after completing a small set of changes without a forcing mechanism.

[1] Rust is a possible exception here, still wrapping my head around it.

[2] The value of this regression suite varies greatly from project to project. A hint that tests are of low value / potentially high cost is that minor internal changes either break a large number of tests or reduce coverage to a noticeable degree, particularly in the absence of functional changes.

Correctness is by definition, you say what your code should do in various situations. You can then automatically verify that is the case ( using whatever method ). You need to capture "correctness" in some form. For things you can't think of, and then learn about, you add that stuff to your definition of correctness. I'm not arguing for TDD ( or against ), I'm asking if they don't use TDD, what do they do to capture correctness? I'm interested to know. TDD certainly doesn't try to cover correctness at all levels of software deployment, but it does try to capture fine grained correctness.

Sometimes correctness isn't that valuable compared to other criteria as faults can be quickly corrected and have minimal impact. But I think you need clarity about the tradeoffs you make.

EDIT: in some situations things can be corrected quickly.

"How does TDD prove its correctness? TDD suffers from the same limitations as the code - it generally only covers what you could think of."

Another limitation is that because the tests are (usually?) written by humans, the tests could also be wrong.

So you think your application works, but it turns out that both your application code and your test code were wrong and you didn't catch a bug at all.

That's the biggest drawback of any kind of tests - there is always the risk of testing the wrong thing, or not testing enough of the possible right things. And you're absolutely right - because the test is also code, it also can and often will have bugs.

When you write tests for what you're about to code, or what you just finished coding, it's a challenge to write a test that is not flawed in the same way the code is - because you don't know the flaw is there in order to test for it.

By thinking things out first you can decrease the number of these (and enforcing that discipline is a major plus for TDD), but for a typical non-trivial application without a lot of control over its inputs, it's close to impossible to do.

Tests are often reactive as well - because the changes that require them are reactive (bugs, new reqs, arch changes, etc). That doesn't detract from the value of them, but it's a limitation that explains well why even 100% coverage never stops the bugs from showing up.

I think I'd be happier with TDD if fewer people presented it as if it solved all the problems. TDD is a powerful tool, but tools are only as omniscient as the people who use them.

Testing validates that the program is correct according to some base of assumptions. You run into this a whole lot in embedded systems, where mocking hardware is difficult. You can't feasibly unit test against real hardware (most of the time), so instead you unit test against your assumption of what the hardware does and verify you respond according to requirements.

This has the benefit of proving the correctness of your assumptions, which makes it easier to debug once you insert it into the system and things inevitably are not 100% right. It gives you a way to reason about what your code does, what might be different, and then allows you to revise your assumptions and get your new solution in place and tested without the often long wait times to do manual testing on deeply embedded hardware.
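As a rough illustration of testing against an assumption of the hardware, here is a small sketch. Everything in it is hypothetical: `FakeThermometer`, the 0..1023 raw range, and the "three consecutive readings" requirement are made up for the example, not taken from any real device.

```python
class FakeThermometer:
    """Stand-in for real hardware.

    Encodes our ASSUMPTION that the device returns raw ADC counts in 0..1023,
    one per call. If the real part behaves differently, this is what to revise.
    """

    def __init__(self, readings):
        self._readings = iter(readings)

    def read_raw(self):
        return next(self._readings)

def overheating(sensor, threshold_raw=800, samples=3):
    """Hypothetical requirement: alarm only if `samples` consecutive
    readings are all above the threshold."""
    # all() stops at the first reading that is not above the threshold.
    return all(sensor.read_raw() > threshold_raw for _ in range(samples))

def check_overheating():
    hot = FakeThermometer([801, 900, 1023])
    glitch = FakeThermometer([801, 700, 900])
    return overheating(hot), overheating(glitch)
```

When the code misbehaves on the real board, the fake is a written record of what was assumed, so the first question becomes which assumption in it is wrong.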

Sometimes traditional TDD is the answer, sometimes simulations are the answer, and sometimes you need to just get out of your chair and test it out. It is a tool!

As someone I know likes to say, write better code?

It's a given that software needs to be tested. The processes around that are a classical "it depends" question.

Most likely any software you deploy will never be "correct" (whatever that means). The quality of that software depends on many variables and it's up to you to try and tweak them while optimizing for things like cost, time etc. Whether it's worthwhile to write the tests ahead of time or after the fact or to do them manually or automatically or any other permutation is just not a question that can be answered in a way that applies to all situations.

>It's a given that software needs to be tested. The processes around that are a classical "it depends" question.

Yes. What you need to do to ensure the correctness of flight software for a jet fighter is entirely different than what you need to do to ensure the correctness of an internal tool at your company that automates a task for which you already have a manual process.

That's like saying, if you're poor, you should buy more money.
Having a good type system would be a good start :)
Don't upset the scripters ;).
I agree :)
You can use automated unit and integration testing without doing "TDD". The overhead isn't even particularly high; you have to test any piece of code you write anyway, so you might as well put in a tiny bit of extra effort and have unit testing.
"TDD" is not the same thing as "having unit and integration tests".
I think that this all goes straight back to the old "mockist TDD vs classical TDD" debate.
could you elaborate? i think this is my ignorance. The one shop that I worked in that insisted on 'mocks' meant that i wrote some code, then ran that code on some inputs, recorded the outputs, and then wrote a harness which validated that those inputs matched the outputs.

which meant that changes to the code might result in a failed mock, but didn't say anything about coverage or correctness. i can't imagine a more useless testing strategy.
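For readers who have not seen that kind of harness, a minimal sketch follows. The function `price_with_tax` and the recorded table are invented for illustration; the point is only that the recorded outputs pin whatever the code did on the day they were captured.

```python
def price_with_tax(amount_cents):
    """Hypothetical code under test: add 8% tax, rounding down to whole cents."""
    return amount_cents + (amount_cents * 8) // 100

# Outputs captured by running the code once and writing down what it returned.
RECORDED = {0: 0, 100: 108, 250: 270, 999: 1078}

def mismatches():
    """Return (input, recorded, actual) for every input whose output changed."""
    return [
        (value, expected, price_with_tax(value))
        for value, expected in RECORDED.items()
        if price_with_tax(value) != expected
    ]
```

An empty result only says the behaviour has not changed since recording. If rounding down was a bug on day one, the harness faithfully protects the bug.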

is that what mockist TDD is commonly understood to be?

Yeah, that is.

In what's often termed the "classic" approach, you instead lean toward writing more coarse-grained tests, and you don't shy away from integration tests. You don't avoid mocks, but you tend to prefer saving them for situations where it really is hard to force a collaborator to behave in a certain way. (I also try to stay on guard for the possibility that those situations are code smells indicating that your implementation is getting to be too complex and is due for a refactor.)

IMO, the main argument against classical TDD is that you tend to get a suite of tests that runs more slowly and has more dependencies on external resources such as the database.

IMO, the main arguments against mockist TDD are that you end up with test-induced design damage, and a brittle suite of tests that makes your codebase resistant to refactoring.
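To make the contrast concrete, here is a small sketch of the same behaviour tested both ways. `register_user` and `InMemoryRepo` are hypothetical names made up for this example; `Mock` and `assert_called_once_with` are from the standard library's `unittest.mock`.

```python
from unittest.mock import Mock

class InMemoryRepo:
    """Simple real collaborator used by the classical-style test."""

    def __init__(self):
        self.rows = {}

    def save(self, key, value):
        self.rows[key] = value

    def load(self, key):
        return self.rows[key]

def register_user(repo, name):
    """Hypothetical code under test: store a user under a normalized key."""
    display = name.strip()
    key = display.lower()
    repo.save(key, {"display": display})
    return key

def mockist_test():
    # Asserts on the interaction: exactly which method was called, and how.
    repo = Mock()
    register_user(repo, " Ada ")
    repo.save.assert_called_once_with("ada", {"display": "Ada"})
    return True

def classical_test():
    # Asserts on the outcome: what can be read back afterwards.
    repo = InMemoryRepo()
    key = register_user(repo, " Ada ")
    assert repo.load(key) == {"display": "Ada"}
    return True
```

If `register_user` were refactored to write through some other repository method with the same end result, the classical test would still pass once the repository supported it, while the mockist test would fail because it encodes the call itself. That is the brittleness being described; the price of the classical style shows up when the collaborator is a real database rather than a dictionary.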

I'm definitely in the TDD-is-not-a-one-size-fits-all programming style camp and I'm glad to see a study that supports that conclusion. I was at RailsConf when DHH said his bit in 2014. My office and I followed the subsequent debates between him and Kent Beck (since my dev group was largely pro-TDD). Lots of anecdotal arguments. It's nice to see some more quantitative data on this!

In my programming experience I've found that I prefer to write tests AFTER I do development of a new feature. Oftentimes the implementation is in such flux that continually updating the test as I go along is tedious and kills the creative flow.

However, when it comes to fixing bugs in existing software, I find it more helpful to write a test that duplicates the bug FIRST, then code the solution.
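A minimal sketch of that bug-first workflow, using a made-up `average` function. The assumption here is that the original version had no empty-list guard and crashed with `ZeroDivisionError`; the test was written first, seen to fail, and then the guard was added.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers.

    Hypothetical history: the first version divided by len(values)
    unconditionally and crashed on an empty list.
    """
    if not values:  # the fix, added only after the regression test failed
        return 0.0
    return sum(values) / len(values)

def test_empty_list_regression():
    # Step 1: written BEFORE the fix, reproducing the reported bug.
    assert average([]) == 0.0

def test_normal_case_still_works():
    # Step 2: guards against the fix breaking existing behaviour.
    assert average([2, 4, 6]) == 4.0
```

The useful part is watching `test_empty_list_regression` fail against the old code, which confirms the test really exercises the bug rather than something nearby.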

If anything, the reason to recommend TDD is simply to enforce writing tests to begin with. It's so easy to get a feature working and gloss over testing it.

EDIT: What's up with liquidise's statement about commenting on TDD stories being bad practice? Do the TDD fanatics downvote to hell everything anti-TDD?

> Do the TDD fanatics downvote to hell everything anti-TDD?

I didn't mean to criticize either stance with the statement. I said that because i find most TDD threads on HN get very heated, with commenters being highly polarized and entrenched in their opinions. I've avoided commenting on them in the past because of this. But i am happy the discussions under this story are a great deal more civil and informative.

> Do the TDD fanatics downvote to hell everything anti-TDD?

Did you notice the bias in that question?

I'm sorry the sarcastic inflection does not come through the online text.
I read the article, but I haven't read the study. The article seems to show that the study is useless - it proves nothing, it's neither pro-TDD, nor anti-TDD. Taking a seemingly strong side of an argument supported by the study, but not commenting on the linked article itself, can explain the downvotes.
If the study showed anything, it was that TDD and TLD might be similarly useful at least on short timescales. I think GP was saying that they believed tests were useful, but that it was unnecessary to strictly stick to doing them the TDD way.
So I read the study. Robert Martin is correct that you can't make huge generalizations from the study. But the importance is that someone is actually attempting to get quantitative evidence in an argument that has been largely anecdotal for the past several years. As flagrant as DHH's keynote was, back in the day, I think he was right to compare TDD to a fad diet. Yeah, there are people that will absolutely swear by it. But we really won't know until actual studies are done.
Before I go in, I will state that in 99% of the times I'm a TDD hater. Actually I don't even like writing tests after the fact because I just like to build and move on.

I could never understand why the Ruby on Rails tutorial insisted on walking newbies through TDD, and when I started learning Rails I skipped all the chapters where they start talking about tests. I still think it's a bad idea to make newbies do all the weird TDD stuff when they don't even know how to build something.

I'm so opinionated about this that most people around me know this. And in most cases it works without needing to write any tests. And even if something fails, I can quickly patch it. As long as I wrote the app in a nicely modular way, I've not had much problem.

That said, right now I'm working on writing a JS library. And believe it or not, I AM doing TDD right now. I can't believe it myself.

I think in cases where the logic involves a lot of intricate details, it's impossible for me to write something without writing tests. I'm not talking about simple web apps. I'm talking about stuff like: template engine, parser, etc.

My current setup: I write a test and document it before I write a function. That way I don't get carried away while implementing and know exactly what I'm trying to build. Then I write another function that utilizes that function I just wrote, and so forth. This way, when the next function doesn't work for some reason, I know exactly where something went wrong. Instead of going back and debugging every single function used along the way, I know it's the most recent one that's causing the problem.
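A small sketch of that layering, in Python rather than JS and with invented names (`tokenize`, `render`, and the `{{ name }}` syntax are assumptions for the example, not the library being described). Each function has its test and docstring first, and the second function is built only on the already-tested first one.

```python
def tokenize(template):
    """Split 'Hi {{ name }}!' into [('text', 'Hi '), ('var', 'name'), ('text', '!')].

    Raises ValueError if a '{{' is never closed.
    """
    tokens = []
    pos = 0
    while pos < len(template):
        start = template.find("{{", pos)
        if start == -1:
            tokens.append(("text", template[pos:]))
            break
        end = template.find("}}", start + 2)
        if end == -1:
            raise ValueError("unclosed '{{'")
        if start > pos:
            tokens.append(("text", template[pos:start]))
        tokens.append(("var", template[start + 2:end].strip()))
        pos = end + 2
    return tokens

def render(template, values):
    """Replace each {{ var }} with values[var]; built on the tested tokenize()."""
    return "".join(
        str(values[item]) if kind == "var" else item
        for kind, item in tokenize(template)
    )

def test_tokenize():
    expected = [("text", "Hi "), ("var", "name"), ("text", "!")]
    assert tokenize("Hi {{ name }}!") == expected

def test_render():
    # If this fails while test_tokenize passes, the fault is in render().
    assert render("Hi {{ name }}!", {"name": "Ada"}) == "Hi Ada!"
```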

So my conclusion: you probably don't need to write tests for all your stuff, but there are indeed cases where you will NOT be able to proceed without writing tests.

Once you release a product that will be used by many customers and developed by many people throughout its lifecycle, who come and go as time passes, you won't be able to maintain/extend it without a proper testing suite. It's not only about complexity, but also about maintainability. Some tests will also rot in time.
Agreed, in my case what ends up happening is I start out with no tests but soon the project becomes huge and i have to start writing tests if I want to push stuff without fear.

The thing is, most large companies have a QA team so this fear is not super tangible to many developers. And small startups are more focused on building stuff quickly (which they should be).

I think this is why this topic has been polarizing. Some people feel the need and some people don't, depending on which role you're playing in your organization.

Nowadays interestingly, even the large companies are moving towards more testing because they can cut QA costs that way.

Yeah... QA teams aren't really the norm anymore. Not to mention that classical QA isn't nearly as effective when developers aren't also performing their own QA.

Small startups that focus on quickly building stuff have to decide whether to take on more technical debt in order to get something small out the door quickly, or settle on a maintainable velocity over time. Some begin at the former and move quickly to the latter once their MVP is out the door. Some never get to move because they sink under the weight.

> most large companies have a QA team

Not sure what companies you are referring to here. Google and Amazon do not have a QA team for most of their dev teams.

I worked at Amazon and now work at Google.

There are many other large companies other than Amazon and Google.

Just because Google and Amazon don't have QA teams doesn't mean the occupation doesn't exist.

I do not intend to say QA as a type of engineering position does not exist. What I was responding to is the parent's claim that "most large companies have a QA team so this fear is not super tangible to many developers"; that's not true in general.
Oh man, it is really funny that he ends it with telling people to read the study.

To clarify the linked study is attempting to replicate https://dl.acm.org/citation.cfm?id=1070834, THE seminal study in Test Driven Development. Well to be more precise it was replicating an existing replication of that study which failed to replicate the original results. They were trying to modify the design so as to account for issues in the experimental design that may have led to the replicated study being inconclusive.

This is significant because if you were not aware of the failed replication, and believed that TDD was supported scientifically as more productive because of that original study, then you SHOULD be reconsidering its place in your development process. If that isn't the case, your opinion is unchanged by these particular results (even in the article inspiring this one the author admits that their opinion was already based on a much more thorough analysis, see: http://neverworkintheory.org/2016/10/05/test-driven-developm...).

Now what I want to know is why people insist on writing articles in this awful conversation format. It wastes a lot of words to make a simple argument poorly.

Probably because they have a sense of humour, and the conversational style makes it more entertaining to read...?
Is it entertaining or funny if you identify with the character who is talking down? I mean this particular example doesn't seem to contain any jokes from my reading.
I use that style, from time to time, because I like it. ;-)
tl;dr = someone did a study that used a methodology that confirmed that working in small chunks and writing tests as you go is good, but that it's not very important if you write the tests before the small chunk of code or after the small chunk of code.
Aren't tests supposed to be a tool to help design an API? From that perspective a test should be written first. The problem IMHO is the choice of methodology, as there are several kinds of tests. Some may be more time consuming when it comes to the setup.
Personally, I think unit tests shine best when you're designing an API. I can swing from hate to love and back about TDD in minutes, but when it comes to thinking about how your code will be used, unit tests (did we stop using that term?) are a tremendously useful tool I have.

I guess if all code written could be seen as an API, TDD would be great, but that's not the world I live in.

This is why I prefer free to implement open standards that are (hopefully) well designed and specific.

Interoperability and interchangeability of parts means that it's possible to validate an implementation to at least some degree.

The best example that I can think of off the top of my head is the OpenGL 4.4/4.5 work that is nearing conformance for the Mesa3D project ( https://en.wikipedia.org/wiki/Mesa_(computer_graphics) ); while the functional coverage for the main modern drivers is 'complete' the official conformance testing has already resulted in some bug fixes and additional areas to focus on improving.

That real life case study is yet another example of how an API and conformance tests built around that API result in better code and in a more consistent experience that isn't dependent upon a mono-culture implementation.

> I guess if all code written could be seen as an API, TDD would be great, but that's not the world I live in.

If not an "Application Programming Interface", isn't all code an Interface? There's input and there's output.

With Object Oriented programming, that there is an interface is more explicit (even if all you're doing is implementing objects that are already tested). There are function call argument (type) specifications (interfaces) whether it's functional or OO.

Unit tests can help with designing a testable API, not necessarily a usable, performant, secure, etc API.

Design remains design, there is no quick implementation trick that makes it simple.

Confirmed that if you work in small chunks, it does not matter if you write the tests first or last.

That part about it being good to work in small chunks isn't there.

In all discussions about TDD, it is important to distinguish between having an automated test suite for your code which is run frequently, and writing your code test first - which is what TDD is, by definition.

It is possible to advocate for the former, while thinking the latter is consultantware snake oil. (my position, fwiw)

Yeah, I agree. TDD also means you cannot write a single line of code without first writing a failing unit test, though. Well, I'm not a fan of that either.

I mean, sometimes I like writing an acceptance test to begin with and work inwards.

People sometimes interchangeably use "TDD" for "testing". Also, just because you do TDD, doesn't mean your code can be great, I've seen people assert pointless things and the unit has gotten so small that people now define them as methods in classes. Which I also think can lead to some crazy maintenance suite of tests.

If you're interested, this was a fun discussion about TDD between two professionals: https://www.youtube.com/watch?v=KtHQGs3zFAM Jim Coplien and Uncle Bob.

The author was going somewhere when he began writing about what a developer is thinking about, but, perhaps because he was focused on vindicating TDD, he did not arrive there.

A developer who is writing unit tests must have a good idea of the purpose of the target of the tests, so she is thinking about requirements. Furthermore, if she is writing unit tests for small components (which will often be the case on account of everything being done in short cycles) then a lot of that purpose is contingent on other aspects of the design and how it is all supposed to work together: in other words, she is thinking about design.

If you don't spend some time thinking ahead about big-picture requirements and design issues, you are in danger of going a long way down a dead end.

I thought that TDD morphed into ending up with a regression/integration/conformation test suite instead of using tests as specifications written prior to writing products. And even 100,000s of tests won't help you in very advanced applications like cloud/cluster infrastructure as sometimes it's simply too difficult if not impossible to come up with tests (imagine observer effect when your cluster deadlock happens only in certain rare nanosecond windows and adding a testing framework will make you miss those windows and the problem never happens) and people with mental capacity capable of writing them (e.g. Google/FB-level) are better utilized in writing the product itself.
Why do you think that debugging and fixing a deadlock in a cluster in production (of a multibillion business) is easier and cheaper than writing a functional test case? Maybe you just prefer trips to an angry customer versus boring office work. :-/ http://www.reuters.com/article/us-nasdaq-halt-tapec-idUSBRE9...
The thing is that there are problems we simply can't solve in theory nor in practice, yet we use approximate solutions all the time - and that is the case of advanced distributed algorithms. In theory, we simply can't handle real-world asynchronous systems. And when we pretend we have partially synchronous systems and build abstractions around them, they aren't 100% working. Now add in some complex bugs (like getting a distributed deadlock in transacted system involving exactly 7 nodes but not less nor more) and you might start understanding why functional test case might not really be an option to avoid these issues (you can obviously write them but they won't really help you). I worked on such a system, we had 100,000s of tests yet they were clustered around known issues and not issues that happened when e.g. a node went down and up, data were out of sync and sockets between nodes were becoming full due to OS' performance limitations. And moreover, many of these issues start showing up only when you push throughput to the max, e.g. during trading spikes etc. and adding a test that checks invariants would lower the throughput and those issues simply won't show up anymore.
Each test case increases confidence in the system by a small amount. Confidence can never reach 100% (because we would need to predict the future to achieve that), so no amount of testing can give you 100% confidence, only 80%, 95% (2x price), 98% (4x), 99.5% (8x), 99.95% (16x), 99.995% (32x), and so on. That's your message, right?
My message is more like your confidence after writing hundreds of thousands of tests might be just 50%. From my own experience, every single bad case that can happen in a complex system will happen at some point at some customer, wrecking their system and costing them potentially millions, in serious trading bugs even leading to a bankruptcy. Your testing suite won't catch these initially but reactively when you add that test case to your regression suite after bad things happened. In complex systems, tests are just a heuristic for quality, not really something you can rely on (but it's way way better to have them than not). Often tests are clustered around low-hanging fruit or around parts of system used by developers or initial customers and any deviation in usage patterns can cause an outbreak of new, unexpected incorrect situations. Similarly, proving correctness using some formal verification tools might increase your confidence, but won't give you 100% either, as we simply can't model reality properly even within our own frameworks :(
In such cases, I use "torture" test cases: lengthy, random test cases which try to abuse and overload the system with no data, incorrect data, random data, huge data, high latencies, duplicated messages, missed messages, random aborts, random spikes, etc. They allow me to discover situations not covered by test cases. I also try to use underpowered hardware for such testing. Of course, I cannot imagine all possible torture scenarios, but I have seen a lot of bug and security reports, so I still know a lot of scenarios, more than I am willing to write tests for.
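A very reduced sketch of the random-data part of such a torture test. The parser `parse_pairs` is hypothetical and stands in for the system under test; the seed is fixed so that any failure can be replayed. Latencies, dropped messages, and underpowered hardware are outside what a snippet like this can show.

```python
import random

def parse_pairs(text):
    """Hypothetical code under test: parse 'a=1;b=2' into {'a': '1', 'b': '2'}.

    The contract assumed here: malformed input raises ValueError and nothing else.
    """
    result = {}
    if text == "":
        return result
    for chunk in text.split(";"):
        key, sep, value = chunk.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed pair: {chunk!r}")
        result[key] = value
    return result

def torture(runs=2000, seed=1234):
    """Feed random strings to the parser and collect contract violations."""
    rng = random.Random(seed)  # fixed seed makes failures reproducible
    alphabet = "ab=;1 \x00"
    failures = []
    for _ in range(runs):
        length = rng.randint(0, 40)
        text = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            parsed = parse_pairs(text)
        except ValueError:
            continue  # rejecting garbage cleanly is acceptable
        except Exception as exc:  # any other exception breaks the contract
            failures.append((text, repr(exc)))
            continue
        if not all(key for key in parsed):
            failures.append((text, "empty key accepted"))
    return failures
```

The checks are invariants ("never crashes with anything but ValueError", "never accepts an empty key") rather than expected outputs, which is what lets the inputs be random.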
How do the people who write them know that they work?
> How do the people who write them know that they work?

Test first isolates that a given test doesn't already pass (without any additional code).

Test after (but before committing) also seems to require a more thorough critical analysis.

And then someone finally fuzzes the code.

TDD presents a paradox that requires split-brain thinking: when writing a test, you pretend to forget what branch of code you are introducing, and when writing a branch, you pretend to forget you already knew the solution. It is annoying as hell.

You CAN indeed cover all your branches with tests afterwards. You can even give that a fancier name, like "Exploratory Testing". Of course it may be more boring or tedious, but is a perfectly valid way to ensure coverage when needed.

TDD was great for popularizing writing tests first; however, I much prefer the methodology called CABWT - Cover All Branches With Tests. Let the devs choose the way to do it, because not everyone likes these pretend games.

TDD requires you to write FUNCTIONAL tests first, not the unit tests you are talking about.
I was commenting on the methodology as I heard and watched it explained by the author (Robert C Martin), as well as the way it was presented in his videos.

TDD workflow is fine; it's not thinking about the pink elephant (the source code) idea that bugs me.

Robert Martin is one of the authors of the Agile Manifesto.

https://www.quora.com/Why-does-Kent-Beck-refer-to-the-redisc...

The original description of TDD was in an ancient book about programming. It said you take the input tape, manually type in the output tape you expect, then program until the actual output tape matches the expected output.

+1. TDD could be considered as a derivation of the Scientific Method (Hypothesis Testing).

https://en.wikipedia.org/wiki/Scientific_method

https://en.wikipedia.org/wiki/Hypothesis

Test first isolates out a null hypothesis (that the test already passed); but not that it passes/fails because of some other chance variation (e.g. hash randomization and unordered maps).

https://en.wikipedia.org/wiki/Null_hypothesis

... https://en.wikipedia.org/wiki/Test-driven_development
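To make the hash randomization point concrete, here is a small Python sketch. The function names are hypothetical. The fragile check depends on the iteration order of a set of strings, which Python does not guarantee and which can differ between interpreter runs; the robust check compares contents only.

```python
def unique_tags(tags):
    """Return the distinct tags. A set has no guaranteed iteration order."""
    return set(tags)


def fragile_check(tags):
    # Depends on set iteration order. For strings this can change from one
    # interpreter run to the next because str hashes are randomized
    # (see PYTHONHASHSEED), so the result is partly down to chance.
    return list(unique_tags(tags)) == ["a", "b", "c"]


def robust_check(tags):
    # Sorting removes the dependence on iteration order.
    return sorted(unique_tags(tags)) == ["a", "b", "c"]
```

Seeing `fragile_check` fail once before writing the code, or pass once afterwards, does not rule out that the outcome came from ordering rather than from the change under test.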

+1 Right on the spot.

TDD requires you to draw your target first, then hit or miss it with the code, like in science: hypothesis -> confirmation/rejection via experiments -> working theory.

But in practice, a lot of coders hit a point first and then draw the target around that point, like in fake science: we flip a coin 100 times, the distribution is 60/40, our hypothesis is that a random coin flip has a 60 to 40 ratio, our hypothesis is confirmed by experiment, huge savings, hooray!

The author is only partially right about TLD being "doing TDD in your head", since (at least for me) it takes the much more abstract form of a general idea, a concept, of what I want to achieve. When using TDD you need to come up with the very specific results that you will test, and you then need to implement those specific tests, to the last line of code. This means that if you make any changes to the logic afterwards, you need to throw away your pre-written tests and write new ones; the time spent on writing them was wasted. TLD is much more flexible and easier to update: no code is thrown away if you change something. Before I start I just need to decide what I'm trying to solve with my current block of code, and then I later write a test to check if I did it properly. Then I do the next block of logic, and the next test. Since code blocks are directly related to the steps in my logic, it's very natural to come up with the tests for them: just test if the things work as you planned. If in the middle of that work I realize that I need to do something in a completely different way, there are no pre-written tests, so no time was wasted on coding tests that were never going to be used. And, at least for me, this kind of situation happens a lot; I often refactor and improve things as I work on them, so for me TLD is a much more suitable approach.
tl;dr the recent studies found that whether you test first or last doesn't matter, provided you're frequently flipping between writing a test and writing code.

The author thinks that TDD is preferable because it helps you maintain discipline.

I personally think it's worthwhile besides that because it means you design the API before implementing, meaning it is cheaper to fix API design mistakes. IIRC this aspect wasn't actually tested in the studies (API signatures were given up front).

And that's one of the things TDD gives you - as you write the test, you have to use the API. If that's painful or even just awkward, it's telling you something...
Except that it is not real use.

Might be better than nothing, but if you are designing a reusable API, you'll be better using it on some real code.

>Might be better than nothing, but if you are designing a reusable API, you'll be better using it on some real code.

This is generally why I take the top down approach to coding:

1) Write ultra high level test

2) Write code that implements it.

3) In that code if I need something lower level, write the API that I want

4) Write a lower level test that mirrors what I just wrote in real code.

5) Implement the code that passes that test & use it at the higher level.

etc.
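The five steps above could look roughly like this in Python. The invoice example and every name in it are hypothetical, chosen only to show a high-level test leading to a lower-level API; amounts are integer cents to keep the arithmetic exact.

```python
# Step 1: ultra high level test, written first.
def test_invoice_total():
    lines = [("widget", 2, 500), ("gadget", 1, 1000)]  # (name, qty, cents)
    assert invoice_total(lines) == 2000


# Steps 2 and 3: implement it, calling the lower-level API as I wish it
# existed (line_total is not written yet at this point).
def invoice_total(lines):
    return sum(line_total(qty, unit_cents) for _, qty, unit_cents in lines)


# Step 4: a lower-level test that mirrors the call made in the real code.
def test_line_total():
    assert line_total(2, 500) == 1000


# Step 5: implement the lower-level code so both tests pass.
def line_total(qty, unit_cents):
    return qty * unit_cents
```

The signature of `line_total` is dictated by how `invoice_total` actually calls it, which is the sense in which the lower-level API sees "real use" before it is implemented.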

> Might be better than nothing, but if you are designing a reusable API, you'll be better using it on some real code.

Agreed, this is why tests should be driven from the outside in.

But at the unit level, tests are often the first contact with production code. Tests are dumb and setting up complicated stateful worlds is painful and tedious.

It becomes easier to simplify the production code to simplify the test code, than to just write the first thing that comes to mind.

Let's say I have a new theory, called Understanding Driven Development. The system says:

It's a bug if someone needs to change code and they, at any moment, see code they don't understand. Stumbled into the wrong place? Bug filed for better notes on organization. The code you need to touch not understood? Understand what you see before you make a single change. If you change code and don't update docs, or documentation and code out of sync? It's a bug, and changing one to match the other _without detailed understanding_ is a bug too!

Now, that seems reasonable. And if a study comes out and says people can't make program changes faster, on average, when participants are given identical code but with more (accurate and non-trivial) comments, that doesn't mean UDD doesn't work. It doesn't test it on real, full size applications. The code was the same, even though clarity of code is one of the goals of UDD -- one of the core claims is that UDD gets you better code to begin with. It focuses on a tiny test of something not necessarily core to the UDD mindset.

But it's evidence that at least one claim I've made is false. In fact, that study would be enough for me to throw that idea set into the garbage.

I work at ThoughtWorks. TDD is central to everything that we do. That said, like anything else, TDD done to an extreme is probably a bad thing (too much time spent on tests rather than implementation) and it not being done at all is also usually bad (too much time fixing bugs that could have been caught by tests written beforehand).

Balance is key.

Rather than write tests, I prefer to plan out the interactions between all components in large projects. This will show you how all the pieces interact and what cases you need to handle in each functional unit. After this, I sit down and write all the code.

After I know the organization of the source, I write out each functional unit of the code one at a time. As I go, I write each bit of test code for my source. After this I integrate every functional unit.

If a change is needed, I go back to the drawing board and find a better overall organization. This happens often due to either performance constraints or the need to abstract a section further.

After this I'd consider embedding a unit test suite.

Works great for small to medium projects.

Previous 300+ comment thread which referenced the actual paper, not a blog post about it:

https://news.ycombinator.com/item?id=12740456

For those like me who enjoy HN but aren't s/w developers:

https://en.wikipedia.org/wiki/Test-driven_development

Careful -- in these studies the subjects are writing tests before writing code.

In practice there are 'test-heavy' devs who use factory data and the test suite to run skeleton code with crashpoints, and switch actively between test and imp files.

This has tests & implementation being written in parallel vs strict TDD which has us finishing tests before writing program logic.

Most test suites depend not just on functional requirements but also on implementation details, so it seems obvious that tests-before-logic development is inefficient.

TDD requires you to write a FUNCTIONAL test case first for every new feature. A functional test case should not depend on implementation. Integration and unit tests do.
I must be one of the very few people who can write working and mostly bugless code and without writing any kind of test. Writing tests feels like the most wasteful and possibly harmful thing to me (like by people forcing dependency injection etc. where otherwise unneeded).

I don't really know what to think of the situation? Is this how it has always been? Do most software engineers really have no idea what they're doing?

> I must be one of the very few people who can write working and mostly bugless code and without writing any kind of test.

Or you may be using a different definition of "mostly bugless" than the rest of us.

I do gamedev. The ability to patch post-release is not a given, even today, for all platforms. Crashes, corruption, progress blockers, etc. are all VERY BAD in this environment.

I see below you're writing network code in C. I don't suppose you've done any fuzz testing? Run with address sanitizer? Static analysis? We live in a world of exploitable 1-byte buffer overflows. Maybe not such a big deal for a throwaway blog server, but perhaps a bit scarier if you might be facing HIPAA fines, or running industrial equipment.

A very important note here: Mostly bugless as far as you're aware and mostly bugless in actuality are two very different things. Without testing, I'm not sure how you can have any confidence that you're in the latter camp.

For my hobby projects I admittedly don't do much testing - simply because no one is paying me to. Only in rare cases do I intentionally write a test when something seems complex. And I do run address sanitizer etc. where that can reasonably be done (e.g. not on microcontroller code). Anyway my point is, my code has considerably fewer bugs than what you'd get from reasonably proficient programmers even if they DID write tests.

For example, I'm currently writing a TCP/IP stack for embedded systems [1]. While it's not quite complete yet (it is missing some essential code like fragmentation and congestion control), I'm very confident that it has (and will have when complete) far fewer bugs than related portions of lwIP; see for yourself all the bugs I've found in lwIP [2].

Again feel free to find bugs in my code. I very much appreciate people pointing out bugs, as it helps me make even fewer bugs :)

> We live in a world of exploitable 1-byte buffer overflows.

Indeed. But buffer overflows are so easy to avoid, just don't write over the end of the buffer. I doubt I've done a buffer overflow in years. The bugs that I do make, are much more complex.

[1] https://github.com/ambrop72/aprinter/tree/ipstack/aprinter/i...

[2] https://savannah.nongnu.org/bugs/index.php?go_report=Apply&g...

How do you KNOW your code is working and bugless?
That's a good question. There are different ways:
- Play around with the program intelligently and observe no bugs (you wouldn't believe how many bugs I've found by that method, that have been missed by very formal verification processes).
- Don't call it done until you have proven to yourself that the code has no bugs. This is an informal process of self-code-review but which involves quite rigorous thinking about the behavior of the code. Often assertions are involved (which all have to be proved).
> Play around with the program intelligently and observe no bugs

What if the computer encoded your knowledge?

> Don't call it done until you have proven to yourself that the code has no bugs.

What happens when the software changes? Do you repeat every single desk-checking exercise to ensure nothing has broken?

Do you even remember every click, every experimental input?

Can you prove that you do?

> This is an informal process of self-code-review but which involves quite rigorous thinking about the behavior of the code.

I trust that smarter developers than me are smarter developers than me.

But I am dumb. I assume that the code is smart and that my mental simulation of the code, which my brain helpfully and invisibly patches on the fly, is correct.

But my mental simulation is frequently wrong. So I wrap myself in explicit statements of what I think the code does. Then I make those explicit statements executable. And then I run them frequently.

And frequently, I realise again that I am dumb and I should leave the flawless coding to others.

> What if the computer encoded your knowledge?

> Do you even remember every click, every experimental input?

> Can you prove that you do?

I agree some tests are a good idea depending on the project. Doesn't mean I have to like writing them!

> I assume that the code is smart and that my mental simulation of the code, which my brain helpfully and invisibly patches on the fly, is correct.

I try not to assume things until I've constructed associated proofs in my mind (and sometimes written them into comments). In fact keeping in mind what you've established (proven) and what not is a very important thing. Most of the bugs I've done are because I've simply forgotten to think about / prove something.

It's a completely different way of programming!

> Then I make those explicit statements executable. And then I run them frequently.

But I prefer to write down these explicit statements in the code itself, often as assertions. I can then prove them right on the spot!
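A minimal Python sketch of what "explicit statements in the code itself" might look like. The `clamp` function is hypothetical and only serves to show a precondition and a postcondition written as assertions.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    # Precondition: the interval must not be empty.
    assert low <= high, "low must not exceed high"

    result = max(low, min(value, high))

    # Postcondition: the statement I believe (and should be able to prove)
    # about the result, checked on every call.
    assert low <= result <= high
    return result
```

Note that CPython skips `assert` statements when run with `-O`, so in-code assertions of this kind check the reasoning during development rather than guard production behavior.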

You know, I used to be like that. Then I had the revelation that in the time it took to click test something 4 or 5 times I could write test code that could click test in less than 100ms forever.
> Play around with the program intelligently and observe no bugs

That's the most sophomore thing I have read in a long time!!!

Probably works OK if you are working alone in a simple product whose whole code mostly fits inside a single brain. Try doing that as part of a team of dozens that make daily changes to a code base of millions of lines and you will very soon earn the title of the most infamous person in the office.

BTW, can you elaborate on the meaning of "sophomore"? I can't find in a dictionary anything other than "second in a series" which doesn't seem to make sense in your use.
Sophomore is the name of the second year in American high schools, so he's basically calling it a comment someone inexperienced would make
Right. In large projects some kind of automated tests are indeed useful. But they should not be a substitute for writing working code in the first place, just as an extra level of quality assurance.
Just compare web design before ACID and after ACID. All browsers had working code, but web design was a nightmare.
How do you prove to your peers that you have done all of the above? How is your code able to pass peer review??
Well, I haven't claimed to be able to do that, but to write mostly bugless code :) Though, writing insightful comments in the code helps. If anyone has doubt, you explain in a friendly way why they're wrong (if they are). Though, not all reviewers will study the code sufficiently to find possible bugs, and also many findings are not the right/wrong kind but more about style/architecture.
Since you're actually writing tests (you say you're doing checks, which are effectively tests.) Why not just add them to a test suite?

You win twice, or more.

You get proof of the assertions you're making to your peers, you get regression tests to cover your code when it's being refactored, you also get to strongly document the intent of your code so that others can know it deeply, relatively quickly.

See devs like yourself who claim they aren't writing tests, but in essence they are, the only difference is they're not persisting their tests, and losing their value beyond initial validation of the correctness of the code.
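As a sketch of persisting such checks, here is how a few things one might try by hand could be written down once with Python's standard `unittest` module. The `parse_port` function is hypothetical; it stands in for whatever code was being checked manually.

```python
import unittest


def parse_port(text):
    """Hypothetical function under test: parse a TCP port number."""
    port = int(text)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTest(unittest.TestCase):
    # Each method records one check that would otherwise be done by hand
    # and then forgotten.
    def test_accepts_typical_port(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_rejects_zero(self):
        with self.assertRaises(ValueError):
            parse_port("0")

    def test_rejects_non_numeric(self):
        with self.assertRaises(ValueError):
            parse_port("http")
```

Saved in a file, these can be rerun with `python -m unittest` after every change, which is where the regression value comes from.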

I also do all of that AND write test cases. However, I never use a debugger. :-/
Your performance as a programmer is poor, but as QA you are excellent.
You mean me personally? What makes you think so?
Because you said that you can test a program manually faster than you can write an automated test case for it. I'm the opposite. Humans must think, computers must work.
I can write tests. I just prefer not to, for the sake of productivity and preserving sanity. If you disagree that I can write mostly bugless code, I challenge you to find some bugs in my open-source projects :)

https://github.com/ambrop72/aprinter

https://github.com/ambrop72/badvpn

Your code will not pass my peer review. ;-)
You can also write "Methodology X doesn't always work". All methodologies work well for some situations and for others they don't. In my view TDD is great for a lot of simple things and algorithms, and you can structure your code in a way that most of the code is inherently testable. But when things are so complex that you don't even know the correct architecture upfront, TDD is a killer.
If things are that complex, sounds like you need to be doing some discovery work (spikes) first to break the problem down. Then you can use TDD again :-D. So, I guess you're right - if you don't know wtf you're supposed to be doing, TDD is a killer. But then, so is anything else.
That blog post could pretty much apply the same arguments to itself. And who knows if Bob's experience is simply correlation not causation. Perhaps Bob is just a smart, meticulous engineer, and it wouldn't matter how he went about his dev work, the quality may be good regardless.
Test-first was promoted as being the secret sauce that made TDD so much better than anything else, so this is something of a qualified vindication, but I do think (from my own experience) that writing down what I am thinking does help me see flaws that I had overlooked.
good article; would like to add that a study based on "21 graduate students" is hardly representative of the software developer population...
(to nobody in particular)

Please read the article before commenting.

It's not totally clear from reading some of the comments that people have actually read the article.

It's a good one, please do.