by 13of40 1254 days ago
I have a lot of bitter things to say about automated testing, having spent 14 years of my life trying to knead it into a legitimate profession, but here's the most significant:

Your test case is more useless than a turd in the middle of the dining room table unless you put a comment in front of it that explains what it assumes, what it attempts, and what you expect to happen as a result.

Because if you just throw in some code, you're only giving the poor bastard investigating it two puzzles to debug instead of one.
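As a minimal sketch of the kind of comment meant here (in Python, where `apply_discount` is a hypothetical function standing in for the real code under test):

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce a price by a percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_full_discount_makes_item_free(self):
        # Assumes:  prices are plain numbers in a single currency.
        # Attempts: applying a 100% discount to a non-zero price.
        # Expects:  the result is exactly 0.0, with no float residue.
        self.assertEqual(apply_discount(19.99, 100), 0.0)
```

Whoever investigates a failure then knows what the test was supposed to prove before reading its mechanics.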

9 comments

At an old job, one manager would put in his employees' annual reports stuff like "Developer X wrote N automated tests, fixed M bugs, and filed P new bugs this quarter..."

The obvious result of Goodhart's Law ensued, leading to test cases like you mention.

Lesson to leaders: Please stop your bad managers from pulling stupid crap like this. It wastes a lot more time in the longer run.

Which is funny as the purpose of testing is to explain to other developers what the code under test assumes and what should be expected of it under various conditions. It is documentation.

If you have to document your documentation, you might be missing something fundamental in how you are writing your first order documentation. Not to mention that in doing so you defeat the reason for writing your documentation in an executable form (to be able to automatically validate that the documentation is true).

So do I understand correctly that your position is "code is the documentation"?

Over time I'm inclined to value human-written documentation more, especially when things involve integrating multiple systems. I have had real cases where two parties each point at their code and say it is correct, and in isolation the code does look correct. But when the time comes to integrate the systems, it breaks. If you then have a human-readable document in which intentions and expectations are specified, it's much easier to come to a common (working) solution.

Not all languages have the capability to express complex intentions, so code as documentation does not work most of the time.

Code as documentation feels like a good idea because code is the only reliable source of truth. But it also assumes that code can comprehensively express all assumptions and other info, which sounds more like wishful thinking.

Auto-generated API docs combined with handwritten documentation that covers what can't be expressed in code and includes some useful examples seems like the right approach to me. In practice that's the kind of doc I tend to have the best experience with. For example the Rust stdlib docs are auto-generated but the language also supports notes and (automatically unit-tested) examples in docstrings which means the API docs are filled with explanations & examples and mentions what assumptions are made about inputs.
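Python's standard `doctest` module offers something roughly comparable to the docstring examples described here, so a sketch in Python may help illustrate the idea. The `mean` function and the `run_examples` helper are hypothetical names made up for this illustration:

```python
import doctest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    Assumes `values` is non-empty; an empty sequence raises ValueError.

    >>> mean([1, 2, 3])
    2.0
    >>> mean([])
    Traceback (most recent call last):
        ...
    ValueError: mean of an empty sequence is undefined
    """
    if not values:
        raise ValueError("mean of an empty sequence is undefined")
    return sum(values) / len(values)


def run_examples():
    """Execute the examples in mean's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(mean.__doc__, {"mean": mean}, "mean", None, 0)
    return doctest.DocTestRunner(verbose=False).run(test)
```

The explanation, the stated assumption about inputs, and the checked examples all live in one place, so an example that stops matching the code shows up as a failed check.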

I built this framework coz while I didn't believe in "code as documentation" I did believe that example based specifications, tests and documentation were all sort of the same thing (triality):

https://hitchdev.com/hitchstory/

The difference between this and behave/cucumber is that A) the specification language allows for more complex representations and B) there's a templating step to generate readable documentation.

I'm not sure if you're saying that rust stdlib docs do this but documentation where all the examples are themselves runnable as tests and included in the CI test suite solves so many problems.

Not just the stdlib, the documentation tool itself supports this, so it's the case for any Rust package that writes documentation.

Same. I’m sick of people escaping writing documentation by saying that "code is the doc" and in the meantime, writing unreadable code abstracted over dozens of code files.

They almost convinced me somewhere in my career. But the hard truth I learnt is that most people are saying this because they aren’t capable of verbalizing what they are programming.

If your "code is doc", it should be extremely easy to add a little sentence above your method to explain what it does. And no, docs don’t go stale. If your documentation no longer matches what your function does, it’s probably because you should have written a brand new function instead of changing the function’s behavior.

What doesn’t fit my experience here is the assumption that someone who writes unreadable code abstracted over dozens of files is going to be able to write clear, expressive and complete documentation.

In my experience if they can do the latter the former isn’t a problem. But since many people can’t you are left with bad code, littered with bad (often contradictory) comments which makes the problem worse not better.

> I’m sick of people escaping writing documentation

First level of documentation would be specifications. Which can then be used to write tests.

But in many, many shops "Agile" means "we don't do specs anymore, woohoo!".

> But the hard truth I learnt is that most people are saying this because they aren’t capable of verbalizing what they are programming.

I completely agree with you. Right now I'm writing a bunch of data migration code that looks like an awful 200 lines at first glance, but it does quite clever transformations, handles various data corner cases, manages lots of threads, and is already quite optimized (a 30x speed increase over last week's state, and I'm not done with it yet). It is also full of little green one-liners explaining why certain logic is happening, why at a given place and not elsewhere, and how it helps later in the code.

It's even a one-off migration, and it's mostly for me only. But I still put comments in: I have enough experience to know I will keep using those comments in further optimizations, and I know by heart that many one-off efforts end up being reused later. Code dense with logic shouldn't require you to re-read all of it and hold a complete mental model of its branching and possibilities just because you want to tweak it a bit.

The important point is to evolve those comments with code, otherwise they become worse than no comments at all. This is where most folks hit the wall - they are simply too lazy or undisciplined for that.

> They almost convinced me somewhere in my career. But the hard truth I learnt is that most people are saying this because they aren’t capable of verbalizing what they are programming.

I'd say that's true, and it's worth noting at this point that expressing certain things in natural language is hard. The strict rules of programming languages mean that you can reason about programs to a complexity level that would otherwise be unreachable. Notation as a tool of thought. The corollary is that there may not be a simple natural language equivalent of the code you're writing, and that adding documentation might be more effort than it's worth.

Not writing docs is a sort of hazing. Also job security.

Code is documentation, but it only tells you a part of the story. Good comments can explain why, but without writing comment essays it's usually not sufficient.

And, as you note, when integrating systems you need more than just the code and comments, since the code might not even be written with the other system in mind.

I think they mean "test code is documentation". For example if there's a unit test that expects an error for a certain input, it serves as documentation that this kind of input is not allowed.

It's not always feasible to document every little edge case in natural language and keep it in sync with your code. If you "document" edge cases as tests, they _have_ to be in sync with your code. It shouldn't replace traditional documentation though, and it is better suited to internal components than to a public API.
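A small sketch of edge cases "documented" as tests, in Python. The `parse_age` function is hypothetical and exists only for this illustration:

```python
import unittest


def parse_age(text):
    """Hypothetical function under test: parse a non-negative age."""
    age = int(text)  # int() raises ValueError for non-numeric text
    if age < 0:
        raise ValueError("age must not be negative")
    return age


class ParseAgeEdgeCases(unittest.TestCase):
    def test_negative_age_is_rejected(self):
        # Documents that negative input is not allowed.
        with self.assertRaises(ValueError):
            parse_age("-1")

    def test_non_numeric_text_is_rejected(self):
        # Documents that the input must be a decimal integer.
        with self.assertRaises(ValueError):
            parse_age("forty")

    def test_zero_is_allowed(self):
        # Documents the lower boundary: zero is a valid age.
        self.assertEqual(parse_age("0"), 0)
```

If someone later decides negative ages should be accepted, the first test fails, which forces this piece of "documentation" to be updated along with the code.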

No. Your documentation is your documentation. It documents your code. It is not your code.

If the documentation can also be interpreted by a machine to validate that what it claims is true, you have a nice side benefit, but that is not the reason for writing your documentation.

Disregarding the "code is doc" position, it's still common to have an overview or index for documentation, which points readers in the right direction instead of dumping pages of detailed docs on them.

Now, you could also have a well organized test suite that goes from most obvious to most detailed, split into sections for each use-case, but this sounds a lot more tedious than "write a one-line comment describing the unit test".

>the purpose of testing is to explain to other developers what the code under test assumes and what should be expected of it under various conditions

No, the point of automated testing is to verify that what is under test behaves correctly, and to scale this verification more cheaply than having humans do it. Documenting what it verifies and under what conditions is just a side effect.

That is a common falsehood. Testing does not verify that the code under test behaves correctly. It only verifies that what the documentation asserts matches what the code does. Indeed, enabling the machine to verify that the documentation is true is cheaper than having humans do it. Also less error prone. Humans are notoriously bad at keeping documentation properly up to date.

The test plan is the documentation. That people are cutting corners is unfortunate.

A test must be reproducible. If it is not, it is not a test.

>Your test case is more useless than a turd in the middle of the dining room table unless you put a comment in front of it that explains what it assumes, what it attempts, and what you expect to happen as a result.

This is why I found Gherkin/Cucumber (and BDD in general) to be a total revelation when I first encountered it. No one should be writing tests any other way IMO.

https://cucumber.io/docs/gherkin/reference/

Gherkin/Cucumber reintroduce the very problem TDD/BDD was intended to solve: Documentation falling out of sync with the implementation.

The revelation of TDD, which was later rebranded as BDD to deal with the confusion that arose with other types of testing, was that if your documentation was also executable, the machine could be used to prove that the documentation is true. Gherkin/Cucumber files themselves are not executable and require you to re-document the function in another language, with no facilities to ensure that the two are consistent with each other.

If you are attentive enough to ensure that the documentation and the implementation are aligned, you may as well write it in plain English. It will give you all of the same benefits without the annoying syntax.

TDD wasn't rebranded BDD.

BDD is a QA concern, primarily used for QA tests against a written (BDD) requirement.

TDD is about unit testing, which is about testing the implementation BY developers FOR other developers.

TDD says nothing about the correctness of the software against a spec, only that a given implementation aligns with a developer's intention.

If unit tests are not testing the behaviour, it's being done wrong.

If they are, the only difference between TDD and BDD is where, in which form, and by whom is that behaviour defined.

Unit tests assert implementation behaviour to aid refactoring. If developers misunderstand the spec, the unit tests can be valid. They don't assert developer understanding.

Say it with me, unit tests are to aid refactoring.

If we mix QA and implementation details just because both sides use the word "test" it ends in trouble.

QA should be blind to unit test coverage or even usage at all, they're totally independent concerns.

A passing unit test says nothing against correctness of product against a spec or design... only that it works and continues to work as a developer intended, to aid the work of future developers, even if they misunderstood the spec.

Your comment is at the core of why QA is a total mess. Everyone is confused about what "test" means in different contexts.

Why have a QA function at all with 100% unit test coverage? Because the unit tests may encode misunderstanding by developers. They're there to fight entropy, not wrongness.

QA, using BDD and other tools, ensures the product is correct, regardless of how well it fights entropy with unit tests.

Unit tests (class or method as the unit) hinder refactoring by binding to low-level implementation details. When you refactor, by definition you are changing what the factors (units) are. Generally, your unit tests will then be testing implementation details that no longer exist. By strongly coupling to implementation details, unit test suites suffer an extremely large ripple effect on refactoring.

Tests in general can only help you refactor code at a lower level of granularity than what you are testing. Something lower than unit level is a contradiction.

Of course, you can instead test business behavior which isn't as volatile in refactoring and change your definition of unit to be a unit of practical business requirements...

>Your comment is at the core of why QA is a total mess. Everyone is confused about what "test" means in different contexts.

Particularly the important differences between unit, integration, and e2e testing. Many people use the words interchangeably when they are completely different concerns.

Unit test your library code that has no external dependencies, integration test your classes that implement those libraries, and e2e test your application that is built with those classes. There are varying philosophies to which are more useful, but it's an important distinction to maintain in terminology.

This sounds like a good theory but the practice of it is really hard. Pretty quickly you end up with tests that "say" one thing but have nuanced different behavior in the underlying implementation.

Then try to debug a "document"...

I like the idea. But having tried it at scale, it becomes a mess. Code I can understand. I can read English comments. I can't debug English.

We have it at scale, and no, it doesn't become a mess.

We use Spock, which makes "comments" a very expected thing, which helps us not let tests without comments pass a code review.

Just use a tool that helps you and stop writing stupid tests whose impl code looks worse than the code being tested.

Then why all the comments?

I know what typical code does. This code looks simple but that's misleading when you're trying to understand a failure. You want consistency and clarity. You want readability like code is readable, not like a book is readable.

I agree. One nice feature of property-driven testing is that assumptions often end up causing test failures. For example (in ScalaTest):

  "Average of list" should "be within range" in {
    forAll() {
      (l: List[Float]) => {
        // `average` is assumed to be a user-defined extension; List has no such built-in method
        val avg = l.average
        assert(avg >= l.min && avg <= l.max)
      }
    }
  }

This test will fail, since it doesn't hold for e.g. empty lists. Requiring non-empty lists will still fail if we have awkward values like NaNs, etc. The following version has a better chance of passing:

  "Average of list" should "be within range" in {
    forAll() {
      (raw: List[Float]) => {
        // Assumption made explicit: NaN and infinite values are excluded
        val l = raw.filter(n => !n.isNaN && !n.isInfinite)
        // Assumption made explicit: the list is non-empty
        whenever (l.nonEmpty) {
          val avg = l.average
          assert(avg >= l.min && avg <= l.max)
        }
      }
    }
  }

Getting this test to pass required us to make those assumptions explicit. Of course, it doesn't spot everything; here's an article which explores this example in more depth (in Python) https://hypothesis.works/articles/calculating-the-mean
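For readers without a property-testing library at hand, here is a rough standard-library Python analogue of the second version. It uses seeded random inputs rather than a real property-testing framework, and `average` and `check_average_in_range` are hypothetical helpers invented for this sketch:

```python
import math
import random


def average(values):
    """Hypothetical function under test."""
    return sum(values) / len(values)


def check_average_in_range(trials=1000, seed=0):
    """Check min <= average <= max on random lists of finite floats."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Raw input deliberately mixes ordinary floats with NaN and infinity.
        raw = [rng.choice([rng.uniform(-1e6, 1e6), math.nan, math.inf])
               for _ in range(rng.randint(0, 20))]
        # Assumption 1 made explicit: NaN and infinite values are excluded.
        values = [v for v in raw if math.isfinite(v)]
        # Assumption 2 made explicit: the list is non-empty.
        if not values:
            continue
        avg = average(values)
        assert min(values) <= avg <= max(values), values
    return True
```

Dropping either filter makes the check fail quickly (an empty list raises `ZeroDivisionError`, and NaN breaks the comparison). Even with both filters, rounding in floating-point summation can in rare cases push the average marginally outside the range; this sketch does not guard against that.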
I always use (if the scenario is simple enough, which most are):

@Test

public void myTestMethod_Scenario_ShouldReturnThis() {....

Jest makes this far more straightforward.

it("throws when the object belongs to another user", () => { ... })

it("does a business thing when thing is in state BLAH", () => { ... })

To some degree, this is what BDD attempts to solve, separation of test mechanics and documentation of the test's intention.

I don't think it quite does it right, but it is of note.

We have a policy of making each test a spec. That is, a test requires a plain text spec to be attached to it in its doc string. It's kind of like BDD but without all the weird DSLs.
What about data driven tests where you lay out several variants, including edge cases, for function arguments? Seems pretty clear to me.
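A sketch of what such a data-driven test might look like in Python, using `unittest`'s `subTest`. The `clamp` function is a hypothetical example target:

```python
import unittest


def clamp(value, low, high):
    """Hypothetical function under test: restrict value to [low, high]."""
    return max(low, min(value, high))


class ClampTest(unittest.TestCase):
    # Each row: (description, value, low, high, expected)
    CASES = [
        ("inside the range", 5, 0, 10, 5),
        ("below the range", -3, 0, 10, 0),
        ("above the range", 42, 0, 10, 10),
        ("on the lower bound", 0, 0, 10, 0),
        ("on the upper bound", 10, 0, 10, 10),
        ("degenerate range", 7, 3, 3, 3),
    ]

    def test_clamp_table(self):
        for description, value, low, high, expected in self.CASES:
            # subTest reports each failing row separately.
            with self.subTest(description):
                self.assertEqual(clamp(value, low, high), expected)
```

The description column does the job of the explanatory comment: each row says which variant or edge case it covers.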
So, "given when then" style tests (e.g. Spock) plus a descriptive test name. Or more than that?

I suppose soon you could ask GPT "what is this code supposed to do?"

(I would buy a Copilot subscription for this)

Which would replace all those humans producing perfectly valid-sounding explanations that, if you invest some research effort, turn out to have no basis in (the usually far more complex, but also far more fascinating and infinitely deep) reality. So yes, I think AI can indeed replace lots of human-produced thoughts :-)

I admit to having been guilty of this myself. I have a famous anecdote-example where I had a very well-paid contractor job and explained something about how my then department's software worked to someone from another department. I think I must have sounded very convincing; the person went off to change something in how they used our stuff. A few minutes later, after accidentally meeting and casually chatting with my boss for that job, I realized everything I had said was total garbage. I quickly excused myself from my boss and hurried after the person to tell them to forget and ignore everything I had just explained to them because it was all wrong. I think this last step usually doesn't happen in such cases, because we don't usually realize that such a thing just happened.

The brain, or parts of it, are great at producing "explanations". I think that it was part of the more established and reproducible results of psychology that our brain first decides and acts, and only then produces some (often bullshit) "reason" when/if our conscious self asks for one? Does anybody remember if this is true and has a link?

>The brain, or parts of it, are great at producing "explanations". I think that it was part of the more established and reproducible results of psychology that our brain first decides and acts, and only then produces some (often bullshit) "reason" when/if our conscious self asks for one? Does anybody remember if this is true and has a link?

Relevant are Sperry & Gazzaniga's split brain experiments. Participants of these experiments had had their corpus callosum (one of the major "information" pathways between our brain's two halves) cut. This was an operation performed to keep epileptic seizures in check.

https://en.wikipedia.org/wiki/Split-brain

In these participants, specific brain "functions" such as speech were highly lateralized, meaning only one half of the brain was able to perform them to a satisfactory degree.

Note that these were already not neuro-typical people prior to the experiments (given the regular, debilitating epileptic seizures), so reaching general conclusions from these experiments is hard.

Remember also that, like our brains, our bodies are highly lateralized, such that the right half of our brain controls the left side of our body, and the left half controls the right side. If you ever wanted proof against intelligent design, the way our brain connects to our eyes & body is one very strong argument.

Anyway, one experiment comes to mind in which one half of the brain was instructed to perform some action (move the left arm, or something similar). Then the other half would be asked _why_ that arm was just moved. It would confabulate, on the spot, legitimate-sounding but obviously bullshit reasoning, e.g. "I felt cold so I wanted to put on a coat" rather than "the experimenter instructed me to move it".

So, rather than claiming "I don't know", it would just make up plausible reasoning. It is really unimaginable to _not_ know why you moved your arm.