| HN Mirror


by the_sleaze9 911 days ago

Good story.

I for one do not believe in Unit Tests and try to get LLM tooling to write them for me as much as possible.

Integration Tests, however (which I would argue are what this story is actually praising), are _critical_ components of professional software. Cypress has been my constant companion and better half these last few years.

3 comments

HideousKojima 911 days ago

Unit tests are useful for:

1) Cases where you have some sort of predefined specification that your code needs to conform to

2) Weird edge cases

3) Preventing reintroducing known bugs

In actual practice, about 99% of unit tests I see amount to "verifying that our code does what our code does" and are a useless waste of time and effort.

jkubicek 911 days ago

> In actual practice, about 99% of unit tests I see amount to "verifying that our code does what our code does" and are a useless waste of time and effort.

If you rephrase this as, "verifying that our code does what it did yesterday" these types of tests are useful. When I'm trying to add tests to previously untested code, this is usually how I start.

    1. Method outputs a big blob of JSON
    2. Write test to ensure that the output blob is always the same
    3. As you make changes, refine the test to be more focused and actionable
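
The three steps above can be sketched roughly as follows. This is only an illustration: `build_report`, `ORDERS`, and `GOLDEN` are made-up names standing in for the untested method and its captured output, not anything from the thread.

```python
import json


def build_report(orders):
    """Hypothetical stand-in for the untested method that emits a JSON blob."""
    total = sum(o["qty"] * o["price"] for o in orders)
    return json.dumps({"count": len(orders), "total": total}, sort_keys=True)


# Hypothetical fixed input used for every run.
ORDERS = [{"qty": 2, "price": 5}, {"qty": 1, "price": 10}]

# Step 2: pin the whole blob as it is today (captured by running the code once).
GOLDEN = '{"count": 2, "total": 20}'


def test_blob_is_unchanged():
    # Fails on any change to the output, intended or not.
    assert build_report(ORDERS) == GOLDEN


# Step 3: later, replace the blob comparison with focused, actionable checks.
def test_total_sums_quantity_times_price():
    assert json.loads(build_report(ORDERS))["total"] == 20
```

The blob test says nothing about _why_ the output changed, which is why step 3 narrows it down over time.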

DontchaKnowit 910 days ago

The problem with this for me is that most of the time "verifying that our code does what it did yesterday" is not a useful condition: if you make no change to the code, it's going to do what it did yesterday. If you do make a change to the code, then you are probably intending for it to do something different, so now you have to change the test accordingly. It usually just means you have to make the same change in 2 different spots for every piece of unit-tested code you want to change.

jkubicek 910 days ago

> If you do make a change to the code, then you are probably intending for it to do something different, so now you have to change the test accordingly. It usually just means you have to make the same change in 2 different spots for every piece of unit-tested code you want to change.

Sure, but that's how unit-tested code works in general.

randomdata 910 days ago

> then you are probably intending for it to do something different

If you have decided that your software is going to do something different, you probably want to deprecate the legacy functionality to give the users some time to adapt, not change how things work from beneath them. If you eventually remove what is deprecated, the tests can be deleted along with it. There should be no need for them to change except maybe in extreme circumstances (e.g. a feature under test has a security vulnerability that necessitates a breaking change).

If you are testing internal implementation details, where things are likely to change often... Don't do that. It's not particularly useful. Test as if you are the user. That is what you want to be consistent and well documented.

pixl97 910 days ago

Then think of the unit test as the safety interlock.

HideousKojima 910 days ago

I had to migrate some ancient VB.NET code to .NET 6+ and C#. The code outputs a text file, and I needed to make sure the new output matched the old output. I could have written some sort of test program that would have been roughly equal in length to what I was rewriting to verify that any change I made didn't affect the output, and to verify that the internal data was the same at each stage. Or... I could just output the internal state at various points and the final output to files and compare them directly. I chose the latter, and it saved me far more work than writing tests.

If I need to verify that my code works the same as it did yesterday, I can just compare the output of today's code to the output of yesterday's code.
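
For illustration, a direct comparison like this could be as small as the sketch below. It is in Python rather than the .NET tooling the comment describes, and `diff_outputs` is a hypothetical helper name; in practice the two strings would be read from yesterday's and today's output files.

```python
import difflib


def diff_outputs(old_text, new_text):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical outputs; normally loaded from the old and new output files.
    old = "header\nrow 1\nrow 2"
    new = "header\nrow 1\nrow two"
    for line in diff_outputs(old, new):
        print(line)
```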

jkubicek 910 days ago

I see two advantages in creating tests to check output

    1. You did the work to generate consistent output from the code as a whole, plus output intermediate steps. Writing those into a test lets future folks make use of the same tests.
    2. Having the tests in place prevents people from making changes that accidentally change the output

Don't get me wrong, tests that just compare two large blobs of output aren't fun to work with, but they _can_ be useful, and are an OK intermediate stage while you get proper unit tests written.

tshaddox 911 days ago

> In actual practice, about 99% of unit tests I see amount to "verifying that our code does what our code does"

That’s my experience too, especially for things like React components. I see a lot of unit tests that literally have almost the exact same code as the function they’re testing.

jay_kyburz 910 days ago

I often find that a little bit of code that helps you observe that your code is working correctly is easier than checking that your code is working in the UI. The tests are a great place to store and easily run that code.

throwaway2037 910 days ago

> 3) Preventing reintroducing known bugs

When I was learning unit testing, my mentor taught me this strategy when fixing production bugs. First, write the unit test to demonstrate the bug. Second, fix the bug.
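
A minimal sketch of that workflow, using an invented bug: suppose a hypothetical `parse_price` helper crashed on prices with a thousands separator. The test is written first and fails against the buggy version, then the fix makes it pass and the test stays behind to stop the bug from coming back.

```python
def parse_price(text):
    """Fixed version of a hypothetical helper.

    The assumed buggy version was `return float(text)`, which raises
    ValueError for inputs such as "1,000.50".
    """
    return float(text.replace(",", ""))


def test_parse_price_handles_thousands_separator():
    # Written before the fix to demonstrate the bug; kept as a regression test.
    assert parse_price("1,000.50") == 1000.50


def test_parse_price_plain_number_still_works():
    # Guards against the fix breaking the case that already worked.
    assert parse_price("12.25") == 12.25
```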

__MatrixMan__ 910 days ago

That's what you get when you don't write the tests first.

HideousKojima 910 days ago

That's just doubling your work. If you don't already have a spec, your unit tests and actual code are essentially the same code, just written twice.

__MatrixMan__ 910 days ago

Determining which states are authentically hazardous and mocking data and adjacent services to make those states accessible at the press of a button is definitely not the same as writing code which handles those states appropriately.

__MatrixMan__ 911 days ago

You should try switching it up. Write the tests and then ask the LLM to write the code that makes them pass. I find I'm more likely to learn something in this mode.

makeitdouble 910 days ago

I'd argue having useable LLMs kind of brings out how problematic TDD is.

Imagine the dumbest function you have to write: a product A and a street address as input, and the shipping cost as an output.

How many test cases would you write to be absolutely sure that function actually does what you want it to do, and be confident it doesn't have weird exceptions that the LLM injected randomly? I'd assume you'd still vet the code written by the LLM, but if it's hundreds of rambling lines doing weird stuff to get the right result, is it really faster than writing it yourself?

__MatrixMan__ 910 days ago

If it's hundreds of rambling lines then I'm not going to be able to get it past my linter anyhow (complexity thresholds), nor am I going to be able to get it past my team when they review it. So yeah, that's a problematic case, but it's one I'm going to have to refactor to avoid with or without an LLM in the loop.

throwaway2037 910 days ago

About the problems of TDD: Cedric Beust has a legendary blog post about it here: https://www.beust.com/weblog/the-pitfalls-of-test-driven-dev...

MoreQARespect 910 days ago

TDD works best if you default to testing at the outer shell of the app (e.g. translating a user story into steps executed by Playwright against your web app) and only TDDing lower layers once you've used those higher level tests to evolve a useful abstraction underneath the outer shell.

It seems to be taught in a fucked up way though where you imagine you want a car object and a banana object and you want to insert the banana into a car or some other kind of abstract nonsense.

qup 910 days ago

How effective is the LLM when used this way, compared to normally?

__MatrixMan__ 910 days ago

I don't know what normally is, but I'd say it works pretty well.

Often the challenge is that the context for what you're trying to do is sprawling. There's just too many files and they're all too long: you end up exceeding the context window or filling it with 99% irrelevant stuff. Typically the structures you build for tests are smaller and more focused on the particular instance you're worried about, which I think is a better way to talk to an LLM.

You don't have to explain, for instance, that there's data in production which doesn't match the schema in the code so it must be cautious to avoid running afoul of that difference. Instead you've mocked that data, so it's right there in the same code with the test that it's trying to make pass.
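
As a rough sketch of what that can look like: the mocked row below carries the schema mismatch itself, so the test file is all the context needed. Every name here (`LEGACY_ROW`, `normalize_user`, the fields) is hypothetical, and `normalize_user` stands in for the code you would ask the LLM to write.

```python
# Hypothetical row shaped like the messy production data:
# "age" is stored as a string and "email" is missing entirely.
LEGACY_ROW = {"name": "Ada", "age": "36"}

# Hypothetical row that matches the schema in the code.
CLEAN_ROW = {"name": "Bob", "age": 41, "email": "bob@example.com"}


def normalize_user(row):
    """One possible implementation that makes the tests below pass."""
    return {
        "name": row["name"],
        "age": int(row["age"]),
        "email": row.get("email"),
    }


def test_normalize_tolerates_legacy_rows():
    assert normalize_user(LEGACY_ROW) == {"name": "Ada", "age": 36, "email": None}


def test_normalize_keeps_clean_rows_intact():
    assert normalize_user(CLEAN_ROW) == CLEAN_ROW
```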

randomdata 911 days ago

In reality, unit tests and integration tests are different names for the same thing. All attempts at post facto differentiation fall flat.

For example, the first result on Google states that a unit test calls one function, while an integration test may call a set of functions. But as soon as you have a function that has side effects, then it will be necessary to call other functions to observe the change in state. There is nothing communicated by calling this an integration test rather than a unit test. The intent of the test is identical.

cjfd 911 days ago

No. Or maybe only if you also consider 'village' and 'city' to be the same thing.

pjc50 911 days ago

That's a good example, because while they're clearly different things, any distinction you draw between them such as "population > 100k" or "has cathedral" is always going to be a bit arbitrary, and many cities grew organically from villages in an unplanned manner.

randomdata 910 days ago

Is it? Kent Beck, coiner of unit test, made himself quite clear that a unit test is a test that is independent (i.e. doesn't cause other tests to fail). For all the ridiculous definitions I have come across, I have never once heard anyone call an integration test a test that is dependent (i.e. may cause other tests to fail). In reality, a unit test and an integration test are the same thing.

The post facto attempts at differentiation never make sense. For example, another comment here proposed that a unit test is that which is not dependent on externally mutable dependencies (e.g. the filesystem). But Beck has always been adamant that unit tests should use the "real thing" to the greatest extent possible, including using the filesystem if that's what your application does.

Now, if one test mutates the filesystem in a way that breaks another test, that would violate what Beck calls a unit test. This is probably the source of confusion in the above. Naturally, if you don't touch the file system there is no risk of conflicting with other tests also using the filesystem. But that really misses the point.
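
One way to read that in code: the tests below touch the real filesystem, but each gets its own temporary directory, so neither can break the other. This is a sketch of the idea as described in this comment, not code from Beck; `save_note` is a hypothetical function under test.

```python
import tempfile
from pathlib import Path


def save_note(directory, name, text):
    """Hypothetical function under test: writes a note to a real file."""
    path = Path(directory) / name
    path.write_text(text, encoding="utf-8")
    return path


def test_save_note_writes_file():
    # A fresh directory per test; it is deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp:
        path = save_note(tmp, "a.txt", "hello")
        assert path.read_text(encoding="utf-8") == "hello"


def test_directory_starts_empty():
    # Passes regardless of what the other test wrote, in any run order.
    with tempfile.TemporaryDirectory() as tmp:
        assert list(Path(tmp).iterdir()) == []
```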

__MatrixMan__ 911 days ago

There are only two kinds of tests: ones you need and ones you don't. Splitting hairs over names of types of tests is only useful if you're trying to pad a resume.

WendyTheWillow 911 days ago

Clusters of humans cohabiting a confined space? If you squint hard enough…

randomdata 911 days ago

Implying that integration tests (or vice versa) are legally incorporated like cities, while unit tests are not? What value is there in recognizing a test as a legal entity? Does the legal system (assuming the US) even allow incorporation of code? Frankly, I don't think your comparison works.

rileymat2 911 days ago

I think he is not implying a hard-line legal standard. Rather, as connections and size increase, different properties start to emerge, and humans start to differentiate things based on that. But there is a gradient, so we can find examples that are hard to classify.

randomdata 911 days ago

What differentiates a city from a village is legal status, not size. If size means population, there are cities with 400 inhabitants, villages with 30,000 inhabitants, and vice versa. It is not clear how this pertains to tests.

When unit test was coined, it referred to a test that is isolated from other tests. Integration tests are also isolated from other tests. There is no difference. Again, the post facto attempts to differentiate them all fall flat, pointing to things that have no relevance.

drewcoo 910 days ago

> What differentiates a city from a village is legal status, not size

Fine. And legal status depends on location. There are many localities.

troupo 911 days ago

You should not be downvoted as heavily as you are now.

I feel like we did testing a disservice by specifying the unit to be too granular. So in most systems you end up with hundreds of useless tests testing very specific parts of code in complete isolation.

In my opinion a unit should be a "full unit of functionality as observed by the user of the system", which is what most people call integration tests. Instead of testing N similar scenarios for M separate units of code, giving you N×M tests, write N integration tests that will test those scenarios across all of your units of code, and will find bugs where those units, well, integrate.