by aliencat 1192 days ago
Unit tests are definitely a great use of Copilot.

What are you talking about?

I don't know anyone who uses it for this; specifically tests are really bad if they're subtly wrong.

Maybe to scaffold the test function, but the actual test is completely useless if you don't trust it.

So like... generate code and have robust tests, or write robust code... but, it's really really daft to generate tests that might hallucinate some random crap (and copilot really does sometimes).

Other people have said this, but copilot is auto complete.

You use it every time you press tab.

Do you accept an auto-complete suggestion 40% of the time? ...mmmm, yes, well, guess why 40% of code is generated by copilot.

It's not because people are generating tests with it.

In a recent personal C project (a custom archive file extractor), I used ChatGPT to generate tons of unit tests.

And honestly, I was extremely impressed. With a bit of context, it was able to generate almost correct test data for a fairly complex binary data format.

I've also leveraged it heavily to:

* generate doxygen comments

* get usage examples of libs I never used

* create a whole bunch of utils functions.

And honestly, for all these tedious tasks, it has done a far better job than I ever would have.

Comments were consistent in styling, the base examples were fairly good, and the utils functions were well made, especially the error handling (which I would probably have semi-consciously skipped tbh).

In fairness, I modified most of this code slightly to make it fit my project structure, or tweaked it a bit where the AI didn't quite get it right.

(It never fully understood some offset fields in the file format, but got pretty close at times. And in fairness my naming for these offset fields was a bit questionable.)
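To make the offset-field point concrete, here is a minimal Go sketch of the kind of hand-checkable fixture such a test needs. The format is entirely made up for illustration (it is not the archive format from the project above): a 4-byte magic `ARCH`, a little-endian uint32 entry count, and a little-endian uint32 payload offset measured from the start of the file. `buildFixture` and `payloadAt` are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
)

// headerSize is the size of the hypothetical header: magic + count + offset.
const headerSize = 12

// buildFixture assembles test data for the made-up format:
// "ARCH", uint32 entry count, uint32 payload offset (from file start),
// both little-endian, followed by the payload bytes.
func buildFixture(count uint32, payload []byte) []byte {
	var buf bytes.Buffer
	buf.WriteString("ARCH")
	// Writes to a bytes.Buffer do not fail, so the errors are ignored here.
	binary.Write(&buf, binary.LittleEndian, count)
	binary.Write(&buf, binary.LittleEndian, uint32(headerSize))
	buf.Write(payload)
	return buf.Bytes()
}

// payloadAt follows the offset field and returns the bytes it points to.
func payloadAt(data []byte) ([]byte, error) {
	if len(data) < headerSize || string(data[:4]) != "ARCH" {
		return nil, errors.New("bad header")
	}
	off := binary.LittleEndian.Uint32(data[8:12])
	if off < headerSize || int(off) > len(data) {
		return nil, errors.New("offset out of range")
	}
	return data[off:], nil
}
```

Because the fixture is built field by field, a generated test that gets an offset subtly wrong is easy to spot: the offset is written in exactly one place and can be compared against `headerSize` by eye.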

Heck, as a test, I even threw an RFC-like spec of the file format at it, and asked it to generate a Python parser. The result was not 100% correct, but definitely a good start to iterate on.

In the end, this side project took me 2 weeks to implement, with ChatGPT probably saving around 1 week of dev. It also greatly helped improve the quality of the project (better docs, better tests).

> I don't know anyone who uses it for this

that would be surprising - it does a fantastic job of producing e.g. dumb unit tests. for instance, Copilot + a Go table testing template means you can churn out the simple "make a request to this url with this data, ensure I get a 200 and the response contains 'id' and not 'error'" extremely quickly. the code is trivial but tedious, so you can quickly inspect for sanity and run to ensure they pass, then commit and have them checking future changes.

> it's really really daft to generate tests that might hallucinate some random crap (and copilot really does sometimes).

variations on this comment are all over these threads, which is bizarre. hallucinating means you waste a few seconds reading the code it produced, not that you commit incorrect code.

this isn't like ChatGPT advising people to drink bleach, Copilot is a dumb tool offering an expert (you) suggested solutions for your expert consideration

Most of the time I know exactly what tests I want to write but it's annoying to write them. Copilot autocompletes, I check it and then I use it.

For me the worst part about this is that writing tests tediously makes me reach for layers of abstraction on my test code. And then suddenly my test code is complicated and needs its own test code, and changing a test can often be problematic to the abstractions I foolishly employed.

Being able to churn out the boilerplate for tests, which can make DRY a non-concern, is great. I just hope that if I do this, I never get sloppy, and I always review the tests.

I agree! I try not to make test code too complicated, and prefer repetitive code over abstractions (else you have to test the tests!) so Copilot is useful there.

In my case about 50% of the test code is just boilerplate, so I usually type the test name and generate the rest. Most of the time I have to rewrite the actual test logic (although very trivial tests are sometimes correct), but I probably keep more than 40% if we're talking tests specifically.

> In my case about 50% of the test code is just boilerplate

Why not replace it with a function call?

A common occurrence is something like a pool of 10 actions where a bunch of tests each do 3 to 7 of them. This is very hard to abstract with a function call.

That makes the test harder to read. Tests should be dumb.

In the case I was referring to, mainly to keep the code consistent with what was already there and the PRs small, as the code base is primarily owned by a different team. It's also pretty innocuous short tests that read well as they are.

Some languages are just noisy.

Go and Java, in my experience, are noisier than others.

I use it for unit tests. That's one of its best use cases. Obviously I read through the code it writes.

I guess this could be true in some languages and settings, for example when a notable portion of your tests verify correct behavior with null values, arguments of the wrong type, etc. When talking about unit tests that verify the actual logic, you should naturally be more careful with Copilot's suggestions. Especially if Copilot also wrote the code it's testing.