Hacker News
by nox101 586 days ago
Can you give some examples? What LLM? What code? What tests?

As a test, I just asked "ChatGPT 4o with canvas": "Can you write a set of tests to test glBufferData and all of its edge cases?"

glBufferData is a 32 year old API, so there are clearly plenty of examples for it to have looked at. There are even multiple public tests for it, including the official tests, which are open source and so easily scannable. It failed.

It wrote 8 tests, and 7 of them were wrong in that they did something wrong intentionally and then asserted that they got no error. It wasn't close to comprehensive. It didn't test that the function actually put data in the buffer, for example, nor did it check the set of valid enums to see that they work. Nor did it check that the target parameter actually works and affects the correct buffer bound to that target.
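To make the missing checks concrete, here is a rough sketch in Python of what such tests could look like. It is only an illustration under stated assumptions: `FakeGL` and its methods are hypothetical stand-ins invented so the example runs without a GL context, and only the enum values are the real OpenGL constants. Real tests would call glBufferData itself and read the data back with something like glGetBufferSubData.

```python
# Real OpenGL enum values. Everything else here is a hypothetical stand-in.
GL_NO_ERROR = 0
GL_INVALID_ENUM = 0x0500
GL_INVALID_OPERATION = 0x0502
GL_ARRAY_BUFFER = 0x8892
GL_ELEMENT_ARRAY_BUFFER = 0x8893
GL_STREAM_DRAW = 0x88E0
GL_STATIC_DRAW = 0x88E4
GL_DYNAMIC_DRAW = 0x88E8

TARGETS = (GL_ARRAY_BUFFER, GL_ELEMENT_ARRAY_BUFFER)
USAGES = (GL_STREAM_DRAW, GL_STATIC_DRAW, GL_DYNAMIC_DRAW)


class FakeGL:
    """Hypothetical in-memory model of buffer state (not a real GL binding)."""

    def __init__(self):
        self._next_id = 1
        self._store = {}                       # buffer id -> stored bytes
        self._bound = {t: 0 for t in TARGETS}  # target -> bound buffer id
        self._error = GL_NO_ERROR

    def gen_buffer(self):
        buf = self._next_id
        self._next_id += 1
        self._store[buf] = b""
        return buf

    def bind_buffer(self, target, buf):
        self._bound[target] = buf

    def buffer_data(self, target, data, usage):
        if target not in TARGETS or usage not in USAGES:
            self._set_error(GL_INVALID_ENUM)
        elif self._bound[target] == 0:
            self._set_error(GL_INVALID_OPERATION)  # nothing bound to target
        else:
            self._store[self._bound[target]] = bytes(data)

    def read_back(self, buf):
        return self._store[buf]

    def get_error(self):
        # Like glGetError: return the recorded error and reset the flag.
        err, self._error = self._error, GL_NO_ERROR
        return err

    def _set_error(self, code):
        if self._error == GL_NO_ERROR:  # keep the first error until queried
            self._error = code


def test_data_actually_lands_in_the_buffer():
    gl = FakeGL()
    buf = gl.gen_buffer()
    gl.bind_buffer(GL_ARRAY_BUFFER, buf)
    gl.buffer_data(GL_ARRAY_BUFFER, b"\x01\x02\x03\x04", GL_STATIC_DRAW)
    assert gl.get_error() == GL_NO_ERROR
    assert gl.read_back(buf) == b"\x01\x02\x03\x04"


def test_every_valid_usage_enum_is_accepted():
    for usage in USAGES:
        gl = FakeGL()
        buf = gl.gen_buffer()
        gl.bind_buffer(GL_ARRAY_BUFFER, buf)
        gl.buffer_data(GL_ARRAY_BUFFER, b"\xaa", usage)
        assert gl.get_error() == GL_NO_ERROR
        assert gl.read_back(buf) == b"\xaa"


def test_target_selects_the_buffer_bound_to_it():
    gl = FakeGL()
    a, b = gl.gen_buffer(), gl.gen_buffer()
    gl.bind_buffer(GL_ARRAY_BUFFER, a)
    gl.bind_buffer(GL_ELEMENT_ARRAY_BUFFER, b)
    gl.buffer_data(GL_ELEMENT_ARRAY_BUFFER, b"\xff", GL_STATIC_DRAW)
    assert gl.read_back(a) == b""      # the other target's buffer is untouched
    assert gl.read_back(b) == b"\xff"


def test_invalid_usage_reports_an_error():
    gl = FakeGL()
    buf = gl.gen_buffer()
    gl.bind_buffer(GL_ARRAY_BUFFER, buf)
    gl.buffer_data(GL_ARRAY_BUFFER, b"\x00", 0xDEAD)  # not a usage enum
    assert gl.get_error() == GL_INVALID_ENUM          # expect the error
    assert gl.read_back(buf) == b""                   # and no data stored
```

Note that the last test asserts the error is reported, which is the opposite of doing something invalid and then asserting there was no error.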

This is my experience with LLMs for code so far. I do sometimes get answers to tech questions quicker from LLMs than from searching via Google and reading Stack Overflow. But that's only sometimes. As a recent example, I was trying to add TypeScript types to some JavaScript and it failed. I went round and round telling it that it had failed, but it got stuck in a loop and just kept saying "Oh, sorry. How about this -- repeat of previous code"

2 comments

If you asked me to write tests with such a vague definition I’d also have issues writing them though. It’ll work a lot better if you tell it what you want it to validate I think.
Wait, wait. You ought to write tests for javascript react html form validation boilerplate. Not that.

/s aside, it’s what we all experience too. There’s a great divide between programming pre-around-2015 and thereafter. LLMs can only do recent programming, which is a product of tons of money getting loaded into the industry and creating jobs that made no sense ten years ago. Basically, the more repetitive boilerplate patterns, configuration options, import blocks, row-obj-dto-obj conversion, typecheck bullshit you write per day, the more LLMs help. I mean, one could abstract all that away using regular programming, but how would they sell their work for $10^6 and an AI for $10^9 then?

Just yesterday, after reading yet another “oh you must try again” comment, I asked 4o how to stop Puppeteer from dumping errors into the console and exit gracefully when I close the headful browser (all logs and code provided). Right away it slid into nonsense. I always finish my chats with what I think about it, uncut, just in case someone uses these for further learning.