HN Mirror


mvdtnz 605 days ago

> We don't expect human developers to be perfect, why should we expect AI assistants.

What absolute nonsense. What an absurd false equivalence. It's not that we expect perfection or even human level performance from "AI". It's that the crap that comes out of LLMs is not even at the level of a first year student. I've never in my entire life reviewed the code of a junior engineer and seen them invent third party APIs from whole cloth. I've never had a junior send me code that generates a payload that doesn't validate at the first layer of the operation with zero manual testing to check it. No junior has ever asked me to review a pull request containing references to an open source framework that doesn't exist anywhere in my application. Yet these scenarios are commonplace in "AI" generated code.

simonw 605 days ago

That problem genuinely doesn't matter to me at all.

If an LLM hallucinates a method that doesn't exist I find out the moment I try and run the code.
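
A minimal Python sketch of that point, assuming a made-up method name (`split_csv` is hypothetical and does not exist on `str`). The helper `run_and_report` is also illustrative, not part of any tool mentioned in this thread. Executing the snippet surfaces the invented method immediately as an `AttributeError`:

```python
def run_and_report(source: str) -> str:
    """Execute a snippet and return "ok" or the exception type and message."""
    try:
        exec(source, {})
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return "ok"


# Hypothetical hallucination: str has no split_csv method.
hallucinated = 'parts = "a,b,c".split_csv()'
# The real method the snippet should have used.
real = 'parts = "a,b,c".split(",")'

print(run_and_report(hallucinated))  # reports an AttributeError
print(run_and_report(real))          # reports ok
```

Note that the error only appears when the offending line is actually executed, so a hallucinated call on a rarely taken branch would still need a test or a review to catch.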

If I'm using ChatGPT Code Interpreter (for Python) or Claude analysis mode (for JavaScript) I don't even have to intervene: the LLM can run in a loop, generating code, testing that it executes without errors and correcting any mistakes it makes.
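
A rough sketch of that generate, run, correct loop, under stated assumptions: `fake_generate` is a hypothetical stand-in for a model call, and `run_snippet` and `generate_until_it_runs` are illustrative names. This is not how ChatGPT Code Interpreter or Claude's analysis tool is actually implemented; it only shows the shape of the loop.

```python
import subprocess
import sys
from typing import Callable, Optional

# Hypothetical signature: (task, previous_error) -> Python source code.
Generator = Callable[[str, Optional[str]], str]


def run_snippet(source: str, timeout: float = 10.0) -> Optional[str]:
    """Run source in a fresh interpreter. Return None on success, else stderr."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return None if result.returncode == 0 else result.stderr


def generate_until_it_runs(generate: Generator, task: str, max_attempts: int = 3) -> str:
    """Ask for code, run it, and feed any error back until it executes cleanly."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        source = generate(task, error)
        error = run_snippet(source)
        if error is None:
            return source  # Runs without errors; it still needs human review.
    raise RuntimeError(f"no runnable code after {max_attempts} attempts:\n{error}")


def fake_generate(task: str, error: Optional[str]) -> str:
    """Stand-in for a model: hallucinates a method, then fixes it once shown the error."""
    if error is None:
        return 'print("a,b,c".split_csv())'  # str.split_csv does not exist
    return 'print("a,b,c".split(","))'


final_source = generate_until_it_runs(fake_generate, "split a comma-separated line")
print(final_source)
```

The loop only establishes that the code executes without raising; whether it does the right thing is a separate question.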

I still need to carefully review the code, but the mistakes which cause it not to run at all are by far the least amount of work to identify.

mvdtnz 605 days ago

Yes I've seen the dreck you produce with LLMs. Not a shining endorsement in my eyes.

https://news.ycombinator.com/item?id=41929174

simonw 605 days ago

Which of those did you think were dreck?

I think the source code for tools like this one is genuinely good code: https://github.com/simonw/tools/blob/main/extract-urls.html

What do you see that's wrong with that?

mvdtnz 605 days ago

It's a toy. It doesn't do useful work. The code is fine for the pathetically small sample but that coding style does not scale to real software scales.

sdesol 604 days ago

> style does not scale to real software scales

I think those that dismiss AI completely will fall behind, and those that turn it into a crutch will pay for it in the years to come. I truly believe AI is game changing, as I used it to create standalone functions and get answers that saved me a day or two of research and reading. I've never worked with the cheerio library before but it answered everything I needed to know, among other things. It wasn't perfect though, as it (can't remember the model) wasted some time for me regarding the SQLite library for Node.js.

I think the issue we have right now is that we are treating LLMs as a final solution (mainly due to investors) instead of thinking of them as a new interface, with quirks that cannot be taken lightly. It's a bit extreme, but I think junior developers should not be allowed to use LLMs. An LLM is a Power Tool for developers who can easily spot BS and/or have the confidence and knowledge to fix the BS that gets missed.

simonw 604 days ago

The purpose of my "14 things I built in the last week" post was not to demonstrate large software - it was to show how the cost of building small applications has effectively fallen close to zero for me.

I can knock out small but useful applications in genuinely less time than it would take me to Google for an existing solution to the same problem.

You can call them dreck if you like. I call (most of) them useful solutions.

NitpickLawyer 604 days ago

|_____| <- dreck code

... ... ... |_____| <- it's good code, but toy problem

I guess we all see where the goalposts will be tomorrow. Good code, good problem, I don't like the language. Or something :)

mvdtnz 604 days ago

"Dreck" means worthless rubbish. Code that solves useless toy problems is worthless rubbish.
