by lgrapenthin 46 days ago
Indeed. Generated code is also harder to read because it violates all semantic expectations that rely on the mental model of a human author. A generated piece of code is linguistically plausible but often unknowingly imitates common idioms so incoherently that the actual bug may be accidentally disguised in a way no sane human (even a bad programmer) could have come up with.

Since LLMs have no internal evaluation, a reviewer has to account for that and evaluate line by line, rebuilding from scratch any hidden rationale and tacit knowledge the LLM didn't have in the first place - only to be misled into non-concerns that drain costly hours.

At this point, the investment is often deeper than writing from scratch.


I tried to capture some of my feelings on this in a recent personal blog post/rant. The easiest phrase is that LLMs are "legacy code as a service". They are trained on other people's legacy code. (No one is intentionally feeding LLMs their best proprietary code.) They produce output that is "Day 1 Legacy Code" in the sense that there's no human code owner to take responsibility, and while you might be able to ask the LLM that built it questions, it is easier to accept it as if the LLM that wrote it is no longer at the company (between context/memory limitations, regular model upgrades/retrainings, etc.).

But also, yeah, it starts to get worse than classic legacy code because you could try to build a theory of mind about the legacy code author(s). There were skills in trying to "mind read" a past generation. To find clues in poetry words more than the poetry form. (The variable names and whatever comments may have survived including commit logs; things written for humans to help explain the whys/hows, not just the whats.)

"legacy code as a service" - that's apt. But would they be better if they trained exclusively on 'good code'? I know I don't know the answer to that question and I get the feeling that few people actually understand how they work enough to feel comfortable with asserting that to be true.

Yeah, I still wouldn't trust them if they were training on more good code, either. I think I understand enough of how they work to believe that even given plenty of good code they won't be able to learn the parts that make good code truly good. That's where I start into poetry metaphors and that the best code is not just concerned with poetry forms (the rhythm and meter required by the language) nor the literal meaning (the compiler output) but also the human elements of the poem such as the creative storytelling and multiple levels of metaphors. I cannot see the current technology getting good at those human parts of the poetry, no matter how good they get at the literal and the form.

The problem there is the _large_ language model part, the density and the reinforcement of the weights. There's far less good code in the world. ;) These things emit code as well as I do, such as they do, only because they've inhaled essentially the totality of "code in general", not artisanal code.

> A generated piece of code is linguistically plausible but often unknowingly imitates common idioms so incoherently that the actual bug may be accidentally disguised in a way no sane human (even a bad programmer) could have come up with.

Came here to say this, but you said it for me. When you are an infinite code generator and your only parlour trick, your only hammer, is generation, and every nail is a problem of as-yet-insufficient generation, then generate you shall.

But the cognitive burden of metabolising this ultra-verbose, circuitous, often brute-force excreta is quite a bit higher than thumbing through a (competent) human's relatively terse approach.