HN Mirror

by indiv0 1 hour ago

This thread will become a typical "haha slop company made slop" but I've been bitten by a bug exactly like this before in a (pre-AI, artisan) OSS project. The maintainer there didn't properly account for DST when calculating the last backup time, so once the app started writing backups it rewrote them continuously and never stopped.
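For anyone who hasn't hit this class of bug: a minimal sketch of how it can happen, not the actual code from that project. The names `backup_due` and `BACKUP_INTERVAL` are hypothetical, and the one-hour interval is an assumption. The idea is to compare timezone-aware timestamps so a DST shift can't distort the elapsed time:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy: back up at most once per hour.
BACKUP_INTERVAL = timedelta(hours=1)


def backup_due(last_backup: datetime, now: datetime,
               interval: timedelta = BACKUP_INTERVAL) -> bool:
    """Return True when at least `interval` of real time has elapsed."""
    if last_backup.tzinfo is None or now.tzinfo is None:
        # Naive wall-clock times are ambiguous around DST transitions.
        raise ValueError("timestamps must be timezone-aware")
    elapsed = now.astimezone(timezone.utc) - last_backup.astimezone(timezone.utc)
    return elapsed >= interval


# A US "fall back" night, modelled with fixed offsets for illustration.
EDT = timezone(timedelta(hours=-4))
EST = timezone(timedelta(hours=-5))
last = datetime(2024, 11, 3, 1, 30, tzinfo=EDT)  # 05:30 UTC
now = datetime(2024, 11, 3, 1, 10, tzinfo=EST)   # 06:10 UTC

# Wall-clock subtraction says the last backup is 20 minutes in the future.
naive_elapsed = now.replace(tzinfo=None) - last.replace(tzinfo=None)
# Aware subtraction gives the real elapsed time of 40 minutes.
aware_elapsed = now - last
```

With naive local times the "last backup" can look like it happened in the future or an hour further in the past than it did, and scheduling logic built on that difference can misfire in either direction. Storing and comparing UTC (or aware) timestamps sidesteps it.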

Perhaps the framing shouldn't be "haha slop" but rather why doesn't the AI write better quality software than we do? To which the answer is obvious IMO -- even emergent properties can't elevate AI intelligence too far above the training dataset. So how do we get to superintelligent (or at least "not-wreck-your-NVMe-endurance-telligent") AI, if we, as a whole, are not smart enough ourselves?

Judge not the slop-bot, lest ye be judged yourself, engineer.

4 comments

sleples 1 hour ago

We've gone from "you're holding it wrong" to "the training data was bad because humans suck too". Difference is, humans learn from their mistakes.

SilverSlash 50 minutes ago

> Difference is, humans learn from their mistakes.

Great! So next time the human will prompt the agent to watch out for and avoid this bug.

ponector 33 minutes ago

You are a senior developer. Please do no mistakes!

xpct 18 minutes ago

Lack of accountability is the cause here. People don't think before hitting the 'Publish' button. Their managers let them off the hook because the culture still allows making egregious mistakes, as long as there's an LLM to blame.

applfanboysbgon 1 hour ago

1. I bet that developer only made that mistake one time in their life. Humans learn from their mistakes, LLMs don't. If you rely on LLMs to generate all of your code, you can expect to run into the same issues again and again.

2. "One developer somewhere in the world made a bad mistake one time, so this represents the quality of all software devs everywhere". Maybe they were just a bad developer? Bad developers exist. I have never written a bug that has destroyed my users' hardware, and I think that writing such a bug is completely inexcusable in an enterprise environment with software that will be shipped to millions of users, as Codex is.

matharmin 45 minutes ago

LLMs do learn from mistakes. Not as directly from individual mistakes as humans do, but in aggregate the models have improved much more in the last year than most humans I know learn in the same time.

xpct 11 minutes ago

I don't like the reframing of 'learning from mistakes' from a human-like, near instantaneous feedback loop, to a year-long process of retraining on many traces collected from user data. They're different concepts and we should refer to them using different phrasing.

lifthrasiir 1 hour ago

> I have never written a bug that has destroyed my users' hardware, ...

Probably whoever (human or agent) originally decided to put TRACE logs into SQLite also thought---or reasoned---so. Maybe the decision was right at the time, but the volume of TRACE logs has since increased enormously. You will never know.

applfanboysbgon 52 minutes ago

I love that we've moved the goalposts from "LLMs are better than artisanal software engineers" to "actually, shipping hardware-destroying bugs in production is literally unavoidable, nobody could possibly avoid doing it".

lifthrasiir 44 minutes ago

I only meant what I said. After all, the OP's thesis was that LLMs aren't better than artisanal software engineers, wasn't it? There was no goalpost to move, at least in this particular thread. And the solution might be another agent monitoring those oft-ignored signals.

da_grift_shift 51 minutes ago

What are your thoughts on the SNR of the linked GitHub issue threads? Consider the volume of comments posted and the substance of each comment.

fn-mote 22 minutes ago

I read the first page and they were excellent. Each was clearly written by an experienced dev who knows how to substantiate their claims and propose an acceptable fix that could just be merged.

Your comment, on the other hand, would be improved by including your own opinion on the matter.
