by aspenmartin 30 days ago
> Handing over software quality to the stochastic code extruder is causing a sharp drop in the quality of software put out into the world.

Well, first of all you and the author point to the same derisive comment of these models being, in your words "stochastic code extruder" or the one I have heard a lot "next-token predictors", and the connotation I read from these being that this makes them inherently dumb or unintelligent and I don't understand that. The fact that these "stochastic code extruders" can solve Erdos problems is sort of the proof in the pudding. Next token prediction is profound in that it is _a very simple objective_ yet it is _enough_ to take you to extraordinary heights.

Also I wonder how many folks honestly look in the mirror and think: how does the median programmer differ from an LLM. Do you really think humans are universally better and produce universally higher quality code? Not even universally, I would say _typically_. I would trust an LLM to not write a buffer overflow far more than a junior or a mediocre senior engineer. LLMs have built things in my domain that are non-trivial and impressive and correct.

Not to mention, these systems are following a predictable trend in performance improvement so these worries about quality just won't age well, and it seems to be a head-in-the-sand attitude that pretends like quality and reliability are not getting very very good _already_.

> Shipping poor quality and user hostile software actually hurts people.

Could not agree more. So why do you think humans are inherently better at this?

> This “inevitable” slide into generative AI harms every single person it comes into contact with.

I just don't quite understand this, is it that: (1) agentic code is inherently inferior to human code and thus (2) shipping agentic code is actively harmful?


It's like people complaining about "poor quality plastic trinkets" that replaced well-made household items. Of course high-quality things can be (and are) made of plastic. The problem is that you can still make a very cheap passable thing out of plastic, that would be uneconomical to make out of metal or wood.

Same with code: by using AI, one can very cheaply produce passable software trinkets that would be uneconomical to produce by paying poor-quality human developers.

The floor has moved downwards, making it possible to produce a flood of new, trash-quality, disposable code very cheaply. It does not mean that we'll have to use only that code. But unfortunately we'll have to live with it, too.

> It's like people complaining about "poor quality plastic trinkets" that replaced well-made household items.

Actually, people had to be made to see it as cheap so they would throw away and re-buy more.

https://desis.osu.edu/seniorthesis/index.php/2024/09/15/crea...

> “It was a really difficult sell to the American public in the post-war period, to inculcate people into a throwaway living,” she says. “That is not what people were used to.”

> A solution companies came up with was emphasizing that plastic was a low-cost, abundant material.

> A 1960 marketing study for Scott Cup said the containers were “almost indestructible,” but that the manufacturer could still convince people to discard them after a few uses. To counter any “pangs of conscience” consumers might feel about throwing them away, the researchers suggested a “direct attack”: Tell people the cups are cheap, they said, and that “there are more where these came from.”

> A few years later, Scott ran an advertisement saying its plastic cups were available at “toss-away prices.”

It wasn't plastic itself, and likewise it's not "AI" itself.

We do have an abysmal track record as industrialized nations however, and more recently, in many parts of the software industry.

But we can change it. With so many things, tech people spent so much time and energy debating... like cookies or HTTPS or whatever... we often heard/said that while we care so much about doing the right thing, we can't achieve anything at work because consumers don't care. Well, this time, pretty much all of the world cares a lot. I mean, the Vatican just blogged about it!

Maybe we just "have to live with it", but in that case, there is also no utility in pointing that out, since we literally have to live with it. And of course, it's really about the shape of the "it", and how it's used, not that there is one that will never go away. That is also true about most things: stuff we don't currently use is in the museum or text books. Nothing goes away away, but we no longer drink out of lead cups, even though we still use lead. We don't have x-ray machines in shoe shops, even though we still use x-rays.

I agree, "AI" is being pushed beyond reason, in an attempt to make it look like what it's not, or at least not yet. And yes, people do act irrationally due to that; see all that tokenmaxxing and the layoffs. I don't think it will last in this inebriated state forever.

But it's worth noting that there's a new substance enabled by AI, the "slop", that is flushing violently into our world right now. Pretending it's not there, and that we won't see more of it, is perilous.

> The fact that these "stochastic code extruders" can solve Erdos problems is sort of the proof in the pudding.

This claim is very misleading and not really true. It reflects the kind of exaggeration and spin made by corporate marketing. I would not call this a fact at all. Like many claims made by for-profit marketing, if one looks into the details and thinks critically about what is being claimed, one can see that consumers are jumping to false conclusions.

That said, it is very cool how an LLM helped human mathematicians in the recent specific Erdos problem solution announced by OpenAI. Just don't jump to the conclusion that anybody can input any Erdos problem into an LLM and a solution will come out the other end.

> exaggeration and spin made by corporate marketing.

Corporate marketing spins and hypes, but this is ultimately a pretty academic and mathematical field. The loud LinkedIn promoters are not building these systems.

> if one looks into the details and thinks critically about what is being claimed, one can see that consumers are jumping to false conclusions.

well then help us out here: can you be specific? To me it sounds a lot like goalpost moving. You're telling me that in 2020 if I showed you a system that can solve an Erdos problem or disprove a conjecture (just recently showed up) you wouldn't be blown away?

> That said, it is very cool how an LLM helped human mathematicians in the recent specific Erdos problem solution announced by OpenAI. Just don't jump to the conclusion that anybody can input any Erdos problem into an LLM and a solution will come out the other end.

Woah woah, that's not the conclusion I'm jumping to. That's not at all how these headlines happen. Solving problems like this is almost prohibitively expensive today, and the attempts more often than not lead nowhere. The point I'm making is that today, 4 years since ChatGPT, we have systems that can solve them and have solved them. First we had things like AIME and IMO benchmarks, then people said "well those are just cheats in the training data, wait for it to solve a real math problem" -- ok but now we're solving real math problems.

> well then help us out here: can you be specific?

In "Remarks on the disproof of the unit distance conjecture" (https://arxiv.org/abs/2605.20695) I think Melanie Matchett Wood's remark is the most informative: "It is easy to jump to hasty conclusions, but what we can learn about humans, AI, and mathematics from this development is somewhat subtle. I believe if the level and type of human expertise that is represented on this note had been assembled to find a counterexample to this conjecture a month ago, and those people put in similar amounts of time working on it than they did to reading and thinking about ChatGPT’s solution, the mathematicians would have found a counterexample. However, without the claimed proof by ChatGPT, there is no particular reason anyone would have tried to look for a counterexample, assembled a group of experts with the appropriate expertise, or that the experts would have agreed to turn their attention to this problem."

Some readers might find some of the other remarks more appealing or more informative. I encourage folks to read these remarks rather than the OpenAI marketing video and spin.

> To me it sounds a lot like goalpost moving. You're telling me that in 2020 if I showed you a system that can solve an Erdos problem or disprove a conjecture (just recently showed up) you wouldn't be blown away?

I'm not sure what goalpost you are talking about. Regarding 2020: it depends on the framing, how much I know about the conjecture, and the details of the computer system. I can easily imagine not being blown away. But I don't really see the relevance of our emotional reactions to computers doing new things we've never seen computers do before. If the goalpost is being "blown away" by what computers can do, then that happened, I think, around 1990 when I heard a computer program generate a vaguely human-sounding voice. In math, I think it happened when I saw Mathematica simplify a huge, nasty, complicated algebra expression around 2000. I've been "blown away" by new things computers can do many times over the past 36 years.

> Woah woah, that's not the conclusion I'm jumping to. ... ok but now we're solving real math problems.

Sounds like we are in agreement then that (1) LLMs cannot solve any given Erdos problem and (2) computers are solving more real math problems than they were before.

I honestly do think what the OpenAI group did with an LLM recently is a new milestone worthy of attention if one is interested in computer assisted mathematics. I don't mean to diminish the LLM feat. I just mean to throw shade on the corporate marketing, language, slick video, and spin.

> Also I wonder how many folks honestly look in the mirror and think: how does the median programmer differ from an LLM.

Once you step out of pure-software orgs, it becomes clear that most would benefit from having AI write code. There's a huge moat between most people and the point where they can afford/find the effort of someone that can write software.

These people, who only care about practical results rather than somewhat tangential things like "elegance" and "maintainability", are going to benefit tremendously.

>> Shipping poor quality and user hostile software actually hurts people.

> Could not agree more. So why do you think humans are inherently better at this?

Because humans are capable of empathy

Given the UIs I've experienced over the years, I'd dispute that assumption...

Why is that a prerequisite? There are entire philosophies about what makes good design for UIs etc. Why can't models figure this out? Why do you feel this is some sort of mystical thing out of reach?

If you think all of the complexity of the human experience is reducible to statistical weights between tokens, that’s fine. Go with God. I don’t think that it is.

What do you think humans are? What’s the mechanism by which we make decisions, learn, remember, etc? Why would this be anything more than a very complex system that has been slowly optimized via evolution to perform well in its environment over long time horizons?

I can trust a human to understand what a buffer overflow is and to learn.

I'd trust a junior to tell me that the code might be buggy. I'd trust an LLM to bullshit me about the code being flawless.

I'd review the code written by both.

> to the same derisive comment of these models being, in your words "stochastic code extruder"

So many excited and insulted LLM adopters on this thread. There is nothing derisive in that comment, it is simply the purest possible definition of how they work. Stochastics is a branch of maths you know.

> can solve Erdos problems is sort of the proof in the pudding

For the non-engineer, non-mathematician it may sound authoritative, but you'd probably be surprised to learn that most of Erdos problems are not at all complex - they are just not very interesting or relevant. So it is a proof in the pudding, provided the pudding is made of shit - the kind of stuff LLMs produce most of the time.

> I just don't quite understand this, is it that: (1) agentic code is inherently inferior to human code and thus (2) shipping agentic code is actively harmful?

Yes and yes - have you not heard of that AWS incident with Kiro when the "agentic" shit deleted an entire infrastructure environment, complete with data, config, etc.?

> Also I wonder how many folks honestly look in the mirror and think: how does the median programmer differ from an LLM

Apart from the obvious absurdity of this statement - I know a lot of you non-engineer types feel "empowered" by the LLMs, in the sense of how they immediately seem a genius when you ask them on a topic you are not expert in, but you may want to read a book on programming first - maybe you'll get a clue then.

> So many excited and insulted LLM adopters on this thread.

neither excited nor insulted.

> There is nothing derisive in that comment, it is simply the purest possible definition of how they work. Stochastics is a branch of maths you know.

Not sure what you mean by stochastics, but this is more statistics. They are trained with a next-token loss; that training objective doesn't define "how they work".

> For the non-engineer, non-mathematician it may sound authoritative, but you'd probably be surprised to learn that most of Erdos problems are not at all complex - they are just not very interesting or relevant.

It sounds like you are both an engineer and a mathematician? Can you confirm? These are problems unsolved for many years. You think no good mathematicians have taken a stab at them, even if just to say they have resolved an unsolved Erdos problem? Calling them "not at all complex" is quite an extraordinary claim. I'm wondering if you actually do have the pedigree you are trying to make it sound like you have, or if you are just regurgitating the same HN talking points you've heard.

> Yes and yes - have you not heard of that AWS incident with Kiro when the "agentic" shit deleted an entire infrastructure environment, complete with data, config, etc.?

And this means agentic code is inherently inferior to human code? Howso?

> Apart from the obvious absurdity of this statement - I know a lot of you non-engineer types feel "empowered" by the LLMs, in the sense of how they immediately seem a genius when you ask them on a topic you are not expert in, but you may want to read a book on programming first - maybe you'll get a clue then.

In the beginning you mentioned there were a lot of "excited and insulted LLM adopters" and yet... this sounds quite excited and defensive. Believe it or not, I am not a "non-engineer type", and it's telling that you assume people who don't share your opinion are somehow less qualified than I assume you think you are. Why is this statement obviously absurd? Maybe you work in a really talented engineering team, in which case kudos to you. I have also worked in teams like this, and I have also seen what the p50 engineer is like: just as error prone as Claude, or more. Thank you for the advice to read a book on programming as if that somehow would have any bearing on this at all?

> an engineer and a mathematician

An engineer with an engineering degree, which as it may still be known to some, requires a fairly stringent mathematical underpinning. So yes, I know a thing or two - read up on Erdos and his problems, I am not here to enlighten every vibecoding PM that shows up.

> And this means agentic code is inherently inferior to human code? Howso?

Again, I am not here to explain the world to some clueless PM. You have your LLMs for that :) But for the sake of bringing you closer, the "agentic" code is often very inferior, implementing happy paths or just bluntly exposing secrets in clear texts, etc. Probably a consequence of it being trained on, as you put it "p50 engineering code".

> Maybe you work in a really talented engineering team,

Running my own company and been paying the LLM-Shit-Generators for my whole team for a long time, in the hope they would bring the advertised benefits. Guess what - for serious use-cases, they bring shit and more shit.

> Thank you for the advice to read a book on programming as if that somehow would have any bearing on this at all?

Oh yeah obviously not, I mean, its not like understanding software development would help you understand how LLMs are not similar to a "p50 engineer" at all:). I'd take the latter over the former every time.

> Why is this statement obviously absurd

Well for one, LLMs are not humans, but it should be obvious to even to most cretinous of the e/acc crowd. It's not like they can think in abstract terms or come up with completely new concepts. But then again, don't mind me - if you can live with below average AI slop - go for it.

A really sincere piece of advice that I really hope you take to heart: not everyone who disagrees with you is simply beneath your genius. I am not a PM (and that is also quite insulting to some very competent and technical PMs I have worked with). I have an actual math degree, alongside a physics degree and a PhD in astrophysics from a strong department; I have worked in companies both at the MAANG scale and companies as small as 20 people for the last decade. It feels gross to have to type this, but evidently you seem blocked from considering other viewpoints because you think I am a "vibe coding PM". It's ok if you want to cling to this as a comfort, but just know it's a troubling way of going through life and also happens to be leading you astray in this particular case.

I don't see really any hard source at all here from you except anecdotes that you seem to hold in very high regard. I do see an incredible amount of condescension and chest pounding about what is ultimately a very technical and...ahem...mathematical topic. I don't know about you but I don't really see many conference paper reviews that start with "I am not here to enlighten every vibe coding PM that shows up". I am sure you would agree with me.

> But for the sake of bringing you closer, the "agentic" code is often very inferior, implementing happy paths or just bluntly exposing secrets in clear texts, etc. Probably a consequence of it being trained on, as you put it "p50 engineering code".

I do appreciate this tiny delicious gift of "bringing me closer" because it (1) answers my question about "inherent" properties of agentic systems by giving anecdotal examples of existing systems, and (2) completely misunderstands how agentic coding models are trained. Human code traces are a bootstrap for a stage of RL with verifiable rewards. Not having the same "you are too beneath me to explain my wrong opinions" attitude, I will genuinely explain a bit because this isn't as trivial and obvious as you make it sound, nor is it a giant pissing contest. Likely the most important property of coding agents, the one behind their existing and future success, is that they are not limited by the quality of human training data. The belief that they are limited by it seems to be a very common misconception, but this is, like you say, just math:

- Agentic coding models like Claude go through several complex training stages

- Pretraining which is kind of a compression step and gives them semantic understanding and a bit more

- Supervised fine tuning which gives them some task specific performance (this is where human traces and verified synthetic traces are used)

- Alignment to make them not give you meth recipes and to behave in the way you want agents to behave

- Reinforcement learning with verifiable rewards (RLVR): then they go forth and solve open-ended questions. RLVR is not new mathematics: we know what happens when you take RL with good rewards and throw a bunch of compute at it, and we've known that for decades now. This is where the "superhuman" performance comes in. It's not some "vibe coding PM" giving you an empty promise; it is the math that you and I so highly revere that promises you this.
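
To make "verifiable reward" a bit more concrete, here is a toy sketch of the kind of signal I mean: the reward comes from running the candidate code against tests, not from comparing it to human-written code. Everything here is hypothetical and made up for illustration (`verifiable_reward`, the test cases, the two candidate functions); it is not how any particular lab implements it, and a real setup would sandbox execution and feed the score into a policy update.

```python
from typing import Callable, List, Tuple

# Hypothetical names throughout: this is an illustrative toy, not any
# lab's actual training code.
TestCase = Tuple[tuple, object]  # (positional args, expected result)


def verifiable_reward(candidate: Callable, tests: List[TestCase]) -> float:
    """Return the fraction of test cases the candidate passes (0.0 to 1.0)."""
    if not tests:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            # A crashing candidate simply fails that test case.
            pass
    return passed / len(tests)


def buggy_max(xs):
    # Happy-path only: wrong for all-negative lists and for empty lists.
    best = 0
    for x in xs:
        if x > best:
            best = x
    return best


def correct_max(xs):
    # Returns None for an empty list, matching the test cases below.
    return max(xs) if xs else None


TESTS: List[TestCase] = [
    (([1, 5, 3],), 5),
    (([-4, -2, -9],), -2),
    (([],), None),
    (([7],), 7),
]

print(verifiable_reward(buggy_max, TESTS))
print(verifiable_reward(correct_max, TESTS))
```

The happy-path version scores lower than the correct one without any human having to label either of them, which is the whole point: the reward is checked, not imitated.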

> Running my own company and been paying the LLM-Shit-Generators for my whole team for a long time, in the hope they would bring the advertised benefits. Guess what - for serious use-cases, they bring shit and more shit.

This sounds like the experience I would mostly expect from a small company adopting Claude, it is not magic nor is it at the point where you can blindly trust it to not mess something up. It will waste your time. I find it kind of doubtful it has not given you any benefits, but I'm not sitting where you're sitting so I can't refute your experience. People talk resentfully about "advertised benefits" but then never cite what advertised benefits they interpreted these systems as having. Do you have like a quote or something that you can point at?

> Oh yeah obviously not, I mean, its not like understanding software development would help you understand how LLMs are not similar to a "p50 engineer" at all:). I'd take the latter over the former every time.

Maybe I misinterpreted you: I found you telling me to "read a book" to be more of a dismissive, condescending comment, but maybe you mean it sincerely, in which case, sure, I will continue to read programming books and follow the published work in the field as I have done for years now.

> Well for one, LLMs are not humans, but it should be obvious to even to most cretinous of the e/acc crowd. It's not like they can think in abstract terms or come up with completely new concepts. But then again, don't mind me - if you can live with below average AI slop - go for it.

I do agree with you that LLMs are not humans but when you say this is obvious and then don't back it up, that is really not convincing. I think you overestimate the capability of human beings and underestimate the asymptotic capabilities of these systems. Their performance improvements are predictable and these predictions continue to hold. It seems the burden of proof is on you to explain why we should expect some sort of fundamental limit to these capabilities and where those fundamental limits would arise. I'm not aware of very many.

> have an actual math degree, alongside a physics degree and a PhD in astrophysics from a strong department

Good for you, I suppose, but all it tells me is that you have probably not developed software professionally - after all, PhDs in astrophysics "from a strong department" rarely end up in commercial software development...

> This sounds like the experience I would mostly expect from a small company adopting Claude

Who said it was a small company? You're making too many assumptions buddy :)

> will genuinely explain a bit because this isn't as trivial and obvious as you make it sound

It is literally the same technology developed in the 1940s mate, adding more GPUs will not magically make it become a god-in-the-box. How fucking innovative can you still claim it to be?

> I think you overestimate the capability of human beings and underestimate the asymptotic capabilities of these systems

Right, remember when LLMs constructed the rockets and modules for landing on the moon, using practically just the logarithmic tables? Or when they invented the vaccine? How about X-rays? Cars? Aeroplanes? You don't? Oh right, me neither! We must be downplaying their nonexistent "capabilities". And the use of the word "asymptotic" is absolutely not conveying the meaning you think it does.

> Do you have like a quote or something that you can point at?

Well, how about the CEOs of companies claiming to be worth 1T and upwards, stating that their products have almost superhuman intelligence? PhDs in the pocket etc?