| HN Mirror

by petcat 99 days ago

As human developers, I think we're struggling with "letting go" of the code. The code we write (or agents write) is really just an intermediate representation (IR) of the solution.

For instance, GCC will inline functions, unroll loops, and apply myriad other optimizations that we don't care about (and actually want!). But when we review the ASM that GCC generates, we are not concerned with the "spaghetti" and the "high coupling" and "low cohesion". We care that it works, and is correct for what it is supposed to do.

Source code in a higher-level language is not really different anymore. Agents write the code, maybe we guide them on patterns and correct them when they are obviously wrong, but the code is just the work-item artifact that comes out of extensive specification, discussion, proposal review, and more review of the reviews.

A well-guided, iterative process and problem/solution description should be able to generate an equivalent implementation whether a human is writing the code or an agent.

7 comments

sarchertech 99 days ago

A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent. It can do this because it is translating from one formal language to another.

Translating a natural prompt on the other hand requires the LLM to make thousands of small decisions that will be different each time you regenerate the artifact. Even ignoring non-determinism, prompt instability means that any small change to the spec will result in a vastly different program.

A natural language spec and test suite cannot be complete enough to encode all of these differences without being at least as complex as the code.

Therefore each time you regenerate large sections of code without review, you will see scores of observable behavior differences that will surface to the user as churn, jank, and broken workflows.

Your tests will not encode every user workflow, not even close. Ask yourself if you have ever worked on a non trivial piece of software where you could randomly regenerate 10% of the implementation while keeping to the spec without seeing a flurry of bug reports.
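To make the point about regenerated code concrete, here is a minimal sketch. The function names and data are hypothetical, not from the thread. Both implementations satisfy the spec "return the users sorted by age" and pass the same test, yet they differ in behavior a user could observe.

```python
# Hypothetical example: two implementations of the same spec,
# "return the users sorted by age (youngest first)".

def sort_users_v1(users):
    # Stable sort: users with equal ages keep their original order.
    return sorted(users, key=lambda u: u["age"])

def sort_users_v2(users):
    # Also sorts by age, but breaks ties by name, so users with
    # equal ages may come out in a different order than in v1.
    return sorted(users, key=lambda u: (u["age"], u["name"]))

USERS = [
    {"name": "Zoe", "age": 30},
    {"name": "Adam", "age": 30},
    {"name": "Mia", "age": 25},
]

def spec_test(sort_fn):
    # The only property the spec (and this test) pins down: ages ascend.
    ages = [u["age"] for u in sort_fn(USERS)]
    return ages == sorted(ages)

v1_names = [u["name"] for u in sort_users_v1(USERS)]
v2_names = [u["name"] for u in sort_users_v2(USERS)]

print(spec_test(sort_users_v1), spec_test(sort_users_v2))  # both pass
print(v1_names)  # ['Mia', 'Zoe', 'Adam']
print(v2_names)  # ['Mia', 'Adam', 'Zoe']
```

Swapping one implementation for the other keeps the test suite green while reordering what the user sees, which is the kind of unencoded behavior the comment describes.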

This may change if LLMs improve such that they are able to reason about code changes to the degree a human can. As of today they cannot do this and require tests and human code review to prevent them from spinning out. But I suspect at that point they’ll be doing our job, as well as the CEO’s, and we’ll have bigger problems.

LogicFailsMe 99 days ago

I don't see a world where a motivated soul can build a business from a laptop and a token service as a problem. I see it as opportunity.

I feel similarly about Hollywood and the creation of media. We're not there in either case yet, but we will be. That's pretty clear. And when I look at the feudal society that is the entertainment industry here, I don't understand why so many of the serfs are trying to perpetuate it in its current state. And I really don't get why engineers think this technology is going to turn them into serfs unless they let that happen to themselves. If you can build things, AI coding agents will let you build faster and more for the same amount of effort.

I am assuming given the rate of advance of AI coding systems in the past year that there is plenty of improvement to come before this plateaus. I'm sure that will include AI generated systems to do security reviews that will be at human or better level. I've already seen Claude find 20 plus-year-old bugs in my own code. They weren't particularly mission critical but they were there the whole time. I've also seen it do amazingly sophisticated reverse engineering of assembly code only to fall over flat on its face for the simplest tasks.

sarchertech 99 days ago

That depends on how fast that change happens. If 45% of jobs evaporate in a 5-year period, a complete societal collapse is the likely outcome.

LogicFailsMe 99 days ago

Sounds like influencer nonsense to me. Touch grass. If the people are fed and housed, there's no collapse. And if the billionaire class lets them starve, they will finally go through some things just like the aristocracy in France once did. And I think even Peter Thiel is smarter than that. You can feed yourself for <$1000 a year on beans and rice. Not saying you'd enjoy it, but you won't starve. So for ~$40B annually, the billionaires buy themselves revolution insurance. Fantastic value.

OTOH if what you're really talking about is the long-term collapse in our ludicrous carbon footprint when we finally run out of fossil fuels and we didn't invest in renewables or nuclear to replace them, well, I'm with you there.

sarchertech 99 days ago

>Sounds like influencer nonsense to me. Touch grass.

I don't even know what this means.

The worst unemployment during the Weimar Republic was 25-30%. Unemployment in the Great Depression peaked at 25%.

So yeah if we get to 45% unemployment and those are the highest paying jobs on average then yeah it's gonna be bad. Then you add in second order effects where none of those people have the money to pay the other 55% who are still employed.

We might get to a UBI relatively quickly and peacefully. But I'm not betting on it.

>finally go through some things just like the aristocracy in France once did.

Yeah that's probably the most likely scenario, but that quickly devolved into death and imprisonment for far more than the aristocrats and eventually ended with Napoleon trying to take over Europe and millions of deaths overall.

The world didn't literally end, but it was 40 years of war, famine, disease, and death, and not a lot of time to think about starting businesses with your laptop.

LogicFailsMe 99 days ago

And the dark ages lasted a millennium. Sounds like quite an improvement on that. And if America didn't want a society hellbent on living the worst possible timeline, why did it re-elect President Voldemaga and give him the football? And then, even when he breaks nearly every political promise, his support remains better than his predecessor? Anyway, I think the richest ~1135 Americans won't let you starve, but they'll be happy to watch you die young of things that had stopped killing people for quite some time whilst they skim all the cream. And that seems to be what the plurality wants or they'd vote differently.

The good news is that America is ~5% of the world. And the more we keep punching ourselves in the face, the better the chance someone else pulls ahead. But still, we have nukes, so we're still the town bully for the immediate future.

jplusequalt 99 days ago

>You can feed yourself for <$1000 a year on beans and rice. Not saying you'd enjoy it, but you won't starve. So for ~$40B annually, the billionaires buy themselves revolution insurance. Fantastic value.

You are the epitome of the tech bro.

LogicFailsMe 99 days ago

Sure, sure. Understanding how these sociopaths think clearly makes me a tech bro rather than someone who incorporates worst-case scenarios into my planning. Suggesting they would maintain minimum viable society to save their own asses means I'm in favor of it, right? This is why I work remotely.

bhaak 99 days ago

Peter Thiel might be smarter than that but I’m not sure about the other ones.

Look how Musk treated the Twitter devs or Bezos any of his workers or Trump anybody.

LogicFailsMe 99 days ago

They're all quite intelligent. And they're world class experts in saving their own bacon. Doesn't mean they have any ethics though nor any emotional intelligence after decades of being surrounded by toadies and bootlickers.

jplusequalt 99 days ago

>If you can build things, AI coding agents will let you build faster and more for the same amount of effort.

But you aren't building, your LLM is. Also, you are only thinking about the ways that you, a supposed builder, will benefit from this technology. Have you considered how all previous waves of new technologies have introduced downstream effects that have muddied our societies? LLMs are not unique in this regard, and we should be critical of those who are trying to force them into every device we own.

raw_anon_1111 99 days ago

Would you say the general contractor for your home isn’t a builder because he didn’t install the toilets?

jplusequalt 99 days ago

I think this argument would make more sense if you were talking about an architect, or the customer.

A contractor is still very much putting the house together.

raw_anon_1111 99 days ago

The general contractor is not doing the actual building as much as he is coordinating all of the specialists, making sure things run smoothly, scheduling things based on dependencies, and coordinating with the customer. I’ve had two houses built from the ground up.

LogicFailsMe 99 days ago

I think that's precisely his thinking and don't let him know about all those fancy expensive unitasker tools they have that you probably don't that let them do it far more cost effectively and better than the typical homeowner. Won't you think of the jerbs(tm)? And to Captain dystopia, life expectancies were increasing monotonically until COVID. Wonder what changed?

k3nx 99 days ago

I've struggled a bit with this myself. I'm having a paradigm shift. I used to say "but I like writing code". But like the article says, that's not really true. I like building things, the code was just a way to do that. If you want to get pedantic, I wasn't building things before AI either, the compiler/linker was doing that for me. I see this is just another level of abstraction. I still get to decide how things work, what "layers" I want to introduce. I still get to say, no, I don't like that. So instead of being the "grunt", I'm the designer/architect. I'm still building what I want. Boilerplate code was never something I enjoyed before anyway. I'm loving (like actually giggling) having the AI tie all the bits for me and getting up and running with things working. It reminds me of my Delphi days: File->New Project, and you're ready to go. I think I was burnt out. AI is helping me find joy again. I also disable AI in all my apps as well, so I'm still on the fence about several things too.

andrekandre 94 days ago

  > I'm having a paradigm shift. I used to say "but I like writing code". But like the article says, that's not really true. I like building things, the code was just a way to do that.

i get this; for me i find coding is as fun as video games so i don't personally want to turn all code over to ai, but what i DO WANT is for it to automate away the drudgery of repeating actions and changes (or when i get stuck be a rubber duck for me)... i want to focus my creativity on the interesting parts myself and learn and grow into a better programmer... it may sound crazy but programming is relaxing for me lol

druide67 99 days ago

This resonates. I spent years thinking I enjoyed coding, but what I actually enjoy is designing elegant solutions built on solid architecture. Inventing, innovating, building progressively on strong foundations. The real pleasure is the finished product (is it ever really finished though?) — seeing it's useful and makes people's lives easier, while knowing it's well-built technically. The user doesn't see that part, but we know.

With AI, by always planning first, pushing it to explore alternative technical approaches, making it explain its choices — the creative construction process gets easier. You stay the conductor. Refactoring, new features, testing — all facilitated. Add regular AI-driven audits to catch defects, and of course the expert eye that nothing replaces.

One thing that worries me though: how will junior devs build that expert eye if AI handles the grunt work? Learning through struggle is how most of us developed intuition. That's a real problem for the next generation.

petcat 99 days ago

> A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent.

Here are the reported miscompilation bugs in GCC so far in 2026. The ones labeled "wrong-code".

https://gcc.gnu.org/bugzilla/buglist.cgi?chfield=%5BBug%20cr...

I count 121 of them.

sarchertech 99 days ago

If you can’t understand the difference between a bug that will rarely cause a compiler encountering an edge case to generate a wrong instruction and an LLM that will generate 2 completely different programs with zero overlap because you added a single word to your prompt, then I don’t know what to tell you.

petcat 99 days ago

The point is that expert humans (the GCC developers) writing code (C++) that generates code (ASM) does not appear to be as deterministic as you seem to think it is.

sarchertech 99 days ago

I’m very aware of that, but I’m also aware that it’s rare enough that the compiler doesn’t emit semantically equivalent code that most people can ignore it. That’s not the case with LLMs.

I’m also not particularly concerned with non-determinism but with chaos. Determinism in LLMs is likely solvable, prompt instability is not.

jplusequalt 99 days ago

Classic HN-ism. To focus on the semantics of a statement while ignoring the greater point in order to argue why someone is wrong.

anthonyrstevens 99 days ago

I think it's a perfectly fine point. The OP said (my interpretation) that LLMs are messy, non-deterministic, and can produce bad code. The same is true of many humans, even those whose "job" is to produce clean, predictable, good code. The OP would like the argument to be narrowly about LLMs, but the bigger point even is "who generates the final code, and why and how much do we trust them?"

petcat 99 days ago

I argued the greater point? Software code-generation is not deterministic, whether it's done by expert humans or by LLMs.

jcranmer 99 days ago

Compilers are some of the largest, most complex pieces of software out there. It should be no surprise that they come with bugs as all other large, complex pieces of software do.

Kye 99 days ago

This seems to apply easily to LLMs as language coprocessors that can output code. How long was it before people trusted compilers?

sarchertech 99 days ago

If you don't understand the difference between something that rigorously translates one formal language to another one and something that will spit out a completely different piece of software with 0 lines of overlap based on a one word prompt change, I don't know what to tell you.

anthonyrstevens 99 days ago

"rigorously" is doing a lot of heavy lifting here.

raw_anon_1111 99 days ago

As if when you delegate tasks to humans they are deterministic. I would hope that your test cases cover the requirements. If not, your implementation is just as brittle when other developers come online or even when you come back to a project after six months.

sarchertech 99 days ago

1. Agents aren’t humans. A human can write a working 100k LOC application with zero tests (not saying they should but they could and have). An agent cannot do this.

Agents require tests to keep them from spinning out and your tests do not cover all of the behaviors you care about.

2. If you doubt that your tests fail to cover all your requirements, consider that 99.9% of the production bugs you’ve ever had passed your test suite completely.

raw_anon_1111 99 days ago

I have never known a human that could or did write 100K lines of bug free working code without running parts of it first and testing.

So humans also don’t write bug free code or tests that cover all use cases - how is that an argument that humans are better?

sarchertech 99 days ago

My claim is not that humans can write 100k-line programs bug free or without running parts of them.

An AI cannot write a 100k line program on its own without external guard rails otherwise it spins out. This has nothing to do with whether the agent is allowed to run the code itself. This is well documented. Look at what was required to allow Claude to write a "C compiler".

This has nothing to do with whether it's bug free. It literally can't produce a working 100k LOC program without external guardrails.

raw_anon_1111 99 days ago

Absolutely no one is arguing that you shouldn’t have a combination of manual and automated tests around either AI or human generated code, or that you shouldn’t have a thoughtful design.

throwaw12 99 days ago

Valid points. But a crucial part of not "letting go" of the code is that we are responsible for that code at the moment.

If, in the future, LLM providers take ownership of our on-call duty for the code they have produced, I would write an "AUTO-REVIEW-ACCEPTER" bot to accept everything and deploy it to production.

If the company requires me to own something, then I should know what that thing is, understand its ins and outs in detail, and be able to adjust quickly when things go wrong.

raw_anon_1111 99 days ago

In the past ten years as a team lead/architect/person who was responsible for outsourced implementations (ie Salesforce/Workday integrations, etc), I’ve been responsible for a lot of code I didn’t write. What sense would it have made for me to review the code of the web front end of the web developer for best practices when I haven’t written a web app since 2002?

throwaw12 99 days ago

as a team lead, if you are not aware of what's happening in the team, what kind of team lead is this?

on the other hand, you may have been an engineering manager, who is responsible for the team, but a lot of times they do not participate in on-call rotations (only as last escalation)

IanCal 98 days ago

> what kind of team lead is this?

One that trusts the team?

Knowing what's happening in the team and personally reviewing parts of the code for best practices are very different things. Are the other team members happy? Does development seem to go smoothly, quickly and without constantly breaking? Does the team struggle to upgrade or refactor things? At some level you have to start trusting that the people working know what they're doing, and help guide from a higher level so they understand how to make the right tradeoffs for the business.

raw_anon_1111 99 days ago

As a team lead, I know the architecture and the functional and non-functional requirements. I know the website is supposed to do $x, but I definitely didn’t guide how, since I haven’t done web development in a quarter century. I know the best practices for architecture and data engineering (to a point).

That doesn’t mean I did a code review for all of the developers. I will ask them how they solved a problem that I know can be tricky, or whether they took something into account.

jmalicki 99 days ago

I've actually found that well-written well-documented non-spaghetti code is even more important now that we have LLMs.

Why? Because LLMs can get easily confused, so they need well written code they can understand if the LLM is going to maintain the codebase it writes.

The cleaner I keep my codebase, and the better (not necessarily more) abstracted it is, the easier it is for the LLM to understand the code within its limited context window. Good abstractions help the right level of understanding fit within the context window, etc.

I would argue that the use of LLMs changes what good code is, since "good" now means you have to meaningfully fit good ideas in chunks of 125k tokens.

raw_anon_1111 99 days ago

I somewhat agree. But that’s more about modularity. It helps when I can just have Claude code focus on one folder with its own Claude file where it describes the invariants - the inputs and outputs.

sarchertech 98 days ago

If you don’t read the code how the heck do you know anything about modularity? How do you know that Module A doesn’t import module B, run the function but then ignore it and implement the code itself? How do you even know it doesn’t import module C?

Claude code regularly does all of these things. Claude code really really likes to reimplement the behavior in tests instead of actually exercising the code you told it to btw. Which means you 100% have to verify the test code at the very least.
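A minimal sketch of the failure mode described above, with hypothetical names and a deliberately planted bug: the first check recomputes the expected behavior itself and never calls the code under test, so it passes regardless; the second actually exercises the function and catches the bug.

```python
# Hypothetical code under test (name and bug are illustrative only).
def apply_discount(price, percent):
    # Deliberate bug: subtracts the percentage as an absolute amount.
    return price - percent

def check_reimplements_logic():
    # Anti-pattern: the check recomputes the expected behavior itself and
    # never calls apply_discount, so it passes no matter what the code does.
    price, percent = 200, 10
    result = price * (100 - percent) // 100
    return result == 180

def check_exercises_code():
    # Calls the real function, so the bug above is detected.
    return apply_discount(200, 10) == 180

print(check_reimplements_logic())  # True, even though the code is wrong
print(check_exercises_code())      # False, the bug is caught
```

This is why reviewing the test code matters even when the implementation itself goes unread: a green run from the first kind of check says nothing about the implementation.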

raw_anon_1111 98 days ago

Well I know because my code is in separately deployed Lambdas that are either zip files uploaded to Lambda or Docker containers run on Lambda that only interact via API Gateway, a Lambda invoke, SNS -> SQS to Lambda, etc. and my IAM roles are narrowly defined to only allow Lambda A to interact with just the Lambdas I tell it to.

And if Claude tried to use an AWS service in its code that I didn’t want it to use, it would have to also modify the IAM IaC.
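As a rough illustration of a narrowly scoped role, the sketch below builds an IAM identity policy that lets one function invoke exactly one other function. The account ID, region, and function names are hypothetical, and `allows_invoke` is a simplified stand-in for illustration, not how IAM actually evaluates policies.

```python
import json

# Hypothetical account ID, region, and function names, for illustration only.
LAMBDA_B_ARN = "arn:aws:lambda:us-east-1:123456789012:function:lambda-b"
LAMBDA_C_ARN = "arn:aws:lambda:us-east-1:123456789012:function:lambda-c"

# Identity policy attached to Lambda A's execution role:
# it may invoke Lambda B and nothing else.
LAMBDA_A_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": LAMBDA_B_ARN,
        }
    ],
}

def allows_invoke(policy, function_arn):
    # Simplified check for this sketch: exact-match Allow statements only.
    # Real IAM evaluation also handles wildcards, explicit Deny, and conditions.
    for stmt in policy["Statement"]:
        if (
            stmt["Effect"] == "Allow"
            and stmt["Action"] == "lambda:InvokeFunction"
            and stmt["Resource"] == function_arn
        ):
            return True
    return False

print(json.dumps(LAMBDA_A_POLICY, indent=2))
print(allows_invoke(LAMBDA_A_POLICY, LAMBDA_B_ARN))  # True
print(allows_invoke(LAMBDA_A_POLICY, LAMBDA_C_ARN))  # False
```

With a policy like this, generated code that reaches for a different function or service fails at runtime unless the policy definition is changed too, and that change is small and easy to spot in review.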

In some cases the components are in completely separate repositories.

It’s the same type of hard separation I did when there were multiple teams at the company where I was the architect. It was mostly Docker/Fargate back then.

Having separately defined services with well defined interfaces does an amazing job at helping developers ramp up faster and it reduces the blast radius of changes. It’s the same with coding agents. Heck back then, even when micro services shared the same database I enforced a rule that each service had to use a database role that only had access to the tables it was responsible for.

I have been saying repeatedly I focus on the tests and architecture and I mentioned in another reply that I focus on public interface stability with well defined interaction points between what I build and the larger org - again just like I did at product companies.

There is also a reason that at the seven companies I went into before consulting (including GE when it was still an F10 company), I was almost always coming into new initiatives where I could build/lead the entire system from scratch or could separate out the implementation from the larger system with well defined inputs and outputs. It wasn’t always micro services. It might have been separate packages/namespaces with well defined interfaces.

Yeah my first job out of college was building data entry systems in C from scratch for a major client that was the basis of a new department for the company.

And it’s what Amazon internally does (not Lambda micro services) and has since Jeff Bezos’s “API Mandate” in 2002.

sarchertech 97 days ago

This sounds like an absolute hellscape of an app architecture but you do you. It also doesn’t stop anything except Module A importing C without you knowing about it. It doesn’t stop Module A from just copy-pasting the code from C and saying it’s using B.

>almost always coming into new initiatives

That says a lot about why you are so confident in this stuff.

raw_anon_1111 97 days ago

Yes microservice based architecture is something no modern company does…

Including the one that you were so confident doesn’t do it even though you never worked there…

Yet I don’t suffer from spooky action at a distance and a fear of changes because my testing infrastructure is weak…

Either I know what I’m doing or I’ve bullshitted my way into multiple companies into hiring me to lead architecture and/or teams from 60 person startups to the US’s second largest employer.

Did I mention that one of those companies, the one that acquired the startup I worked for before going to BigTech, reached out to me to be the architect overseeing all of their acquisitions and to integrate them, based on the work I did? I didn’t accept the offer. I’ve done the “work for a PE-owned company that was getting bigger by acquiring other companies and lead the integration” thing before.

So they must have been impressed with the long term maintenance of the system to ask me back almost four years after I left

krilcebre 99 days ago

You are comparing compilers to a completely non deterministic code generation tool that often does not take observable behavior into account at all and will happily screw a part of your system without you noticing, because you misworded a single prompt.

No amount of unit/integration tests cover every single use case in sufficiently complex software, so you cannot rely on that alone.

raw_anon_1111 99 days ago

I just rewrote a utility for the third time - the first two were before AI.

Short version, when someone designs a call center with Amazon Connect, they use a GUI flowchart tool and create “contact flows”. You can export the flow to JSON. But it isn’t portable to other environments without some remapping. I created a tool before that used the API to export it and create a portable CloudFormation template.

I always miss some nuance; about half can be caught by calling the official CloudFormation linter and the other half by actually deploying it and seeing what errors you get.

This time, I did it with Claude Code. Ironically enough, it knew some of the complexity because it had been trained on one of the older open source implementations I wrote while at AWS. But I told it to read the official CloudFormation spec and, after every change, to test it with the linter, try to deploy it, and fix whatever failed.

Again, I didn’t care about the code - I cared about results. The output of the script either passes the deployment or it doesn’t. Claude iterated until it got it right based on “observable behavior”. Claude has tested whether my deployments were working as expected plenty of times by calling the appropriate AWS CLI command and fixed things or reading from a dev database based on integration tests I defined.
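The lint, deploy, fix loop described above can be sketched generically. Everything here is a stand-in under stated assumptions: `lint`, `deploy`, and `fix` are stubs written for this example, not calls to the real linter, AWS, or an agent.

```python
def iterate_until_passing(template, checks, fix, max_rounds=5):
    """Run each check in order; on the first failure, apply a fix and retry.

    `checks` is a list of callables returning an error string or None.
    `fix` takes (template, error) and returns a revised template.
    Both stand in for real tools (a linter, a deployment, an agent's edit).
    """
    for round_number in range(1, max_rounds + 1):
        error = None
        for check in checks:
            error = check(template)
            if error is not None:
                break
        if error is None:
            return template, round_number
        template = fix(template, error)
    raise RuntimeError(f"still failing after {max_rounds} rounds")

# Stub checks and fixer, purely illustrative.
def lint(template):
    return None if "Resources" in template else "lint: missing Resources"

def deploy(template):
    return None if template.get("Resources") else "deploy: no resources defined"

def fix(template, error):
    revised = dict(template)
    if error.startswith("lint"):
        revised["Resources"] = {}
    else:
        revised["Resources"] = {"Flow": {"Type": "AWS::Connect::ContactFlow"}}
    return revised

final, rounds = iterate_until_passing({}, [lint, deploy], fix)
print(rounds)  # 3
print(final)
```

The loop only terminates on observable results (every check returns no error), which mirrors the "passes the deployment or it doesn't" criterion; the `max_rounds` cap is an added safeguard so a fixer that never converges cannot run forever.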

mikeocool 99 days ago

When requirements change, a compiler has the benefit of not having to go back and edit the binary it produced.

Maybe we should treat LLM-generated code similarly: just generate everything fresh from the spec anytime there’s a change, though personally I haven’t had much success with that yet.

raw_anon_1111 99 days ago

It very much does have to modify the binary it produced to create new code. The entire Linux kernel has an unstable ABI where you have to recompile your code to link to system libraries.

icedchai 98 days ago

The Linux userspace ABI is actually quite stable and rarely changes. If this wasn't true, every time you installed a new kernel you'd have to upgrade / reinstall everything else, including the C compiler, glibc, etc. This does not happen.

The Linux kernel ABI (kernel modules, like device drivers) on the other hand, is unstable and closely tied to the kernel version. Most people do not write kernel modules, so generally not an issue. (I did, many years ago.)

raw_anon_1111 98 days ago

Isn’t that the reason that Android phones have piss poor support after being released?

anyonecancode 98 days ago

That may be the future, but we're not there yet. If you're having the LLM write to a high level language, e.g. Java, JavaScript, Python, etc., at some point there will be a bug or other incident that requires a human to read the code to fix it or make a change. Sure, that human will probably use an LLM as part of that, but they'll still need to be able to tell what the code is doing, and LLMs simply are not reliable enough yet that you can just blindly have them read the code, change it, and trust them that it's correct, secure, and performant. Sure, you can focus on writing tests and specs to verify, but you're going to spend a lot more time going in agentic loops trying to figure out why things aren't quite right vs a human actually being able to understand the code and give the LLM clear direction.

So long as this is all true, then the code needs to be human readable, even if it's not human-written.

Maybe we'll get to the point that LLMs really are equivalent to compilers in terms of reliability -- but at that point, why would we have them write in Java or other human-readable languages? LLMs would _be_ a compiler at that point, with a natural-language UI, outputting some kind of machine code. Until then, we do need readable code.

raw_anon_1111 98 days ago

Me: My code isn’t giving the expected result $y when I do $x.

Codex: runs the code, reproduces the incorrect behavior I described, finds the bug, reruns the code, and gets the result I told it I expected. It iterates until it gets it right and runs my other unit and integration tests.

This isn’t rocket science.

AstroBen 99 days ago

This is fantasy completely disconnected from reality.

Have you ever tried writing tests for spaghetti code? It's hell compared to testing good code. LLMs require a very strong test harness or they're going to break things.

Have you tried reading and understanding spaghetti code? How do you verify it does what you want, and none of what you don't want?

Many code design techniques were created to make things easy for humans to understand. That understanding needs to be there whether you're modifying it yourself or reviewing the code.

Developers are struggling because they know what happens when you have 100k lines of slop.

If things keep speeding in this direction we're going to wake up to a world of pain in 3 years and AI isn't going to get us out of it.

raw_anon_1111 99 days ago

I’ve found much more utility even pre AI in a good suite of integration tests than unit tests. For instance if you are doing a test harness for an API, it doesn’t matter if you even have access to the code if you are writing tests against the API surface itself.
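A minimal sketch of testing against the API surface only. The service, its `/health` route, and the `call` helper are all hypothetical; the check treats the service as an opaque WSGI callable and asserts only on the status and body it returns, never on its internals.

```python
import json
from wsgiref.util import setup_testing_defaults

# Hypothetical service: the checks below treat it as an opaque WSGI callable.
def app(environ, start_response):
    if environ["PATH_INFO"] == "/health":
        body = json.dumps({"status": "ok"}).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
    else:
        body = b"not found"
        start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [body]

def call(wsgi_app, path):
    # Minimal in-process client: builds a GET request, returns (status, body).
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body

status, body = call(app, "/health")
missing_status, _ = call(app, "/nope")

print(status, json.loads(body))  # 200 OK {'status': 'ok'}
print(missing_status)            # 404 Not Found
```

Because the checks depend only on the request and response contract, the implementation behind `app` could be rewritten, by a person or an agent, and the same checks would still apply unchanged.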

AstroBen 99 days ago

I do too, but it comes from a bang-for-your-buck and not a test coverage standpoint. Test coverage goes up in importance as you lean more on AI to do the implementation IMO.