Hacker News
by lacunary 167 days ago
in my experience, what happens is the code base starts to collapse under its own weight. it becomes impossible to fix one thing without breaking another. the coding agent fails to recognize the global scope of the problem and tries local fixes over and over. progress gets slower, new features cost more. all the same problems faced by an inexperienced developer on a greenfield project!

has your experience been otherwise?

5 comments

Right, I am a daily user of agentic LLM tools and have this exact problem in one large project that has complex business logic externally dictated by real world requirements out of my control, and let's say, variable quality of legacy code.

I remember when Gemini Pro 3 was the latest hotness and I started to get FOMO seeing demos on X posted to HN showing it one-shotting all sorts of impressive stuff. So I tried it out for a couple of days in Gemini CLI/OpenCode and ran into the exact same pain points I was dealing with using CC/Codex.

Flashy one-shot demos of greenfield prompts are a natural hype magnet, so they get lots of attention, but in my experience they aren't particularly useful for evaluating value in complex, legacy projects with tightly bounded requirements that can't be easily reduced to a page or two of prose for a prompt.

To be fair, you're not supposed to be doing the "one shot" thing with LLMs in a mature codebase.

You have to supply it with the right context in a well-formed prompt, get a plan, then execute and do some cleanup.

LLMs are only as good as the engineers using them, you need to master the tool first before you can be productive with it.

I’m well aware; as I said, I am regularly using CC/Codex/OC in a variety of projects, and I certainly didn’t claim that they can’t be used productively in a large code base.

But different challenges become apparent that aren’t addressed by examples like this article, which tend to focus on narrow, greenfield applications that can be readily rebuilt in one shot.

I already get plenty of value in small side projects that Claude can create in minutes. And while extremely cool, these examples aren’t the kind of “step change” improvement I’d like to see in the area where agentic tools are currently weakest in my daily usage.

I would be much more impressed with implementing new, long-requested features in existing software (in projects that are open to maintaining LLM-generated code later).
Fully agreed! That’s the exact kind of thing I was hoping to find when I read the article title, but unfortunately it was really just another “normal AI agent experience” I’ve seen (and built) many examples of before.
Adding capacity to software engineering through LLMs is like adding lanes to a highway — all the new capacity will be utilized.

By getting the LLM to keep changes minimal I’m able to keep quality high while increasing velocity to the point where productivity is limited by my review bandwidth.

I do not fear competition from junior engineers or non-technical people wielding poorly-guided LLMs for sustained development. Nor for prototyping or one offs, for that matter — I’m confident about knowing what to ask for from the LLM and how to ask.

This is relatively easily fixed by increasing test coverage to near 100% and lifting critical components into model checker space; both approaches were prohibitively expensive before November. They’ll be accepted best practices by the summer.
No, that has certainly been my experience too, but what is going to be the forcing function for a company to go back to hiring after it decides it needs fewer engineers?
Why not have the LLM rewrite the entire codebase?
In ~25 years or so of dealing with large, existing codebases, I've seen time and time again that there's a ton of business value and domain knowledge locked up inside all of that "messy" code. Weird edge cases that weren't well covered in the design, defensive checks and data validations, bolted-on extensions and integrations, etc., etc.

"Just rewrite it" is usually -- not always, but _usually_ -- a sure path to a long, painful migration that usually ends up not quite reproducing the old features/capabilities and adding new bugs and edge cases along the way.

Classic Joel Spolsky:

https://www.joelonsoftware.com/2000/04/06/things-you-should-...

> the single worst strategic mistake that any software company can make:

> rewrite the code from scratch.

Steve Yegge talks about this exact post a lot - how it stayed correct advice for over 25 years - up until October 2025.
Time will tell. I’d bet on Spolsky, because of Hyrum’s Law.

https://www.hyrumslaw.com/

> With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.

An LLM rewriting a codebase from scratch is only as good as the spec. If “all observable behaviors” are fair game, the LLM is not going to know which of those behaviors are important.

Furthermore, Spolsky talks about how to do incremental rewrites of legacy code in his post. I’ve done many of these and I expect LLMs will make the next one much easier.
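
To make the Hyrum's Law point concrete, here is a minimal sketch of a characterization ("golden master") check: replay recorded inputs through the old and new implementations and flag any difference in observable behavior, whether or not the spec mentions it. Every name below (legacy_active_users, rewritten_active_users, behavior_diffs, the sample data) is hypothetical and made up for illustration; it is one possible way to guard an incremental rewrite, not a description of anyone's actual setup.

```python
# Hypothetical example: none of these functions come from a real codebase.

def legacy_active_users(users):
    # Legacy code walks the input list in order, so the output happens to
    # preserve the caller's ordering. No written spec promises this.
    return [u["name"] for u in users if u["active"]]


def rewritten_active_users(users):
    # A rewrite that satisfies the written spec ("return the names of
    # active users") but returns them sorted alphabetically.
    return sorted(u["name"] for u in users if u["active"])


def behavior_diffs(old, new, recorded_inputs):
    """Replay recorded inputs through both versions and collect mismatches."""
    diffs = []
    for sample in recorded_inputs:
        expected, actual = old(sample), new(sample)
        if expected != actual:
            diffs.append((sample, expected, actual))
    return diffs


# Assumed stand-in for inputs captured from production traffic.
recorded_inputs = [
    [{"name": "zoe", "active": True}, {"name": "adam", "active": True}],
    [{"name": "bob", "active": False}, {"name": "cy", "active": True}],
]

diffs = behavior_diffs(legacy_active_users, rewritten_active_users, recorded_inputs)
for _sample, expected, actual in diffs:
    print(f"behavior changed: legacy={expected} rewrite={actual}")
```

Both versions meet the stated requirement, yet the first recorded input exposes an ordering difference that some downstream caller may depend on. The check can only catch behaviors that show up in the recorded inputs, so its value depends on how representative that recording is.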

>An LLM rewriting a codebase from scratch is only as good as the spec. If “all observable behaviors” are fair game, the LLM is not going to know which of those behaviors are important.

I've been using LLMs to write docs and specs and they are very very good at it.

When an LLM can rewrite it in 24 hours and fill in the missing parts in minutes, that argument is hard to defend.

I can vibe code what a dev shop would charge 500k to build and I can solo it in 1-2 weeks. This is the reality today. The code will pass quality checks. The code doesn’t need to be perfect, it doesn’t need to be clever, it needs to be.

It’s not difficult to see this, right? If an LLM can write English, it can write Chinese or Python.

Then it can run itself, review itself and fix itself.

The cat is out of the bag. What it will do to the economy… I don’t see anything positive for regular people. “Write some code” has turned into “prompt some LLM.” My phone can outplay the best chess player in the world; are you telling me you think that whatever unbound model Anthropic has sitting in their data center can’t out-code you?

Well, where is your competitor to mainstream software products?
What mainstream software product do I use on a day to day basis besides Claude?

The ones that continue to survive are all built around a platform of services: MSO, Adobe, etc.

Most enterprise product offerings, platform solutions, proprietary data access, proprietary / well accepted implementation. But let's not confuse that with the ability to clone it; it doesn't seem far-fetched to get 10 people together and vibe out a full Slack replacement in a few weeks.

If the LLM just wrote the whole thing last week, surely it can write it again.
If an LLM wrote the whole project last week and it already requires a full rewrite, what makes you think that the quality of that rewrite will be significantly higher, and that it will address all of the issues? Sure, it's all probabilistic so there's probably a nonzero chance for it to stumble into something where all the moving parts are moving correctly, but to me it feels like with our current tech, these odds continue shrinking as you toss on more requirements and features, like any mature project. It's like really early LLMs where if they just couldn't parse what you wanted, past a certain point you could've regenerated the output a million times and nothing would change.
* With a slightly different set of assumptions, which may or may not matter. UAT is cheap.

And data migration is lossy, because nobody cares about data fidelity anyway.

Broken though