Hacker News
by catgary 76 days ago
And even then - I still read the code it generates, and if I see a better way of doing something I just step in, write a partial solution, and then sketch out how the complete solution should work.

Unless the solution is going to be more secure, faster, more stable etc, why does it matter?

Will the end user care? “Does it make the beer taste better”?

in a word, maintainability

> maintainability is inversely proportional to the amount of time it takes a developer to make a change and the risk that change will break something

https://softwareengineering.stackexchange.com/a/134863

i could be wrong, but i'm pretty sure that end-users get upset when a change takes a long time or it ends up breaking something for them.

just because people are finding that agents or whatever are speeding changes up now doesn't necessarily mean they won't encounter a slow-down later when the codebase becomes an un-maintainable mess. technical debt is always a thing, even with machines doing the work (the agent/machine still has to parse a codebase to make changes).

What makes you think that AI couldn’t make the same changes without breaking it, whether you modify the code or not? And you do have automated unit tests, don’t you?

Right now I have a 5000 line monolithic vibe coded internal website that is at most going to be used by 3 people. It mixes Python, inline CSS and JavaScript with the API. I haven’t looked at a line of code. The IAM permissions for the Lambda runtime are limited (meaning the code can’t do anything that the permissions won’t allow it to). I used AWS Cognito for authorization, validated the security of the endpoints, and validated the permissions of the database user.

Neither Claude nor Codex have any issues adding pages, features and API endpoints without breaking changes.

By definition, coding agents are the worst they will ever be right now.

i have a rule of thumb based on past experience: circa 10k lines of code per developer involved, reducing as the codebase size increases.

> 5000 line

so that's currently half a developer according to my rule of thumb.
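The arithmetic behind that rule of thumb, as a rough sketch. The function name and the fixed 10k constant are assumptions for illustration only; the rule as stated says the per-developer figure shrinks as the codebase grows, which this deliberately ignores, so the larger results are optimistic.

```python
# Hypothetical helper illustrating the "circa 10k lines per developer" rule of thumb.
# Assumption: the 10k figure is treated as a constant, although the rule says it
# should reduce as the codebase grows.
LINES_PER_DEVELOPER = 10_000


def developers_needed(lines_of_code: int,
                      lines_per_developer: int = LINES_PER_DEVELOPER) -> float:
    """Return how many developers the rule of thumb implies for a codebase."""
    return lines_of_code / lines_per_developer


if __name__ == "__main__":
    # The sizes discussed in this thread: today, the tipping point, the thought experiment.
    for loc in (5_000, 20_000, 5_000_000):
        print(f"{loc:>9,} lines -> {developers_needed(loc):g} developers")
```

At 5,000 lines this gives the "half a developer" above; at 20,000 lines it is already two.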

what happens when that gets to 20,000 lines...? that's over the line in my experience for a human who was the person who wrote it. it takes longer to make changes. changes that are made increasingly go out in a more and more broken state. etc. etc. more and more tests have to be written for each change to try and stop it going out in a broken state. more work needs to be done for a feature with equal complexity compared to when we started, because now the rest of the codebase is what adds complexity to us making changes. etc. etc. and that gets worse the more we add.

these agent things have a tendency and propensity to add more code, rather than adding the most maintainable code. it's why people have to review and edit the majority of generated code features beyond CRUD webapp functionality (or similar boilerplate). so, given time and more features, 5k --> 10k --> 20k --> ... too much for a single human being if the agent tools are no longer available.

so let's take it to a bit of a hyperbolic conclusion ... what about agents and a 5,000,000 line codebase...? do you think these agents will take the same amount of time to make a change in a codebase of that size versus 5,000 lines? how much more expensive do you think it could get to run the agents at that size? how about increases in error rate when making changes? how many extra tests need to be added for each feature to ensure zero breakage?

do you see my point?

(fyi: the 5 million LoC is a thought experiment to get you to think critically about the problem of technical debt related to agents as codebase size increases, i'm not saying your website's code will get that big)

(also, sorry i basically wrote most of this over the 20 minutes or so since i first posted... my adhd is killing me today)

20K lines of code is well within the context window of any modern LLM. But just like no person tries to understand everything and keep the entire context in their brain, neither do modern LLMs.

Also documentation in the form of MD files becomes important to explain the why and the methodology.

Generally speaking, I try to ensure that the LLM is using core abstractions throughout the codebase in a consistent manner. This makes it easier for me to review any changes it makes.

Sort of a devil’s advocate question. If you write and review your tests, and the functional and non-functional requirements and the human usability tests all pass, why does the code matter?

Non-functional requirements: performance, security, reliability, logging etc?

Because the code is the actual thing. Tests can only show that the code fails in certain cases; they don’t actually prove the code is correct.

If you are writing the correct tests that mirror the requirements, why wouldn’t passing tests mean the code is correct?

Because it doesn’t? That’s why the field of formal methods exists.
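A small sketch of how a passing test suite can coexist with incorrect code. The function and tests here are made up for illustration and are not taken from anyone's codebase in this thread; the point is only that tests check the inputs someone thought of.

```python
def is_leap_year(year: int) -> bool:
    """Deliberately incomplete: ignores the century rule of the Gregorian calendar."""
    return year % 4 == 0


def test_is_leap_year() -> None:
    # Every case here "mirrors the requirements" as the test author understood them.
    assert is_leap_year(2024)
    assert not is_leap_year(2023)
    assert is_leap_year(2000)


if __name__ == "__main__":
    test_is_leap_year()        # passes without complaint
    print(is_leap_year(1900))  # True, yet 1900 was not a leap year
```

The suite is green and the code is still wrong for 1900. Adding that case would catch this bug, but it would not tell you which other inputs are still uncovered, which is the gap formal methods try to close.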