Hacker News
by qsera 27 days ago
I think it makes sense to stay away from large code bases built using LLMs until it is proven that it is possible to also maintain such code bases using LLMs or using reasonable human effort.

It's alarming how people instantly jump to conclusions that Bun is now "AI slop".

Bun has been almost entirely worked on by LLMs for ~6 months now, long before the Rust rewrite (source: https://x.com/jarredsumner/status/2054525268296118363). It already has been proven that LLMs can maintain such codebases.

Bun never was great in terms of stability. It has been vibe coded for 6 months, but the code was reviewed by a person.

> It already has been proven that LLMs can maintain such codebases.

Proven is a strong word. In my experience AI fails miserably at anything beyond junior-level tasks. We will see soon, once the new Bun goes into production.

> Bun never was great in terms of stability

It's very easy to throw shade like this on software if you've got a bugbear with it. I'm sure you can even come up with a bunch of these "stability" problems when challenged on it. I know I could, for basically any large piece of software that I've ever used.

But really, is Bun worse in this regard than any other similarly ambitious open source software within its first few years?

see that's fine with me if they want to take a year or two of human time and do the rewrite properly

this is a piece of software with no architecture, and whose owners have no regard or respect for architecture. I can virtually guarantee that on average every bug they fix will create one new bug, because that's what it's like to work on software with no intentional architecture

What are you talking about?? Bun in Rust is a port, almost exactly the same code base on a different syntax. The architecture did not change at all. Amazing how people comment without even knowing what they are talking about.
Zig and Rust are significantly different languages. If bun has a good architecture in zig (which I don't know if it does or not), that doesn't necessarily mean it had a good architecture for rust. A direct translation of zig code would probably result in pretty unusual rust code, and probably a lot more unsafe usage than if it had been originally written in rust.
Very amazing indeed. Here you are making bold assumptions about a huge pile of code not a single human being has ever read in any meaningful amount.
Nobody reviewed resulting code. Maybe all tests are empty and this is why they pass. Maybe tests were modified to pass because this is the only thing LLM could do to make them pass. Maybe it hallucinated something in the process. We have no idea.
> It already has been proven that LLMs can maintain such codebases.

Has it? Seems like bugs in Claude Code are getting out of hand. That project has been around a bit longer.

Is it that, or is it just that every software developer, enterprise, dev and non-dev alike has their eyes on Claude Code as the most popular software project ever? Software in general has tons of bugs. People need to understand scale here, and what this looks like in practice. They're doing an incredible job given the circumstances.
> Claude Code as the most popular software project ever

I don't think that's true? The likes of Chrome, Linux, curl, sqlite, etc, are much more widely used.

I'm not being literal. Revolutionary technology arrives on the scene, is extremely visible, changes a whole industry and frankly creates an entirely new economy. All eyes on Anthropic.

They don't get enough credit for being right in the middle of a revolution, yet still managing to ship something that largely works incredibly well, because this thing is a workhorse.

They don't get enough credit? Anthropic is making an insane amount of money. More popular software projects have made exponentially less and have received even less media attention/fame/whatever metric you might use to define success.
> It already has been proven that LLMs can maintain such codebases.

It hasn't. Those are two different scenarios. The first is individual PRs into an existing, majority human-authored and understood codebase, where the PRs are initiated and merged by humans even if the code is AI generated. The second is AI rewriting AI-written code that no human eye has seen. Bun took a conservative, file-by-file transliteration approach, so they still understand the data structures and architecture and will probably be okay, though.

Worked on by LLMs is fine, but the Rust PR proved no one is reviewing anymore. You cannot review 1M LOC in 5 days.
> Bun has been almost entirely worked on by LLMs for ~6 months now

So what you’re saying is that this boycott is 6 months overdue?

I think what they're saying is that all is well as long as they aren't told that LLMs are doing most of the work. Being in the know is the issue here IMO, as they would've continued using it without a word otherwise.
That explains the massive rise in segfaults since they got acquired.

It’s approaching being as buggy as claude code which I’ve had to stop using even though I have 6 months free of max because it just crashes and freezes all the time.

It's alarming how people are willing to overlook the obvious in-your-face sloppiness of the Bun rewrite. A million lines of code in 9 days, pushed to main branch, forced on the existing userbase irresponsibly.

Nobody understands the code, nor will they be able to maintain it without AI service as an external dependency. Give me a break, I'm not running that monstrosity on my machine. Everyone running production software should move away from Bun purely as a technical decision.

Do you use Claude Code on your machine? That seems mostly vibe coded.
1. I don't use Claude Code, no.

2. It's amazing that a CLI wrapper is as buggy as it is.

3. Nevertheless, it's useable, and maybe for a CLI that's enough. I don't want a JS runtime running production to be the same mess.

Claude Code isn’t a runtime that I use to execute my code with.
If you use it to write code for you, then it kind of is, indirectly.
That is quite the stretch you're making.
That seems comparable to taking a dev-time dependency, while Bun is a runtime dependency. They need to be treated very differently.
Fair point, wasn’t considering it from this angle.
6 months is plenty of time to keep ignoring serious tech debt. I don't think your conclusion follows at all from such a short time.
Yeah, bizarre and sad. And, unsurprisingly, Hacker News seems sympathetic. They are getting very old.
I have an idea on how to tell if a codebase is rotting under AI Agent maintenance. We can collect and analyze how the coding agent reads code during programming tasks, and see if the code access and token consumption are steadily increasing for similar development tasks. If the code readability doesn't degrade for the agent, the maintainability of the codebase should be fine.
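A rough sketch of what that check could look like in Python, for the sake of discussion. Everything here is an assumption made up for illustration: the log shape (`tokens` and `files_read` per task), the function names `slope` and `rot_signal`, and the 5% growth threshold. It just fits a straight line through the per-task numbers for a series of similar tasks and flags the codebase when both token use and files read trend upward.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against their position (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def relative_growth(values):
    """Slope per task, expressed as a fraction of the average value."""
    if not values:
        return 0.0
    avg = mean(values)
    return slope(values) / avg if avg else 0.0


def rot_signal(task_logs, threshold=0.05):
    """task_logs: chronological list of dicts for *similar* tasks, each with
    hypothetical keys 'tokens' (tokens consumed) and 'files_read'.
    The threshold is an arbitrary illustrative cutoff, not a calibrated value."""
    token_growth = relative_growth([t["tokens"] for t in task_logs])
    file_growth = relative_growth([t["files_read"] for t in task_logs])
    return {
        "token_growth": token_growth,
        "file_growth": file_growth,
        # Flag only when both code access and token consumption are rising.
        "rotting": token_growth > threshold and file_growth > threshold,
    }


if __name__ == "__main__":
    logs = [{"tokens": 1000 + 200 * i, "files_read": 5 + i} for i in range(4)]
    print(rot_signal(logs))
```

The hard part is not the arithmetic but deciding which tasks count as "similar"; without that, a rising trend could just mean the tasks got bigger.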
Most human-written codebases are unusable for LLM dev by that definition.
Turns out that if they're unusable by LLMs they're likely unusable by human devs. If you follow sane clean coding principles (like not having godclasses) it turns out coding agents (and humans!) can understand and navigate your codebase, especially if you use recognizable patterns, even with very light documentation.
One of these days you’ll learn about “enterprise” code
I have seen good enterprise code and bad enterprise code. Clean Code suggests progressive rewriting of bad code.

When you touch a file you have an opportunity for code clean up, add unit tests to ensure your changes break nothing, and refine the code.

Agree, agentic coding seems to have shifted the trade-off around over-engineering. I found that clean architecture is good practice for coding agents, so every task has a clean and limited context: only a few directly connected classes or interfaces are relevant to any local modification.
We judge long-term quality of human codebases (at least OS) by ongoing activity; for LLM codebases maybe a consistent or increasing level of activity is a bad smell?