Hacker News
by ibejoeb 22 days ago
People are going to be using a lot less software if the selection criteria include not using agents.
4 comments

This is a very uncharitable interpretation of the twitter post: "It’s a combination of anthropic’s stance of not doing human reviews or any kind of rational roll out and stabilization."

They mention nothing about agents being used; rather, they focus on humans in the review cycle and some sort of gated roll-out process. Why we would bin these practices in the name of a faster release cycle is an important question and debate.

I kind of agree, but it goes both ways. Has Jarred said that there was no review? I know that he stated that the Rust Bun passes tests. Now, I don't know the amount or quality of coverage, but as a thought experiment, let's assume the tests are good. What does that count for?
I think most people believe it unlikely that one million lines of code can be reviewed in one week, and the fact that tests pass does not imply good code.

I have no idea whether the new or old code is/was good, just pointing out what seems like a plausible thought process for people who object to this rewrite.

I think it is interesting, using your framing, to consider why people may or may not believe that one million lines of code could be reviewed.

I mean, until very recently, the idea that one million lines of code could be written (rather than mechanically translated) in a month was unbelievable.

It is clearly the case that times have changed since the tools have been updated. So if we challenge one assumption, why not also challenge the other?

Bun presumably will have access to Mythos, which is purportedly reviewing million-line codebases (Mozilla, etc.) and uncovering real value for the devs of those projects.

I find it hard to argue against extrapolating these trends to this Bun rewrite.

> I mean, until very recently, the idea that one million lines of code could be written (rather than mechanically translated) in a month was unbelievable.

It is still unbelievable, because it still has not happened in this case. The agent wrote it. Nobody thinks it's unbelievable that an LLM can generate a million lines of code in a month. You either do not understand what the detractors are saying or are arguing in bad faith.

Perhaps it will happen, but I have yet to see good results from AI code review (it can be useful as an additional review, but not yet as the sole source of review).
Yes, he has said multiple times, including to me yesterday, that humans won’t do code review as a matter of practice going forward.
Wow, that's wild. Is that just Bun, or is that the general practice at Anthropic now?
No, he's stated the opposite, e.g. https://x.com/jarredsumner/status/2058283214981251080?s=46

But AFAICT he's never suggested they reviewed all the code, and that they didn't seems like a pretty safe assumption given the volume and timeline.

I personally think the test suite passing counts for something, and I would bet they also set up some pretty intense LLM-powered verification loops and quality gates (which I hope the forthcoming blog post will detail). I've seen mechanical LLM ports that went extremely well, though at nowhere near this scale, which meant we could review the code, and that is how I know they went well.

I think the most hysterical reactions that we are seeing from some people are premature, knee-jerk responses. We're gonna _find out_ if the Rust version really is better than the Zig version, and soon.

And even if it is better overall, I think if there is an AI-slop-induced major bug we are definitely gonna know that, too, because we have a highly motivated community of folks ready to tweet the shit out of it the instant it is found.

So even as a pretty heavy daily user of Bun, I'm actually really glad they did this. The value of the public experiment is high, and if new Bun sucks, well, I still have Deno.

Yes, because as we know from history, without agents there is no internet or technology or anything.
What do you mean?

I'm saying that AI is going to develop software from here on. I don't think you can expect that a human is going to review every line of code. Not that it's good, but that's just how it is. It's not so different from manufacturing. A human is not reviewing every weld. I see a lot of sloppy beads, but in a lot of cases, it's good enough.

I'm saying that's self-evidently ludicrous. Software is not like welding. Do you think Notch could have become rich and famous by welding? How about Bill Gates, famous as a really consistent welder?
> A human is not reviewing every weld.

On civil engineering projects, I’m pretty sure a human reviews each weld. For mass-produced things, maybe not, although a company would not look good in a lawsuit if they had inadequate inspection procedures which allowed a fault causing injury or death to occur.

> On civil engineering projects, I’m pretty sure a human reviews each weld.

Nope. It’s sampled.

Yeah, because they are not autoregressively generated!
There's no way that AI develops software from now on. It isn't remotely good enough for that, nor has it really gotten better in the past few years. We're going to see a push to use AI, then a move away from it once the dreadful quality of AI slop becomes too obvious to ignore.
It hasn't gotten better in the past few years? Come on...
In some ways it remains exactly the same technology with the same critical weaknesses.
"People are going to be eating a lot fewer foods if the selection criteria include not being ultra-processed unhealthy crap".
There was enough software that powered the Internet before 2023. We don't need laundered slop from criminals.