by qarl 1 day ago

It's a shame you're misrepresenting what is actually going on.

In another comment here I explained that I have run a test: asking Claude Code to add a substantial feature to 270 different C programs.

Despite your beliefs - it went extremely well.

2 comments

qustio 1 day ago

Huh, are you confusing me with someone else? I don't doubt Claude Code did that, I do the same for refactors all the time.

But xscreensaver theme tweaks for personal use have a much lower standard for quality control, regression testing, side effects, etc than a kernel used by billions of devices with thousands of interconnected drivers and subsystems.

Not to mention the coordination problem to get every maintainer on board and patches approved for each specific area when working on a project of that scale, even for a relatively narrow change.

Claude Code doesn't really help with that, so I don't see why the expectation would be a significant speedup (and doing it all in a single patch would definitely be rejected).

qarl 1 day ago

Yes, I understand the difference in rigor.

I refuse to believe the six year delay here was getting people to test a patch.

Which, actually, Claude Code will also do quite well.

qustio 1 day ago

Not sure why you'd refuse to believe that when a single, simple patch in Linux can take months to make it into a kernel release. Here we're looking at 300 patches scattered throughout a kernel with millions of LoC. That's going to translate to a lot of mailing list back and forth even if every change was accepted on the first try without a fuss.

qarl 1 day ago

The lag there is not due to the review time. How many maintainers were involved? 300? Because I'm still finding it hard to understand how the work of 300 people handling 300 commits cannot be parallelized into months (per your own stat).

qustio 1 day ago

To be clear my original statement was that the bottleneck was most likely not mechanical code changes (where CC would have the most direct speedup) but everything else involved in the process (testing, discussion/approval, inclination towards caution, deliberately narrowly scoped changes, etc).

Not that the Linux kernel approval procedures couldn't be streamlined, work couldn't be parallelized, or anything else like that, which would be a different discussion entirely.

You stated that Claude Code could have significantly sped up the process, so the burden of evidence here should be on showing how specifically these patches would have benefited, or how much time would have been saved, from using LLMs. Hand-wavingly saying "LLMs = faster" is too vague and broad a claim to make without providing any evidence (and it is also unfalsifiable).

qarl 1 day ago

Right.

And what I'm saying is I refuse to believe the Linux kernel approval procedures are that inefficient. Therefore, your belief "bottleneck was most likely not mechanical code changes" is most likely incorrect.

It would be interesting to get the actual answer to this question.

EDIT: Substantially changing your argument after posting isn't nice. But to answer your charge - no - I never made that claim.

lelanthran 1 day ago

> In another comment here I explained that I have run a test: asking Claude Code to add a substantial feature to 270 different C programs.

That's a different scenario, though.

Would Claude have performed adequately if it had to add a specific feature to 270 programs buried in a set of 270m programs, each of which may or may not have a dependency on one or more of the others, with a virtually unbounded set of results to test?

In terms of tokens alone, that would have been cost-prohibitive. But let's assume that you had the money to do this: it still might not even be possible.

You're confusing "I have these 270 independent programs and want to make this change to all of them" with "I have these 270m lines of code, of which only 270 need to be changed".

qarl2 1 day ago

HackerNews is now censoring my replies. I did the math - all of these patches would have cost around $100.

Let's see if they'll let this account through.

lelanthran 23 hours ago

It's like you are not even reading what is being said to you. You can't find the downstream effects using grep!

You can find the "strncpy"s with grep, but you cannot find all the downstream effects of those changes, especially if something downstream is relying on the broken behaviour!
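
To make that concrete, here is a minimal sketch of the kind of thing I mean. Everything in it is hypothetical (NAME_LEN, set_name_old, set_name_new, same_name and names_match_after are made-up names, not kernel code), and snprintf just stands in for whatever terminating copy replaces strncpy. The point is only that strncpy zero-fills the rest of the buffer, and a caller far away can quietly depend on that:

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN 8 /* hypothetical fixed-size name field */

/* Old-style copy: strncpy zero-fills the remainder of dst, but leaves it
 * unterminated when src has NAME_LEN or more characters. */
static void set_name_old(char dst[NAME_LEN], const char *src)
{
    strncpy(dst, src, NAME_LEN);
}

/* "Fixed" copy: always NUL-terminated, but the bytes after the terminator
 * keep whatever was in dst before the call. */
static void set_name_new(char dst[NAME_LEN], const char *src)
{
    snprintf(dst, NAME_LEN, "%s", src);
}

/* Hypothetical downstream code: compares the whole fixed-size field,
 * silently relying on the zero-fill done by strncpy. */
static int same_name(const char a[NAME_LEN], const char b[NAME_LEN])
{
    return memcmp(a, b, NAME_LEN) == 0;
}

/* Fill two buffers with different stale bytes, store the same name in both
 * with the given copy function, and report whether they compare equal. */
static int names_match_after(void (*set_name)(char *, const char *))
{
    char a[NAME_LEN], b[NAME_LEN];

    memset(a, 'A', NAME_LEN); /* stale contents of buffer a */
    memset(b, 'B', NAME_LEN); /* different stale contents of buffer b */
    set_name(a, "eth0");
    set_name(b, "eth0");
    return same_name(a, b);
}
```

With set_name_old the two buffers compare equal; with set_name_new they do not, because the stale bytes after the terminator differ. Grep finds the strncpy call site, but nothing about same_name mentions strncpy at all.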

qarl2 23 hours ago

Right. I am not claiming Claude Code creates perfect software. I am refuting your claim that using it would be cost prohibitive.

I took the 10 most difficult patches from the git history - the ones that took the most back-and-forth to fix. I asked Claude to write them. Would you like to see the work?

If you believe a human performs better at finding downstream effects - you need to prove that. I see no reason why it should be true.

lelanthran 23 hours ago

> If you believe a human performs better at finding downstream effects

Once again, you are not reading what is being said - no one made that claim!

No claim was made in fact: it was a refutation. Specifically, the refutation is "this is why it took so many years".

qarl2 23 hours ago

> no one made that claim!

You did not literally make that claim but your cost argument hinges on it.

Without it, Claude does about the same as a human and only costs $100.

Apparently I'm reading your comments more thoroughly than you are.
