by phdstudent 1011 days ago
> Most maintainers aren't hugely proficient in Rust and absolutely do not have the time to learn.

Can LLMs help here with code (re)writing?

4 comments

The proficiency required to make sure the LLM is not screwing up the code in some subtle way is probably even higher than the proficiency required to rewrite the code yourself. So probably not.
Basically, you have to review code generated by an LLM the same way that you would review code written by a human programmer, which requires (at least) the same level of proficiency as writing code yourself - if not, as you wrote, higher proficiency.
I'd argue you have to review code generated by an LLM to an even greater degree than that generated by a human programmer because you can to some degree presume that the human writer intends the code to be functional. Whereas, with an LLM, it might just spit out some garbage that it doesn't even intend to be compilable. In my experience, LLM-written code doesn't pass a threshold where I believe it's ready to force another human being to review it (via pull request review). The LLM generates what it generates, I rewrite what it generated, and then I create the pull request. If I can't understand what the LLM generated, as might be the case where the human programmer is not fluent in the language, then having the LLM generate code was a pointless step anyway.
Code written in Rust for LLM would likely be easier to review than C code due to its strong type system and the inclusion of the borrow checker. However, I anticipate there might be more 'unsafe' code, especially at the language boundaries, compared to typical Rust application code. Thus, the benefits might be somewhat diminished.
The whole point of being in kernel space is to write unsafe code -- either unsafe in a memory sense or in a "I am just blasting characters at this pcie port" sense.
Yes and no. It's true that there's some portion of the kernel space code that can't be written provably safely. But there's good reason to believe that a lot of it can and that belief is what's driving this integration.

Whether Rust in the kernel succeeds or not will likely be determined by whether or not a sufficiently clear boundary can be drawn between the bit that must be unsafe (in the Rust sense) and the rest. And how much code is in the latter. I don't think we know the answer yet but some knowledgeable people are willing to run the experiment on the basis that the probability seems quite high that a safe subset can be determined.
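
To make the "boundary" idea concrete, here is a minimal sketch of the general pattern, not of how the kernel's Rust code is actually written. `RegisterBlock` and `OutOfRange` are hypothetical names invented for illustration, and a plain array stands in for memory-mapped hardware. The `unsafe` part is confined to two small blocks, and everything that calls `read` or `write` stays in safe Rust.

```rust
use std::ptr;

/// Hypothetical error type: the requested register index does not exist.
#[derive(Debug, PartialEq)]
pub struct OutOfRange(pub usize);

/// Hypothetical stand-in for a block of device registers.
/// A real driver would point at memory-mapped hardware; this sketch
/// uses an ordinary array so it can run anywhere.
pub struct RegisterBlock {
    regs: [u32; 4],
}

impl RegisterBlock {
    pub fn new() -> Self {
        Self { regs: [0; 4] }
    }

    /// Safe API: the bounds check happens here, so no caller can
    /// trigger an out-of-bounds access through this function.
    pub fn write(&mut self, index: usize, value: u32) -> Result<(), OutOfRange> {
        if index >= self.regs.len() {
            return Err(OutOfRange(index));
        }
        // SAFETY: `index` was checked above, and the pointer comes from
        // an exclusive borrow of `self` that is live for this call.
        unsafe { ptr::write_volatile(self.regs.as_mut_ptr().add(index), value) };
        Ok(())
    }

    /// Safe API: same bounds check as `write`, then a volatile read.
    pub fn read(&self, index: usize) -> Result<u32, OutOfRange> {
        if index >= self.regs.len() {
            return Err(OutOfRange(index));
        }
        // SAFETY: `index` was checked above, and the pointer comes from
        // a shared borrow of `self` that is live for this call.
        Ok(unsafe { ptr::read_volatile(self.regs.as_ptr().add(index)) })
    }
}
```

A reviewer only has to audit the two `unsafe` blocks and the checks guarding them; a caller doing `block.write(9, 1)` gets an `Err` rather than undefined behavior. Whether enough kernel code fits this shape is exactly the open question.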

Sure but if everything you're doing can be done from userspace, then you should probably be in userspace.

In the LLM case mentioned, having to hop a syscall every few teraflops is probably not a compelling reason to live in kernel space.

LLMs rewriting what? A) Rust modules to C so that it can be reviewed and merged by old school kernel hackers? B) C to unsafe Rust so that you lose any advantage of using Rust and get the worst of both worlds? C) Or rewrite the kernel maintainers' brains while they sleep so they become magically proficient in Rust when they wake up?

If you're talking about option A) or B), we already have things like mrustc and c2rust. These are tough problems, LLMs aren't _that_ smart yet.

Others have spoken to the technical merits but I can say straight up that the community would absolutely despise this idea, even if the generated code was excellent. LLM generated code would go over about as well as implementing a blockchain in the kernel.
I’m just a layman when it comes to these kinds of things. But here’s my take anyway: AFAICT, the issue isn’t so much about rewriting the code. However, if we were to use automation for that, we would probably need proofs that the new code is better than the old one. And creating such proofs seems incredibly hard as soon as we walk out of “hello world” territory. People have spent years trying to build such systems, but they’re still not widely used. So until then, we need people who read and analyze the code, and who also have vast knowledge of the kernel’s inner workings, and today that seems to be the bottleneck. Those people don’t really grow on trees.