by nirui 7 days ago
> Speaks volumes to the strengths of the language

Memory safety is just a tiny part of overall security. If an LLM can transcode correctly, then it should also output 100% correct C code.

On the other hand, if an LLM cannot transcode correctly, then using Rust may just make the bug soundless, because the language runtime/code-gen "avoided" the usual punishments that might make the bug (and bug report) obvious.

6 comments

> Memory safety is just a tiny part of overall security.

No, it's a pretty massive part with disproportionate severity.

> If an LLM can transcode correctly, then it should also output 100% correct C code.

Translating code seems to largely rely on having a strong suite of existing tests, not on ability to code correctly.

It's unclear whether LLMs are great at writing safe C code; it's much clearer that they can meet targets that come with external feedback, like "test passes/fails".

> On the other hand, if an LLM cannot transcode correctly, then using Rust may just make the bug soundless, because the language runtime/code-gen "avoided" the usual punishments that might make the bug (and bug report) obvious.

This is very unclear to me.

> Memory safety is just a tiny part of overall security.

70%[1][2] is tiny?

[1]: https://www.zdnet.com/article/microsoft-70-percent-of-all-se...
[2]: https://www.chromium.org/Home/chromium-security/memory-safet...

> On the other hand, if an LLM cannot transcode correctly, then using Rust may just make the bug soundless, because the language runtime/code-gen "avoided" the usual punishments that might make the bug (and bug report) obvious.

Isn't it the other way around? Rust guarantees lack of undefined behavior in safe code. If you have undefined behavior in your code, your bug might become a heisenbug, or make the rest of your program behave weirdly, or the bug might simply be dormant until a very specific situation occurs (i.e. be "soundless", as you say).

If you're going to automatically translate your code from one language to another then a memory-safe target language (whether it's Rust, Java, C# or something else) is the only sane, reasonable choice. And if you want C or C++-like performance (i.e. you want to maximize performance) then you're pretty much left with Rust on the table.
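
To make the "defined behavior" point concrete, here is a minimal sketch. The function names `read_checked` and `read_indexed` are made up for illustration. In safe Rust an out-of-range read either yields `None` or panics at the faulty line, every time; the equivalent C read past the end of a buffer is undefined behavior and may appear to work.

```rust
/// Hypothetical helper: read element `i` of a buffer, as a translated
/// C routine might. `get` does the bounds check and returns None
/// instead of reading past the end.
fn read_checked(buf: &[u8], i: usize) -> Option<u8> {
    buf.get(i).copied()
}

/// Same read using plain indexing. An out-of-range index panics
/// deterministically instead of returning whatever is in memory.
fn read_indexed(buf: &[u8], i: usize) -> u8 {
    buf[i]
}
```

Calling `read_checked(&[10, 20, 30], 3)` gives `None`, and `read_indexed` with the same arguments panics with an index-out-of-bounds message rather than continuing with garbage.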

> If you have undefined behavior in your code your bug might become a heisenbug

Or the OS might kick in and raise a segmentation fault, often with some diagnostic information attached.

Again, if an LLM can output 100% correct code, no bug of any kind should exist. Seeing a segfault invalidates that assumption completely and definitively. That's the point.

> Rust guarantees lack of undefined behavior in safe code

And that doesn't guarantee your code is heisenbug-free; it just means your heisenbug was fully checked by the Rust compiler and is now managed by the language runtime/facilities.

So now, instead of a crashed program and a "severe" DoS vulnerability, you get a disconnected user every time they trigger the bug. The user might assume it's the network, and so does your logging stack. After a few times, the user starts bitching about your stupid network and leaves for your competitor's product, while you're busy trying to figure out why the network suddenly isn't as good as it used to be.
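
A rough sketch of that scenario, assuming a thread-per-connection server (`handle_request` and `serve_one` are hypothetical names, not from any real codebase). The handler has a plain logic bug; the panic is contained in the worker thread, so the process keeps running and the caller only sees a dropped connection unless the error is logged.

```rust
use std::thread;

/// Hypothetical per-connection handler with an off-by-one bug.
fn handle_request(payload: &[u8]) -> u8 {
    // Bug: should be `payload.len() - 1`; this index is always out of bounds.
    payload[payload.len()]
}

/// Runs the handler on its own thread. Returns None when the handler
/// panicked, i.e. the "connection" is simply dropped.
fn serve_one(payload: Vec<u8>) -> Option<u8> {
    let worker = thread::spawn(move || handle_request(&payload));
    // The panic stays inside the worker thread. Discarding the Err here
    // (instead of logging it) is what makes the bug look like a network drop.
    worker.join().ok()
}
```

With this code, `serve_one(vec![1, 2, 3])` returns `None` on every call. The bug is deterministic and memory-safe, but whether anyone notices depends on what is done with the `Err` from `join`.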

> 70%[1][2] is tiny?

It really depends on how they define what counts as a "vulnerability", or in Chrome's case, "'high severity' security bugs", which is very specific. Microsoft probably has a lot of decades-old code, written before better checkers existed, that contributed to the problem.

> Memory safety is just a tiny part of overall security

Sure, in the same way that foundations are just a tiny part of house construction.

Foundations come first. It's great if you've got more on top (I recommend it, actually), but if you don't have the foundations it's not even worth discussing what else you built; it's all worthless.

But current LLMs have a limited context window, so you can't fit your whole source code into the context. That's why compilers guide LLMs when they are producing code, and that's where the Rust compiler shines: it has very good diagnostics that help fix issues within a few iterations.

So while LLMs are good at writing walls of code, they do not produce good code, just good enough, and sometimes it is wrong (this is where Rust can help a bit by checking that the program is sound, but for the most part you should also validate the logic).

The dream language for LLMs would be one with some way of proving that function inputs/outputs are what you expect (I think it's called proof theory, but it's not my area of expertise, so I could be wrong). You can kind of emulate this with new types[0].

[0] https://doc.rust-lang.org/rust-by-example/generics/new_types...
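
As a small illustration of the new type idea, here is a sketch with a made-up `Percent` type. This is not a proof system, just an invariant checked once at construction and then carried by the type, which holds for code outside the module that defines it, since the inner field is private.

```rust
/// Hypothetical new type: a percentage known to be within 0..=100.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Percent(u8);

impl Percent {
    /// The only public way to build a Percent; rejects out-of-range values.
    pub fn new(value: u8) -> Option<Percent> {
        if value <= 100 {
            Some(Percent(value))
        } else {
            None
        }
    }

    /// Returns the wrapped value, guaranteed to be at most 100.
    pub fn get(self) -> u8 {
        self.0
    }
}

/// Takes a Percent, so no range check is needed here: the subtraction
/// cannot underflow because the input is at most 100.
pub fn remaining(p: Percent) -> Percent {
    Percent(100 - p.get())
}
```

`Percent::new(101)` is `None`, so a caller can never hand `remaining` an invalid value, and the compiler rejects passing a bare `u8` where a `Percent` is expected.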

> If an LLM can transcode correctly, then it should also output 100% correct C code.

Well, LLMs cannot transcode perfectly correctly, so the fact that Rust has lots of static checking is really important. Not just for memory safety - Rust helps with many other classes of bugs too.

> then using Rust may just make the bug soundless, because the language runtime/code-gen "avoided" the usual punishments that might make the bug (and bug report) obvious.

I think what you're saying here is that LLMs often cheat to solve the immediate error, e.g. by using `unsafe` where you really shouldn't, or by just making a test not test anything. That's definitely possible.

> If an LLM can transcode correctly, then it should also output 100% correct C code

An LLM can't (currently) transcode correctly in a vacuum. It needs tight guardrails to keep it on the straight and narrow (such as an existing conformance test suite that is extremely comprehensive).

The value of transcoding to Rust specifically is that the compiler gives you a pretty substantial set of guardrails "for free" - in a C port, your conformance test suite would also need to test every aspect of memory safety and fearless concurrency...
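
A minimal sketch of what that guardrail looks like for concurrency (the function `parallel_count` is a made-up example). Sharing a plain mutable counter across the spawned threads would not compile; the compiler pushes you to something like `Arc<Mutex<_>>`, whereas the unsynchronized C equivalent compiles and is a data race that only a test could catch.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical example: several threads increment one shared counter.
fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    // Arc gives shared ownership across threads; Mutex makes mutation exclusive.
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for every worker before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}
```

Because every increment happens under the lock, `parallel_count(4, 1000)` always returns 4000, with no race detector or extra test needed to establish that.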