Hacker News new | ask | show | jobs
by saghm 34 days ago
[flagged]
4 comments

> "No way to prevent this", Says Only Language Where This Regularly Happens

   clang -fbounds-safety ...
also see lib0xc etc.: https://news.ycombinator.com/item?id=47978834
NOTE: This is a design document and the feature is not available for users yet.

https://clang.llvm.org/docs/BoundsSafety.html

It has been available in Apple's version[1] for several years, and it appears to be migrating into upstream as well.

On macOS you can try it with:

    clang -Xclang -fbounds-safety program.c
Microsoft also seems to be using it (see above link regarding lib0xc).

[1] https://github.com/swiftlang/llvm-project

And you see a lot of other languages being used to create operating systems with complicated multiprocessor and locking semantics?
It's been almost half as long since the operating system under discussion as it has been since the creation of the language under the discussion, and there haven't really been any new mainstream operating systems created since then. I don't think it's nearly as obvious as you're implying that if there were a new operating system created today that C would be a good choice for it. If we're talking about non-mainstream OS's, then I'd argue there's already more than enough evidence that safer languages than C are more than capable of it[1]

[1]: https://hubris.oxide.computer/

Obviously the way to prevent this is by bounds checking, which is literally in the `770594e` patch. It's just a bug and they happen routinely in all languages. Since this is doing pointer arithmetic, it could just as easily happen in unsafe Rust, for example.
Like they said, "no way to prevent this" (kind of bug from happening again).
Static analysis and other tools can find this, but they're expensive; wonder what the kernel team has access to?
If static analysis could actually find these issues with a reasonable false positive rate, the companies behind them would be running them on Linux to get the publicity of having found the issues like all the AI companies are doing now. Imo the good static analysis heuristics are already built into compilers or in open source linters.
The cheap, low-hanging "fruit" lint rules have been added to today's C/C++ compilers. But these rules can be fragile, depending on what level the static analysis scan occurs - source-code-level-textual pattern matching or use of an AST/parse tree.

Possible problems within a function should be discoverable.

This particular bug would be hard to discover for a typical linter unless they knew/remembered that there are two execution paths for cleanup of a given element.

If not static analysis what would ai tools be considered? They're operating off the same source code

Also nice the onion reference by op.

It's a reference to Xe Iaso's blog (e.g. https://xeiaso.net/shitposts/no-way-to-prevent-this/CVE-2025...), which is itself a reference to The Onion.
It's possible I had seen that blog post and not remembered! I was intending to reference the Onion though (and even googled to make sure I had the wording right), but seeing someone else make the same joke and forgetting is certainly something I would do
"static analysis" is usually deterministic rules you can e.g. put in CI. AI is also somewhat dynamic in that it can execute commands to try stuff out. The best AI vuln finding harnesses work that way, by essentially putting the AI inside of a fuzzer-like environment and telling it to produce a crash.
Coverity scans several open source projects for free. see https://scan.coverity.com/faq and https://scan.coverity.com/projects

see https://scan.coverity.com/projects/linux for the linux-specific scan results - you need to create an account to view the reported defects.

This past couple of weeks isn't a good look for them with the releases of defects found in Linux and Firefox.

Linus himself wrote a static analyzer. https://en.wikipedia.org/wiki/Sparse

There are other free ones, I don't know if they're run as a matter of course.

Technically, the kernel team is sufficiently competent to design and build bespoke tools for themselves. It‘s probably a question of risk assessment and priorities.
sure, but with unsafe Rust you have a very clear marking for the section of code that requires additional care and attention. it is also customary to include a "SAFETY" comment outlining why using unsafe is OK here
You actually kind of don't, I use like a zillion crates which have unsafe Rust in them and it's not like I'm sitting here reading every single line of their code. I like Rust for various reasons, but its memory safety is (imo) overstated, especially when doing low-level stuff.
Almost all rust (95%) is safe rust. You can opt out of array bounds checks with unsafe { array.get_unchecked(idx) } instead of just typing array[idx]. But I can't remember the last time I saw anyone actually do that in the wild. Its not common practice, even in most low level code.

Rust is bounds checked by default. C is not. Defaults matter because, without a convincing reason, most people program in the default way.

But one would have to explicitly choose to use unsafe Rust for this instead of ordinary safe Rust. And safe Rust has no particular difficulty writing to slots in an array or slice or vector specified by their index.
except nearly everyone uses unsafe rust
No they really don't. 95% of rust is safe rust[1].

Also unsafe rust doesn't remove bounds checks. arr[idx] is bounds checked in every context.

You can opt out of array bounds checking by writing unsafe { arr.get_unchecked(idx) } . But thats incredibly rare in practice.

[1] https://cs.stanford.edu/~aozdemir/blog/unsafe-rust-syntax/

> 95% of rust is safe rust.

Based on the raw number of assorted crates, which has no bearing on kernel code. The more relevant question is, can a performant, cross-architecture, kernel ring-buffer be written in safe Rust?

Hubris, an embedded RTOS-like used in production by Oxide, has ~4% unsafe code in the kernel last I checked. There’s a ring buffer implementation that has one unsafe, for unchecked indexing: https://github.com/oxidecomputer/hubris/blob/master/lib/ring... (this of course does not mean that it is the one ring buffer to rule them all, but it’s to demonstrate that yes, it is at least possible to have one with minimum unsafe.)

It’s always a way lower number than folks assume. Even in spaces that have higher than average usage.

I doubt it, but you can probably get pretty close.

This is something a lot of people misunderstand about unsafe rust. The safe / unsafe distinction isn't at the crate level. You don't say "this entire module opts out of safety checks". Unsafe is a granular thing. The unsafe keyword doesn't turn off the borrow checker. It just lets you dereference pointers (and do a few other tricks).

Systems code written in rust often has a few unsafe functions which interact with the actual hardware. But all the high level logic - which is usually most of the code by volume - can be written using safe, higher level abstractions.

"Can all of io_uring be written in safe rust?" - probably not, no. But could you write the vast majority of io_uring in safe rust? Almost certainly. This bug is a great example. In this case, the problematic function was this one:

    static void io_zcrx_return_niov_freelist(struct net_iov *niov)
    {
        struct io_zcrx_area *area = io_zcrx_iov_to_area(niov);

        spin_lock_bh(&area->freelist_lock);
        area->freelist[area->free_count++] = net_iov_idx(niov);
        spin_unlock_bh(&area->freelist_lock);
    }
At a glance, this function absolutely could have been written in safe rust. And even if it was unsafe, array lookups in rust are still bounds checked.
"unsafe Rust" is not a binary; you don't opt into it for every single line of code. Given that the entire premise behind the idea that using C instead of Rust is fine is that people should be able to pay close attention and not make mistakes like this, having the number of places you need to look be a tiny fraction of the overall code that's explicitly marked as unsafe is a massive difference from C where literally every line of the code could be hiding stuff like this.
> except nearly everyone uses unsafe rust

Really? Why? I've not used Rust outside of some fairly small efforts, but I've never found a reason to reach for unsafe. So why is "nearly everyone" else using it?

Let's say you want to call win32 (or Mac) OS functions, all of a sudden you're doing all kinds of wonky pointer stuff because that's how these operating systems have been architected. Doing unsafe stuff is pretty inevitable if you want to do anything non-hello-world-ish.
> Doing unsafe stuff is pretty inevitable if you want to do anything non-hello-world-ish.

So the vast majority of Rust projects involve writing at least one unsafe block? Is that really your claim?

Making use of win32 functions doesn't turn off bounds checking in your rust code.
A tiny fraction of programs need to use win32 or Mac OS functions beyond the standard library or other safe wrappers for said functions.
So what? Just because you used the keyword `unsafe` to call an unsafe API does not mean that you are going to use unsafe pointer access to write to a vector.
That's not prevention. That's remediation.
Surely nobody could create a better language in 50 years. Surely we can't fix these issues.