by faisal_ksa 1383 days ago
I wish that we had Rust 30 years ago. Many of these problems would have been solved by the ownership system.
5 comments

This is not a standard memory leak, and would not have been avoided by using rust.

Edited and re-edited: I was too quick to presume commenter was just spouting the common “rust is a panacea” theme. Kernels are all about “unsafe” concurrent access and reentrant code, so rust is not a panacea. For this case of multi-threaded/multi-process access (presumably from ring-0 kernel code accessing shared kernel memory), using rust primitives to help prevent race conditions could make sense (smart pointers), because the code is unlikely to be performance sensitive and the feature is there to protect against a fairly extreme corner case (crazy ad hoc GC for cyclic graph of processes sending each other file descriptors). Reliable discussion on rust for kernel drivers here: https://security.googleblog.com/2021/04/rust-in-linux-kernel... Disclaimer: not a kernel nor rust dev. In past dabbled with embedded kernel debugging. I keep tweaking this edit, because it is complicated!

By my understanding, Rust's ownership model would prevent concurrent access to the socket buffer garbage collector data structures without proper synchronization, which was the source of this bug.

This is in fact an example of a class of bug that Rust's compiler is uniquely able to protect from - other memory safe languages don't make guarantees about concurrent accesses at all - at least not Java, C#, Go, Python, Haskell, OCaml etc. Perhaps Ada does have something?
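A minimal userspace sketch of that guarantee, assuming std's `Arc` and `Mutex` rather than anything kernel-specific (the function name `parallel_sum` is made up for illustration). Sharing a plain `&mut u64` between several threads, for example through `std::thread::scope`, is rejected by the borrow checker; wrapping the counter in a `Mutex` is what makes the code compile:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: each worker increments a shared counter `per_worker` times.
fn parallel_sum(workers: u64, per_worker: u64) -> u64 {
    // The counter can only be reached by locking the Mutex.
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    // Lock, increment, and release at the end of the statement.
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Bind the value first so the guard is dropped before `total` goes out of scope.
    let result = *total.lock().unwrap();
    result
}
```

This only shows that unsynchronized sharing is a compile error in safe Rust; it says nothing about whether the locking protocol itself is correct.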

Robocat is probably correct. Rust doesn't prevent race conditions, just data races. For example a Rust CVE due to a race condition: https://www.cybersecurity-help.cz/vdb/SB2022012101
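To make the distinction concrete, here is a small sketch (not taken from the CVE; the balance/withdraw scenario and the function name are invented for illustration). It is entirely safe Rust with atomics, so there is no data race, yet the check and the update are separate steps. A `Barrier` is used only to force the unlucky interleaving deterministically:

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

// Hypothetical demo: two threads each try to withdraw 80 from a balance of 100.
fn check_then_act_race() -> i64 {
    let balance = Arc::new(AtomicI64::new(100));
    let barrier = Arc::new(Barrier::new(2));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let balance = Arc::clone(&balance);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                // Step 1: check. Both threads observe 100 here.
                let enough = balance.load(Ordering::SeqCst) >= 80;
                // Force both checks to finish before either update runs.
                barrier.wait();
                // Step 2: act on a check that is now stale.
                if enough {
                    balance.fetch_sub(80, Ordering::SeqCst);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    balance.load(Ordering::SeqCst)
}
```

Every individual access is atomic and the compiler accepts the code, but the balance ends up negative. Doing the check and the update in one atomic step (for instance with `fetch_update`) would be one way to close the gap.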

This CVE appears to be due to a race condition despite using atomics, so likely this could've happened in Rust code. Really to implement this sort of GC I'd wager that unsafe rust would also be required unless an entirely different algorithm was used.

Also this is kernel code running in a kernel context, so the code can’t just use std::sync::{Arc, Mutex}[1] because Mutex uses user-space pthread_mutex_lock[2]. The implementation is gnarly™ multi-threaded kernel code so it would probably require unsafe custom code (rather than using kernel locking and concurrency-management mechanisms which were not used by the existing code for presumably valid reasons). The rust code could easily have had the same fault. Excerpts from [3]:

  The VFS layer is a complicated beast; it must manage the complexities of the filesystem namespace in a way that provides the highest possible performance while maintaining security and correctness. Achieving that requires making use of almost all of the locking and concurrency-management mechanisms that the kernel offers, plus a couple more implemented internally
  the kernel may find itself with a set of in-flight Unix-domain sockets that are only referenced by unconsumed (and unconsumable) SCM_RIGHTS datagrams; at this point, it has a cycle of file structures holding the only references to each other.
  there is more complexity than has been described above and some gnarly locking issues involved in carrying out these operations. See Viro's message for the gory details.
rust is magic, but it doesn’t make us infallible.

[1] https://itsallaboutthebit.com/arc-mutex/

[2] https://stackoverflow.com/questions/5095781/how-pthread-mute...

[3] https://lwn.net/Articles/779472/
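The cycle described in the excerpts from [3] has a rough userspace analogue in plain reference counting. The sketch below is an illustration only, not how the kernel represents sockets: `Socket`, `in_flight`, and `dropped_after_scope` are hypothetical names, and `Rc` stands in for the kernel's file reference counts. Once two objects hold the only references to each other, reference counting alone never frees them, which is why a separate collector is needed:

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Socket values have actually been destroyed.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a socket that may hold a reference to another socket.
struct Socket {
    in_flight: RefCell<Option<Rc<Socket>>>,
}

impl Drop for Socket {
    fn drop(&mut self) {
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

// Builds two sockets, optionally closes the cycle, and reports how many
// were freed once the local handles went out of scope.
fn dropped_after_scope(cyclic: bool) -> usize {
    let before = DROPPED.load(Ordering::SeqCst);
    {
        let a = Rc::new(Socket { in_flight: RefCell::new(None) });
        let b = Rc::new(Socket { in_flight: RefCell::new(Some(Rc::clone(&a))) });
        if cyclic {
            // Now a references b and b references a.
            *a.in_flight.borrow_mut() = Some(Rc::clone(&b));
        }
    }
    DROPPED.load(Ordering::SeqCst) - before
}
```

Without the cycle both sockets are freed when the scope ends; with it, each keeps the other's count at one and neither destructor runs.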

Today the Rust std::sync::Mutex type (on Linux) just uses a futex, not the unwieldy pthread_mutex_lock (which on Linux ultimately has a futex inside it anyway).

This is why Rust's Mutex<[u8; 16]> (a Mutex protecting an array of 16 bytes) is significantly smaller than C++ std::mutex (which doesn't protect anything itself). This was already true on Windows, and for a few months it has been true on Linux too.

But you're correct that Rust for Linux doesn't have std, and so doesn't have std::sync::Mutex. However it does have kernel::sync::Mutex which is reminiscent of the standard library Mutex, e.g. Mutex<T> is a thing, locking defaults to giving you a guard with access to the protected contents, and so on. But being the kernel it has unsafe methods that look way more dangerous than I'd be comfortable with. The Linux kernel already needs a mutex type (in C) and so kernel::sync::Mutex<T> builds on that.
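For readers who haven't seen the guard pattern, a short sketch using the standard library's `std::sync::Mutex` (not `kernel::sync::Mutex`, whose exact API is not shown here; `fill_and_read_last` is a made-up name). The protected array can only be touched through the guard returned by `lock()`:

```rust
use std::sync::Mutex;

// Hypothetical helper: overwrite the protected buffer and return its last byte.
fn fill_and_read_last(buf: &Mutex<[u8; 16]>, value: u8) -> u8 {
    // lock() returns a guard; the data is reachable only through it.
    let mut guard = buf.lock().unwrap();
    *guard = [value; 16];
    let last = (*guard)[15];
    // The lock is released here, when `guard` is dropped.
    last
}
```

There is no way to reach the array without going through `lock()`, unlike a C mutex that sits next to the data it is meant to protect.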

That's my guess too. These GC data structures need to be accessed from multiple threads, if I understood TFA, which means they won't compile normally in Rust. That is exactly Rust doing its job and preventing bugs, but it means that the developer then needs to use unsafe (or find a workaround with runtime checks, at the cost of overhead).
You're absolutely right, I had initially understood that recvmsg with the MSG_PEEK flag was concurrently accessing GC data structures and presumably corrupting them.

Instead yes, this is essentially a logic bug in the presence of concurrency, and no programming language can help with those. It would have happened just as well with Software Transactional Memory or with Erlang message passing.

> By my understanding, Rust's ownership model would prevent concurrent access to the socket buffer garbage collector data structures without proper synchronization

Possibly. But the first question is whether the person writing this in Rust would have used unsafe. Without knowing more details here, it's hard for me to guess.

> other memory safe languages don't make guarantees about concurrent accesses at all - at least not Java

Well, Java does have synchronized methods. Those lock the entire object (or the class, for static methods). You can imagine writing a "manager" class that encapsulates all the GC data structures here, and that would have made this perfectly safe in Java using existing language features.

Of course, that would have been slower - so, again, it is tempting to use unsafe approaches, even in a memory-safe language like Java, but then you do risk bugs like this.

But of course I do agree that Rust, even with some amount of unsafe, would be a far safer language than C!

The difference though would still be that, if they don't use unsafe or proper synchronization in Rust, their code won't compile. In Java, their code will compile just the same whether they use `synchronized` or not.

Of course the Rust compiler can't force you to write correct synchronization, but it can at least prevent you from forgetting about synchronization entirely.

D sort of does. We have a type qualifier for shared data that is picky about accesses but it's not completely there yet i.e. still requires some knowledge.
Idiomatic Haskell would probably use eg Software Transactional Memory for concurrency; but that would be too high level for the kernel.

A Haskell-like language for kernel development would lean heavily on linear typing. Some prototypes have been made and used.

We had Ada in 1980.
I don't particularly know Ada, but are you implying that it solves the issues that C++ couldn't? If so, why didn't it take off like C++ did?
Proprietary compilers, and some limitations in the type system meant the standard library wasn't as useful, even though it is a safer language as a result. The verbosity also turned me off initially.
It was also a bit of a PITA before Ada 95, by which time C had won.
C and C++ also only had proprietary compilers, mostly.

However both were born alongside UNIX and that helped C++ to be quickly adopted by all major C compiler vendors, whereas Ada was always something extra to pay on top.

When targeting UNIX, with C and C++ compilers on the box, who is going to pay extra for the Ada compiler unless required to do so?

> C and C++ also only had proprietary compilers, mostly.

I was thinking more in the 90s where GNU already had a freely available C compiler, but GNAT didn't get a free version until the late 90s, and even then it was built on a fork of gcc you had to download separately until like 2000. It was just a lot of work to get up and running, as opposed to the bundling of C/C++ compilers with Linux that was common, as you say. The initial C compatibility helped C++ a lot too.

GCC only took off because Sun decided to split their UNIX into user and developer editions, and other UNIX vendors followed.

Still same rule applies, when a UNIX shop paid for UNIX developer tooling, usually languages like Ada and Modula-2 weren't in the box, you needed to pay extra.

Ada has some really good synchronization primitives, I don’t know if that would have helped here as I haven’t looked at the problem that closely. I work at a much higher level so I lack the experience at this level. Ada was phased out for C++ primarily because the devs are cheaper.
It very much does. Ada has been designed for safety and concurrency, and for environments where failure is something to avoid at all costs.

However, it still has issues of its own; no language is perfect or best suited for all use cases.

It would be impossible to say because it depends on the hypothetical Rust implementation. A kernel needs a huge amount of unsafe, all of which is surface area for these types of bugs.
We had Ada.
Amen brother. Most people will claim that Rust would probably take years to compile on 30-year-old hardware but I say to them "why is your heart so full of doubt?". You have to believe.

The more you believe and trust Rust, the more limitless your possibilities become for your family, your career and your life!

That made my day.