by jandrewrogers 2369 days ago
There are a couple common architectural patterns that are easy to express in C++ but tend to violate assumptions in Rust's safety model. Database engines run into them frequently due to the nature of their core operations, and this ignores that most data structures are intrinsically globals (which Rust doesn't like) due to tight hardware coupling.

Rust assumes that all references to memory are visible at compile-time, and the safety analysis can be applied in cases where this is true with the usual caveats around borrow-checking. The "unsafe" facility is designed to interface with code written in other languages that don't respect Rust's model, and it works well for that. But how do you express the case, common in database engines (because direct storage-backed), where hardware can hold mutable references to most of your address space? There is no way to determine at compile-time if a mutable reference will be unique at runtime or to even sandbox it to a small bit of code. As a consequence, most memory references are effectively mutable. There are workarounds that will minimize the quantity of unsafe code in Rust if you are willing to sacrifice performance and elegance.

In databases, having many mutable references to the same memory has few safety implications because ownership of memory is dynamically assigned at runtime by a scheduler that guarantees safe access without locking or blocking. This safety model solves the hardware ownership problem, which is why it is used, but it also enables quite a bit of dynamic optimization even if all your references are in software so you'd want to do things this way anyway. In C++, you can make all of this largely transparent on top of explicitly mutable references to memory. Again, you can produce a minimally unsafe version of this in Rust but it is going to be significantly uglier and slower.

As more server software moves to userspace I/O and scheduling models (for performance and scale reasons) it will be interesting to see how this impedance mismatch problem is addressed in Rust.


Lest someone gets the wrong idea, Rust makes _mutable_ globals painful to work with; readonly is fine.
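
To make the distinction concrete, here is a minimal sketch of read-only versus mutable globals in safe Rust. The names `PAGE_SIZE`, `PAGES_READ`, and `LAST_ERROR` are hypothetical and not taken from any database discussed here; the point is only that mutable globals have to go through a type with interior mutability.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

// Read-only global: no ceremony needed.
static PAGE_SIZE: usize = 4096;

// Mutable globals in safe code must use interior mutability
// (an atomic or a lock) instead of `static mut`.
static PAGES_READ: AtomicU64 = AtomicU64::new(0);
static LAST_ERROR: Mutex<Option<String>> = Mutex::new(None);

/// Bumps the global counter and returns the new value.
fn record_read() -> u64 {
    PAGES_READ.fetch_add(1, Ordering::Relaxed) + 1
}

/// Stores an error message in the global slot.
fn set_error(msg: &str) {
    *LAST_ERROR.lock().unwrap() = Some(msg.to_string());
}

/// Returns a copy of the last stored error, if any.
fn last_error() -> Option<String> {
    LAST_ERROR.lock().unwrap().clone()
}
```

The alternative, `static mut`, needs an `unsafe` block at every access, which is the friction being described.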

As for hardware DMA’able memory, it’s true that it adds friction to work with in Rust. But C or C++ would fall into the same boat - one would need to sprinkle volatile or atomics, as appropriate, to keep the optimizer from interfering. In Rust, you’d need to do the same (ptr::{read,write}_volatile or its atomics).
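
As a rough illustration of the volatile accesses mentioned above, the sketch below uses `std::ptr::read_volatile` and `std::ptr::write_volatile` on a hypothetical `DmaSlot` layout. The struct, its fields, and the "ready flag" convention are assumptions made up for this example. Volatile only stops the compiler from eliding or merging the accesses; a real device would typically also need memory barriers, which are not shown.

```rust
use std::ptr;

/// Hypothetical slot in a buffer shared with a device (layout is an assumption).
#[repr(C)]
struct DmaSlot {
    ready: u32,
    payload: u32,
}

/// Writes the payload, then sets the ready flag, using volatile stores.
///
/// # Safety
/// `slot` must be valid for writes and properly aligned.
unsafe fn publish(slot: *mut DmaSlot, value: u32) {
    unsafe {
        ptr::write_volatile(ptr::addr_of_mut!((*slot).payload), value);
        ptr::write_volatile(ptr::addr_of_mut!((*slot).ready), 1);
    }
}

/// Returns the payload if the ready flag is set, using volatile loads.
///
/// # Safety
/// `slot` must be valid for reads and properly aligned.
unsafe fn poll(slot: *const DmaSlot) -> Option<u32> {
    unsafe {
        if ptr::read_volatile(ptr::addr_of!((*slot).ready)) == 1 {
            Some(ptr::read_volatile(ptr::addr_of!((*slot).payload)))
        } else {
            None
        }
    }
}
```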

I’m having a slightly hard time imagining a db where “most” of the address space is DMA’able. I’ve some experience with kernel bypass networking, which has its own NIC interactions wrt memory buffers, but applications built on top have plenty of heap that’s unshared with hardware. What’s an example db where most of the VA is accessible to hardware “arbitrarily”?

Also, regardless of how much VA is shared, there’s going to be some protocol that the software and hardware use to coordinate. The interesting bit here is whether Rust and its type system can allow for expressing these protocols such that violations are compile-time detectable (if not all, perhaps some). Any sane C++ code would similarly try to build some abstractions around this, but how well things can be encoded is up for debate.
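
One way such a protocol might be encoded is the typestate pattern, sketched below under simplifying assumptions. `CpuOwned` and `DeviceOwned` are hypothetical types, and the "device" is simulated: `submit` and `complete` only move the buffer between the two types. Because `submit` consumes the `CpuOwned` value, touching the bytes while the device holds the buffer is a compile-time error rather than a runtime bug.

```rust
/// Buffer currently owned by software; its bytes are accessible.
struct CpuOwned {
    data: Vec<u8>,
}

/// Buffer lent to the (hypothetical) device; exposes no byte accessors.
struct DeviceOwned {
    data: Vec<u8>,
}

impl CpuOwned {
    fn new(len: usize) -> Self {
        CpuOwned { data: vec![0; len] }
    }

    fn bytes(&self) -> &[u8] {
        &self.data
    }

    fn bytes_mut(&mut self) -> &mut [u8] {
        &mut self.data
    }

    /// Hands the buffer to the device. Consumes `self`, so the caller
    /// can no longer read or write the bytes.
    fn submit(self) -> DeviceOwned {
        DeviceOwned { data: self.data }
    }
}

impl DeviceOwned {
    /// Called once the device signals completion (simulated here);
    /// returns the buffer to software ownership.
    fn complete(self) -> CpuOwned {
        CpuOwned { data: self.data }
    }
}
```

This catches only the violations that can be tied to a value's type; whether the hardware actually honors the handoff remains outside what the compiler can see.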

When a typical Rust discussion ensues, it’s commonly implied (or occasionally made explicit) that “write X in Rust” == “write X in safe Rust”. And this is the right default. But I think any non-trivial system hyper optimized for performance will have a healthy amount of unsafe code. The more interesting question, to me at least, is how well can that unsafety (and the “hidden”-from-rustc protocol) be contained.

As for a db scheduler obviating the need for a compiler to arbitrate ownership, that’s certainly true to a degree. But, this comes back to the protocol I mentioned - the scheduler is what provides the protocol and so long as other components work via it, it can provide safety barriers (and allow for optimizations). But again, I don’t immediately see why Rust (with careful use of unsafe) couldn’t do the same. And after all, everything is safe so long as things play by the (often unwritten or poorly so) rules. Once systems get big and hairy, it gets tougher to stay within the guardrails and that’s where getting assistance from your language/compiler can be very helpful.

Some of the Rust libs/frameworks for embedded/microcontrollers deal with hardware accessible memory and otherwise “unsafe” protocols, but I’ve seen some clever ways folks encode this using Rust’s type system.

You have some misconceptions about how this all works in real databases. Rust experts who have looked into porting these kinds of C++ database kernels have not been sanguine in my experience. This isn't a theoretical exercise; we need to minimize defects and maximize performance.

- All pointers are ordinary; the fact that the same memory can also be DMA-ed by the hardware is immaterial. You do need accounting mechanisms that let the code infer which objects in memory are at risk of being read/written by a DMA operation. No atomics or volatiles required in userspace. Modern database code is effectively single-threaded.

- Most of the address space in a database is DMA-able because most runtime structures in a database engine must be adaptively persistable to storage. There are various workloads that will force different parts of your runtime state to be pageable because they can overflow RAM while operating within design constraints. Unless you are assuming small databases, that complexity is inconvenient but necessary for robust systems.

- C++ is more expressive than Rust when it comes to making hackery like this transparent, most of which is resolved at compile-time in C++. Much of the mechanics can be taken care of with a pointer wrapper that heavily overlaps the semantics of std::unique_ptr, making the code quite clean and natural. Most code never needs to know how that magic happens. C++ compile-time facilities are currently far beyond Rust.

- You can formally verify the scheduler design, and we sometimes do, but actually implementing it efficiently in real code without the borrow-checker losing its mind is a separate concern.

As I originally stated, you can write such systems in Rust while managing the amount of unsafe code. You just wouldn't want to and there would be little to recommend it compared to the C++ equivalent since it would be objectively worse by most metrics.

It seems you’ve already made up your mind and nothing anyone says will change that :).

The volatile I mentioned isn’t due to concurrency of userspace threads, but to prevent the optimizer from eliminating read/write operations. If the src/dst of those memory ops is DMA’d memory touched by hardware, you’d need to do that. Has nothing to do with concurrency.

Capability to spill to disk is certainly needed, no argument. But “most” of the address space and “most” of the runtime structures? Can you elaborate? Is there an OSS example or some paper or any discussion of such a thing in the open?

You can have custom smart pointers in Rust just as well, and back them with your own mem allocator. While there are features in C++ not currently available in Rust, C++ facilities “far beyond” is hyperbole. How well do you know Rust? Genuine question.
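
For reference, a custom smart pointer in Rust can look roughly like the sketch below. `PagePtr` is a hypothetical name, and `Box` stands in for a custom allocator to keep the example short; this is not the pointer wrapper the parent comment describes, just an illustration that `Deref`/`DerefMut` let the wrapper do bookkeeping (here, a dirty flag) without the calling code noticing.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical owning pointer that records whether its target was
/// borrowed mutably, e.g. to decide if a page needs writing back.
struct PagePtr<T> {
    value: Box<T>, // stand-in for memory from a custom allocator
    dirty: bool,
}

impl<T> PagePtr<T> {
    fn new(value: T) -> Self {
        PagePtr { value: Box::new(value), dirty: false }
    }

    fn is_dirty(&self) -> bool {
        self.dirty
    }
}

impl<T> Deref for PagePtr<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.value
    }
}

impl<T> DerefMut for PagePtr<T> {
    fn deref_mut(&mut self) -> &mut T {
        // Any mutable borrow conservatively marks the page dirty.
        self.dirty = true;
        &mut self.value
    }
}
```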

It sounds like GP is using "DMA" to mean a memory-mapped file. I recall there was discussion about how to safely handle them in Rust. See https://users.rust-lang.org/t/how-unsafe-is-mmap/19635

Can you point to a concrete example of the "scheduler" pattern you're referring to? I'm not familiar with it, and it's not clear to me from your post how it works.

I might be wrong but I suspect the "scheduler" in this instance is the thing granting write access.

Consider a SQL database engine.

Many threads/processes can be accessing a SQL database at any given time.

But if two of those actors require write access to the same object, only one will be allowed access and all others are blocked.

So at this high level the concurrency is guaranteed by the engine, which then means at the low level the engine can safely assume the access is exclusive.
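
A very rough single-threaded sketch of that idea in Rust follows. All names (`Scheduler`, `Page`, `Grant`) are hypothetical, and this is a guess at the general shape rather than how any real engine does it; in particular it refuses a second writer instead of blocking it. Exclusivity is checked at runtime by the scheduler, and the one `unsafe` line relies on that check. It is essentially what `std::cell::RefCell` already does.

```rust
use std::cell::{Cell, UnsafeCell};

/// A page whose current owner is tracked at runtime.
struct Page {
    owner: Cell<Option<u32>>,
    data: UnsafeCell<[u8; 16]>,
}

/// Hypothetical scheduler that hands out exclusive access to pages.
struct Scheduler {
    pages: Vec<Page>,
}

/// Proof that the scheduler granted exclusive access to one page.
struct Grant<'a> {
    page: &'a Page,
}

impl Scheduler {
    fn new(n: usize) -> Self {
        let pages = (0..n)
            .map(|_| Page { owner: Cell::new(None), data: UnsafeCell::new([0; 16]) })
            .collect();
        Scheduler { pages }
    }

    /// Grants `task` exclusive access to page `idx`, or returns `None`
    /// if the index is out of range or another task holds the page.
    fn acquire(&self, task: u32, idx: usize) -> Option<Grant<'_>> {
        let page = self.pages.get(idx)?;
        if page.owner.get().is_some() {
            return None;
        }
        page.owner.set(Some(task));
        Some(Grant { page })
    }
}

impl Grant<'_> {
    fn bytes_mut(&mut self) -> &mut [u8; 16] {
        // SAFETY: `acquire` creates at most one live Grant per page, and
        // the returned borrow is tied to `&mut self`.
        unsafe { &mut *self.page.data.get() }
    }
}

impl Drop for Grant<'_> {
    fn drop(&mut self) {
        // Releasing the grant makes the page available again.
        self.page.owner.set(None);
    }
}
```

Because `Cell` and `UnsafeCell` are not `Sync`, the compiler will not let this `Scheduler` be shared across threads, which matches the "effectively single-threaded" assumption made earlier in the thread.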