by halestock 3298 days ago
Question for the rust folks - are there any features that wouldn't have been possible without "unsafe"? That is, if rust never had unsafe, would it have been fundamentally limited in any way? Or is it required for e.g. interoperability with C?
5 comments

C and other FFI is fundamentally outside the control of the Rust compiler (or any other non-C compiler) and, furthermore, the foreign functions can have arbitrarily complicated preconditions (or plain bugs) that lead to memory safety violations. This means that, whether it is marked or not, these operations are semantically `unsafe`, as in, they risk memory unsafety because the compiler can't guarantee it. This does mean that all languages with FFI have safety holes.
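
To make the FFI point concrete, here is a minimal sketch that calls the C library's `strlen` from Rust. It assumes a platform where the C standard library is linked by default, and the wrapper name `c_string_len` is made up for illustration. The compiler takes the declared signature on trust, which is why the call itself has to sit inside `unsafe`.

```rust
use std::ffi::{c_char, CString};

// Declaration of the C standard library's strlen. The compiler cannot inspect
// the C implementation, so it has to trust this signature.
// (`unsafe extern` needs Rust 1.82 or later; older toolchains write a plain
// `extern "C"` block instead.)
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical safe wrapper: CString guarantees a NUL-terminated buffer with
/// no interior NUL bytes, which is the precondition strlen relies on.
fn c_string_len(text: &str) -> Option<usize> {
    let owned = CString::new(text).ok()?; // fails if `text` contains a NUL byte
    // SAFETY: `owned` is a valid NUL-terminated C string that outlives the call.
    Some(unsafe { strlen(owned.as_ptr()) })
}

fn main() {
    println!("{:?}", c_string_len("hello"));
}
```

The wrapper is where the precondition gets checked once, so callers never touch `unsafe` themselves.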

Additionally, not having the facility for low-level/unchecked code just means that things like optimised data structures/memory management/hardware interaction get implemented either in the compiler or in other languages. The former is much harder to reason about and to modify: one is essentially writing code that generates compiler IR, which is more annoying and error-prone than both just writing the code directly and just writing the IR directly (one way to think about this is that the compiler is one big `unsafe` block). The latter is unfortunate because it results in impedance mismatches when doing the FFI calls, both semantically and in performance, and it also means that code doesn't get to benefit from the usual Rust safety checks and high-level features (like ADTs) that are all still available inside `unsafe` blocks.

I'll give you the shortest example: in order to build an operating system in Rust for x86, you need to do this:

  let p = 0xb8000 as *mut u8;

VGA drivers use the memory mapped at 0xb8000 to drive the device. This creates a pointer, p, at that address.

In order to demonstrate this is safe (okay, so `unsafe` isn't in this example: creating p is safe, but writing to or reading from it is not), a language would have to know:

1. That your code is running in kernel mode, that is the entire concept of ring 0 vs ring 3.

2. That the VGA spec specifies that location in memory.

Yeah, in _theory_, you could have a language that does this, but that'd tie your language so, so, so deeply to each platform, that it's not feasible.

This can be extrapolated to all kinds of other low-level things.
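
A rough sketch of where the `unsafe` actually shows up in the VGA example. The helper `write_cell` is a made-up name, and the two-bytes-per-cell layout (character, then colour attribute) is the usual VGA text-mode convention. Since 0xb8000 is not mapped in an ordinary user-space process, the demo points the helper at a plain array; in a kernel you would pass the real address instead.

```rust
use std::ptr;

/// Writes one character cell (character byte plus colour attribute) at `index`.
///
/// # Safety
/// `base` must point to at least `2 * (index + 1)` writable bytes.
unsafe fn write_cell(base: *mut u8, index: usize, ch: u8, attr: u8) {
    // Volatile writes keep the compiler from eliding or merging stores,
    // which matters when the target is device memory.
    unsafe {
        ptr::write_volatile(base.add(index * 2), ch);
        ptr::write_volatile(base.add(index * 2 + 1), attr);
    }
}

fn main() {
    // Creating the pointer is safe; nothing has been dereferenced yet.
    let vga = 0xb8000 as *mut u8;
    println!("the VGA text buffer would live at {:p}", vga);

    // Stand-in for the screen so the demo can run on a hosted OS.
    let mut fake_screen = [0u8; 8];
    // SAFETY: cell 1 occupies bytes 2 and 3 of an 8-byte array.
    unsafe { write_cell(fake_screen.as_mut_ptr(), 1, b'A', 0x0f) };
    println!("{:?}", fake_screen);
}
```

Nothing in the type system distinguishes the array pointer from the hardware address; only the programmer knows which one is valid on the machine at hand.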

> That your code is running in kernel mode, that is the entire concept of ring 0 vs ring 3.

That need not be the case though. You could have a kernel side allocator that sets up the MMU to map that memory to a pointer that you return which lives in the space of the process. The MMU would take care of the required arithmetic to access the memory at its actual location using an offset.

That way you can map resources from real addresses into arbitrary addresses on the user side.

I think the correct term for this mechanism is 'system address translation'.

The language would still have to understand all of that in order to write that kernel side allocator in safe code.

I don't see how that follows. The language can't possibly understand the intricacies of what the MMU is capable of (besides, every MMU is different), and as far as the language is concerned what is returned is simply a valid offset and a length to go with it to indicate where the allocated segment ends.

I think you're strongly agreeing with me. It's not feasible to have in the language.

Can't the prohibitions be modularized?

Like, when you compile for x86 there are a bunch of rules that aren't generally safe, but on that platform they are.

Modularization wouldn't help with the fact that you'd still need a module per platform, and that's not feasible; see the other replies to my comment. There are just far, far, far too many details.

Do you mean:

* All unsafe operations don't exist.

* All unsafe operations exist, but the literal unsafe keyword and its machinery don't exist.

The latter is how most ostensibly safe languages work. See Haskell's unsafePerformIO, Swift's UnsafePointer, and Java's JNI for 3 examples off the top of my head.

The former is just a really gimped language that would have been a pain in the neck to implement libraries for (see other replies for examples).

It also depends on how "deep" you want to get.

A lot of built-in constructs use unsafe under the hood. Vec (Rust's dynamic array) does memory allocation and resizing under the hood, for example, and there's no safe way to do that unless some safer array allocation primitive is exposed.
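
For a feel of what that looks like, here is a stripped-down sketch of a fixed-capacity buffer built directly on `std::alloc`. `RawBuf` is a made-up type for illustration, not how `Vec` is actually written; it only handles `u32` and never reallocates.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Hypothetical fixed-capacity buffer of u32 values, for illustration only.
struct RawBuf {
    ptr: *mut u32,
    cap: usize,
    len: usize,
}

impl RawBuf {
    fn with_capacity(cap: usize) -> RawBuf {
        assert!(cap > 0, "zero-sized allocations are not handled in this sketch");
        let layout = Layout::array::<u32>(cap).expect("capacity overflow");
        // SAFETY: the layout has non-zero size because cap > 0.
        let ptr = unsafe { alloc(layout) } as *mut u32;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        RawBuf { ptr, cap, len: 0 }
    }

    /// Returns false when the buffer is full (a real Vec would reallocate).
    fn push(&mut self, value: u32) -> bool {
        if self.len == self.cap {
            return false;
        }
        // SAFETY: len < cap, so the slot lies inside the allocation.
        unsafe { self.ptr.add(self.len).write(value) };
        self.len += 1;
        true
    }

    fn get(&self, index: usize) -> Option<u32> {
        if index >= self.len {
            return None;
        }
        // SAFETY: index < len, and every slot below len was initialised by push.
        Some(unsafe { self.ptr.add(index).read() })
    }
}

impl Drop for RawBuf {
    fn drop(&mut self) {
        let layout = Layout::array::<u32>(self.cap).expect("capacity overflow");
        // SAFETY: ptr came from alloc with this same layout.
        unsafe { dealloc(self.ptr as *mut u8, layout) };
    }
}

fn main() {
    let mut buf = RawBuf::with_capacity(2);
    buf.push(7);
    buf.push(9);
    println!("{:?} {:?}", buf.get(0), buf.get(1));
}
```

The public methods are all safe to call; the `unsafe` is confined to the three places where the code relies on the `len <= cap` invariant.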

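Also, things like mem::swap(x, y) cannot be implemented at all in safe Rust for arbitrary types. In order to perform the swap you need a temporary variable, and filling it means moving a value out from behind a mutable reference, which would leave that location uninitialized for a moment. Rust does not allow that.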

Note that in C++ it invokes the copy constructor - http://www.cplusplus.com/reference/algorithm/swap/ - but Rust's mem::swap works for types that implement neither the Copy trait nor Clone (Clone being Rust's closest equivalent to a copy constructor).
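
Roughly, a generic swap has to look something like the sketch below. `swap_sketch` is a made-up name and this is not the standard library's actual implementation (which is more heavily optimised); it just shows the raw-pointer moves that the borrow checker cannot verify.

```rust
use std::ptr;

/// Illustrative swap for any T, including types that are neither Copy nor Clone.
fn swap_sketch<T>(a: &mut T, b: &mut T) {
    let pa: *mut T = a;
    let pb: *mut T = b;
    // SAFETY: both pointers come from live `&mut T` references, so they are
    // valid and aligned, and two distinct `&mut` borrows cannot overlap.
    unsafe {
        let tmp = ptr::read(pa); // bitwise copy; *a is now logically moved-out
        ptr::copy_nonoverlapping(pb, pa, 1); // move *b into *a, no drop runs
        ptr::write(pb, tmp); // move tmp into *b without dropping the old bits
    }
}

fn main() {
    // String is neither Copy nor cloned here; the values are moved bitwise.
    let mut left = String::from("left");
    let mut right = String::from("right");
    swap_sketch(&mut left, &mut right);
    println!("{} {}", left, right);
}
```

Between the `read` and the `write` there are briefly two bitwise copies of the same value, which is exactly the state safe Rust refuses to express.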

The same can be said about slice splitting functions, which are primarily used to work around Rust's borrow checker.
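
As a sketch of why slice splitting needs `unsafe`: the borrow checker sees two mutable borrows of the same slice and cannot tell that the halves are disjoint. The function below (named `split_at_mut_sketch` here to avoid confusion with the real `split_at_mut` method) makes that promise by hand.

```rust
use std::slice;

/// Illustrative version of splitting one mutable slice into two disjoint halves.
fn split_at_mut_sketch<T>(values: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();
    assert!(mid <= len, "mid out of bounds");
    // SAFETY: [0, mid) and [mid, len) are both in bounds and do not overlap,
    // so handing out two mutable slices cannot create aliasing.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut data = [1, 2, 3, 4, 5];
    let (head, tail) = split_at_mut_sketch(&mut data, 2);
    head[0] = 10;
    tail[0] = 30;
    println!("{:?}", data);
}
```

Callers get an ordinary safe function; the bounds check up front is what justifies the `unsafe` block inside.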

You pretty much nailed it with your question. FFI is by nature unsafe, since C doesn't have lifetime semantics or traits around thread safety.