Hacker News
by dralley 1093 days ago
> Slotmap uses unsafe everywhere, it's a memory usage pattern not supported by the borrow checker. Is disabling the borrow checker really a common pattern in Rust?

Wrapping "unsafe" code in a safe interface is a common pattern in Rust, yes. There is absolutely nothing wrong with using "unsafe" so long as you are diligent about checking invariants, and keep it contained as much as possible. Obviously the standard library uses some "unsafe" as well, for instance.

"unsafe" just means "safe but the compiler cannot verify it".

Unsafe does not disable the borrow checker, though. All of the restrictions of safe Rust still apply. All "unsafe" does is unlock the ability to use raw pointers and a few other constructs.
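A minimal sketch of that point, with made-up function names: creating a raw pointer is ordinary safe code, dereferencing it needs an `unsafe` block, and references inside the block are still borrow-checked as usual.

```rust
/// Reads an `i32` through a raw pointer (hypothetical helper for illustration).
fn read_through_raw_pointer(value: &i32) -> i32 {
    // Creating a raw pointer is allowed in safe code.
    let ptr: *const i32 = value;
    // SAFETY: `ptr` comes from a live shared reference, so it is non-null,
    // aligned, and points to an initialized `i32`.
    unsafe { *ptr }
}

/// Increments the counter twice through a raw pointer and returns the result.
fn bump_twice(counter: &mut i32) -> i32 {
    // Explicit reborrow, so `counter` itself stays usable afterwards.
    let raw: *mut i32 = &mut *counter;
    unsafe {
        // SAFETY: `raw` was derived from the exclusive reference `counter`,
        // and nothing else touches the value while `raw` is in use.
        *raw += 1;
        *raw += 1;

        // References are still borrow-checked in here. Uncommenting the
        // next three lines fails with error[E0499] (two mutable borrows):
        // let a = &mut *counter;
        // let b = &mut *counter;
        // *a += *b;
    }
    *counter
}
```

Only the two `*raw` dereferences actually need the `unsafe` block; everything else in it is checked exactly as it would be outside.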

https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#unsa...

> Obviously the standard library uses some "unsafe" as well, for instance.

Most beautifully, MaybeUninit<T>::assume_init() -> T

This unsafe Rust method says "I promise that I actually did initialize this MaybeUninit<T>, so give me the T".

In terms of the resulting program, the machine is not going to do any work whatsoever: a MaybeUninit<T> and a T are the same size, they're in the same place, and your CPU doesn't care that this is a T and not a MaybeUninit<T> now.

But from a type safety point of view, there's all the difference in the world.

Even though it won't result in emitting any actual CPU instructions, MaybeUninit::assume_init has to be unsafe. Most of the rest of that API surface is not. Because that API call, the one which emitted no CPU instructions, is where you took responsibility for type correctness. If you were wrong, if you haven't initialized T properly, everything may be about to go spectacularly wrong and there's no-one else to blame but you.
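For anyone who hasn't seen it, here is a small sketch of that pattern. The function names are mine, and `u64` is an arbitrary choice of `T`; the `MaybeUninit` calls are the standard library's.

```rust
use std::mem::{size_of, MaybeUninit};

/// Initializes a slot in two steps and then claims it as a real `u64`.
fn init_then_assume() -> u64 {
    // No value yet: reading from `slot` at this point would be undefined behaviour.
    let mut slot = MaybeUninit::<u64>::uninit();

    // Writing is a safe operation; it cannot break any invariant.
    slot.write(42);

    // SAFETY: `slot` was fully initialized by the `write` call above.
    // This is the one line where we take responsibility.
    unsafe { slot.assume_init() }
}

/// `MaybeUninit<T>` is documented to have the same size as `T`.
fn same_size() -> bool {
    size_of::<MaybeUninit<u64>>() == size_of::<u64>()
}
```

Delete the `slot.write(42)` line and the program still compiles; the `unsafe` block is then simply a false promise, which is exactly the failure mode described above.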

Exactly. People miss this all the time when they write off Rust for "needing unsafe to do real programming" or whatever uninformed criticism they're parroting (they've clearly never actually done this "real programming" in Rust). The whole point is to reduce the opportunity for unforced errors by marginalizing the cognitive load required for the programmer to ensure the program is correct. And a program with a few unsafe blocks to `assume_init` some memory that e.g. a driver initialized for you is still infinitely better in that regard than a program that's littered with `void*` everywhere.
> than a program that's littered with `void*` everywhere

Strawman argument. A properly written C++ program isn't littered with `void*` everywhere in the same way that a properly written Rust program isn't littered with `unsafe` everywhere. You build safe abstractions around the ugly low-level pointer handling, you just don't have a keyword for a clear delineation.

> People miss this all the time when they write off Rust for "needing unsafe to do real programming" or whatever uninformed criticism they're parroting

Hard-core Rust proponents also seem to miss this all the time. Because "you basically write the same unsafe code that you would write in C++ but you now have a keyword to mark it" just doesn't imply the same urgency for adopting the language as "you only need unsafe to implement a few primitives in the standard library" does, which always seems to be tacitly implied until called out, and then the critics are "misinformed."

Firstly, the delineation clarity is much more valuable than you seem to appreciate. A day one beginner in Rust can see that this stuff is roped off - so they know if they should call a grown-up - and everything which isn't roped off is safe for them. This also benefits an experienced developer when you're not at your best. Let's not write unsafe Rust today, we can do that when the air conditioning works, the coffee machine is fixed and there aren't contractors using power tools in the office.

I also think you very seriously underestimate how much equivalently unsafe C++ you write, and overestimate how much actual unsafe Rust is needed. Philosophically WG21 (the C++ committee) didn't like safe abstractions, so it doesn't provide them. To the point where the C++ slice type std::span is exactly like the safety proposal where it was originally suggested, except with all the safety explicitly ripped out. "We like this safety feature, except for the safety, get rid of that". I am not even kidding.

Most Rust programmers don't need to write any unsafe Rust. They can rely on Rust's promises, about aliasing, races, memory safety, performance characteristics, and they have no responsibility for delivering those promises, it's all done for them so long as they write safe Rust.

The other crucial element is culture. Culturally Rust wants safe abstractions, that applies to the standard library of course, but it also applies to third party code, you can expect other Rust programmers to think your library is crap if it has a method which is actually not safe to call without certain pre-conditions but isn't labelled "unsafe" -- because that's exactly what "unsafe" is for so you're not fulfilling your social contract.

> You build safe abstractions around the ugly low-level pointer handling, you just don't have a keyword for a clear delineation.

The main difference is they are not really safe. It is trivial to accidentally invoke UB with incorrect use of "safe" abstractions in C++ like built-in containers or smart pointers. Keep a reference to a vector element, add a new item to the vector and it will sometimes blow up ;)

I disagree that it is "trivial," at least in the example you stated. This take-reference-then-mutate is exactly the kind of usage that the borrow checker prevents. You have to avoid it systematically in both languages.

The built-in containers are also not the best examples of "safe" abstractions. You can build safer abstractions, and you can employ safer usage patterns of built-in vectors, at non-zero but marginal costs.

The honest view on C++ is that there is no such thing as "safe" in absolute terms, but you have a lot of tools to mitigate the unsafe nature of the core language.

The honest view on Rust is that the idea of categorically excluding memory safety errors didn't quite pan out, but we're nonetheless left with an improvement over C++.

It’s subtle, but you don't avoid “take reference then mutate” in Rust, you are told exactly how to do it without aliasing the memory.
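A rough sketch of what "being told how to do it" looks like in practice, using the vector example from upthread. The function name is hypothetical, and the commented-out lines show the version the compiler rejects.

```rust
/// Appends `extra` and then returns the first element.
/// Holding an index (or a copy) across the mutation is fine; holding a
/// reference across it is not.
fn push_then_read_first(values: &mut Vec<i32>, extra: i32) -> i32 {
    // Rejected version, for comparison (error[E0502]):
    // let first = &values[0];
    // values.push(extra);      // mutable borrow while `first` is alive
    // return *first;           // `first` used after the push

    values.push(extra); // may reallocate; no reference is outstanding
    values[0]           // borrow again after the mutation
}
```

Calling it on `vec![1, 2, 3]` with `extra = 4` returns `1` and leaves four elements in the vector. The C++ equivalent of the rejected version compiles without complaint and reads through a possibly dangling reference.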

I’m not going to say Rust is perfect, that’s obviously not the case. But I really think your argument, like others are saying, underplays the actual value of Rust.

I’ve written entire projects in both C++ and Rust. I’ve never wasted days debugging memory corruption in Rust. Just sayin’.

If unsafe means “safe but the compiler cannot verify” then I guess just consider .cpp to mean “safe but the compiler cannot verify” and we have suddenly made C++ memory safe
There's a related idea in Haskell, usually considered a memory safe language. You can write a program in Haskell that directly mutates memory, or does IO operations, freely, anywhere in the code. This violates functional purity and the compiler cannot offer its usual promises; your program may very well segfault from a bug in such code. But sometimes you just have to, perhaps to implement an algorithm efficiently.

Still, it is discouraged; both culturally in the language community, and discouraged through the subtle prodding of the language itself (such as everything being typed "IO", or the slightly ominous "unsafe" in the "unsafePerformIO".) Very often, the amount of code that must truly live in IO can be reduced to a few dozen lines, if that. That code is crucial to get right -- it's where the actual sequence of computation and external effects are handled. Such isolation allows the rest of the code to not have to worry about those matters.

Sure, and if a typical Rust program that I write has no unsafe in it directly, and 5% of its dependencies' code have unsafe in them, that's also the same as writing a program in the "not c++" language directly, and using "not c++" dependencies for all but 5% of the dependency code.

Seems like a silly analogy to me, though.

Right, but it’s that 5% the original comment is talking about: the times when Rust has to use unsafe for the type of program.
Unsafe Rust is safer than C++, and even if it wasn't, 5% unsafe in Rust programs (in well-marked locations) is vastly superior to 100% unsafe in C++ programs.

Any analogy that equates the two is silly.

Unsafe Rust is less safe than C++ because of the provenance and aliasing semantics that unsafe Rust must adhere to in order to avoid UB, which are generally trickier than those of C++.
The provenance rules in the C++ standard are basically just a shrug emoji†, so it's unclear whether those are worse. I can see an argument that obeying Aria's strict provenance experiment rules in Rust is easier, not because it's easy (although for many common cases it is) but because at least these are coherent rules.

† U+1F937 person shrugging

The core value proposition of rust is that it’s memory safe by default, and it’s possible to limit the set of code that needs to be manually checked for UB. This isn’t the case for C++, as any code anywhere can invoke undefined behavior.
True, as long as static analysers aren't part of the build. Once they are, specific constructs can be made to break the CI/CD build, forcing everyone to play by the rules if they want the PR to go through.

It isn't perfect, but it does improve the security baseline a lot.

Sure but you're missing the

> so long as you are diligent about checking invariants

part. Could you go through and check all the parts of a huge C++ codebase to make sure invariants are held as opposed to a few hundred lines of unsafe Rust code?

Sure, but I think the point here is the degree.

Presumably if it takes a lot of unsafe rust lines to build something, it won’t matter if it’s 30% safe or whatever.

I just see the point of “unsafe is fine” a lot when the whole point of rust is that memory safety issues are never worth the cost.

Right, I guess the question is what will that proportion be when Rust is used for things like operating systems and web browsers. 30% would be untenable but a few hundred/thousand lines of unsafe code is fairly easy to put under a microscope.

For some current day research into this, there is the paper "How Do Programmers Use Unsafe Rust?"[1] which I'll drop a quote from here:

> The majority of crates (76.4%) contain no unsafe features at all. Even in most crates that do contain unsafe blocks or functions, only a small fraction of the code is unsafe: for 92.3% of all crates, the unsafe statement ratio is at most 10%, i.e., up to 10% of the codebase consists of unsafe blocks and unsafe functions

That paper is definitely worth reading and goes into why programmers use unsafe. e.g 5% of the crates at that time were using it to perform FFI.

In writing "RUDRA: Finding Memory Safety Bugs in Rust at the Ecosystem Scale" [2], I recreated this data, and year by year the % of crates using unsafe is going down. And for what it's worth, crates are probably a bad data-set for this. Crates tend to be libraries, which are exactly where we would expect to find unsafe code encapsulated to be used safely. There are also plenty of experimental and hobby crates. A large dataset of actual binaries would be way more interesting to look at.

[1] https://dl.acm.org/doi/10.1145/3428204

[2] https://taesoo.kim/pubs/2021/bae:rudra.pdf

Or Rust in Android: in this deep dive, the new code gained two places of unsafe code, and the vetting triggered by those being the only two places found a bug in the existing implementation.

https://security.googleblog.com/2022/12/memory-safe-language...

Ahh that is quite interesting, I’ll check those links out
Looking at a couple of programs I work on:

9,500 lines of code, 8 are unsafe.

7,000 lines of code, 22 are unsafe.

14,000 lines of code, 140 are unsafe.

As we follow the standard rust rule that "safe code should not be able to use unsafe code to do unsafe things", those unsafe bits of code have been very carefully checked, to the best of our abilities, to ensure they don't create memory safety issues. It is a lot easier to triple-check 170 lines of code than 30,000 lines.

Sharpview, my metaverse viewer: 36,000 lines, 0 are unsafe.

I use some published crates that have unsafe code, but my own programs start with

    #![forbid(unsafe_code)]
This is 60FPS 3D high-detail graphics stuff, where performance matters.
Are you using wgpu for the rendering stuff? Heard that WebGPU had to sacrifice some performance in order to make the API safer for the web (like more bounds checking and sanity checks). These kinds of issues are actually plaguing projects like Tensorflow.js (for example see https://github.com/gpuweb/gpuweb/issues/1202).

Other libraries like Vulkan and DirectX 12 are fundamentally unsafe in the API level, so direct usage of it would lead to heaps of unsafe Rust code. Rust people have tried wrapping it in a safe way (like gfx-rs and vulkano) but nowadays most seem to have transitioned to wgpu (since WebGPU API is safe by design so it fits more for the Rust ecosystem).

Rust does sacrifice some performance in general in order to achieve its safety claims, but people are happy with it so far, since the majority of applications using Rust (like CLI apps and web servers) don't have to squeeze out performance that much (for webdev there are too many things that can cause performance issues other than not writing it in Rust). But for 3D graphics people can be more sensitive about these problems. Though maybe if you're not developing a triple-A game with the latest cutting-edge graphics (with new techniques like "hardware ray tracing" and "bindless descriptors", which are both impossible in wgpu), writing in Rust can be a good-enough tradeoff for your needs.

A counterpoint that makes this argument a bit weaker: Rust’s “unsafe” marker doesn’t pollute only its own scope, it actually pollutes the whole module. You need to make sure that the invariants the unsafe code relies on are upheld by the safe code as well. (An explanation of this in the Rustonomicon: https://doc.rust-lang.org/nomicon/working-with-unsafe.html)

So there’s quite a lot more code to actually check than what some of the Rust proponents are saying. One can say that C++ is still worse in this regard (theoretically you need to check 100% of your code to be safe in C++). But for the minority of developers who frequently need to delve into unsafe code, the advantages of Rust might seem a bit more disappointing (“the compiler doesn’t really do that much for the more important stuff…”)

> whole point of rust is that memory safety issues are never worth the cost

I don’t think that it would be the point of rust — otherwise why not write Java, or a litany of GCd languages instead?

Rust is a low-level/systems programming language where you have more control over the program’s execution (e.g. no fat runtime), which is a necessity in some rare, niche, but important use cases.

It almost never takes a lot of unsafe to build something. Even the redox OS kernel is only like 10% unsafe.
It's not what unsafe means. Unsafe means this might cause UB for some invocations (accessing raw pointers, calling into another language, etc.). Safe means it will not cause UB for any invocations (it may panic or abort).
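To make the distinction concrete, here is a small sketch (the helper name is mine). An out-of-range index in safe code panics, which is defined behaviour that can even be observed and recovered from; the `get_unchecked` equivalent mentioned in the comment would be UB instead.

```rust
use std::panic;

/// Indexes `values` with ordinary safe indexing and reports whether it panicked.
/// Assumes the default unwinding panic strategy (not `panic = "abort"`).
fn try_index(values: &[i32], index: usize) -> Result<i32, String> {
    // `values[index]` is bounds-checked: out of range means a panic, not UB.
    // The unsafe counterpart, `*values.get_unchecked(index)`, skips the check,
    // and calling it with an out-of-range index is undefined behaviour.
    panic::catch_unwind(|| values[index])
        .map_err(|_| String::from("index out of bounds panicked"))
}
```

With `&[1, 2, 3]`, index 1 yields `Ok(2)` and index 10 yields an `Err`, after the usual panic message is printed to stderr.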
It's essentially a "user-space" memory allocator with its own use-after-free and double-free checks, apparently because the language implementation isn't adequate. If anything it just reinforces the article's point that "borrow checking is incompatible with some useful patterns and optimizations."
Eh? This is a wild take. How do you draw the conclusion the default implementation is inadequate?
Because something like slotmap has to use `unsafe` to get around the inadequacies of the borrow checker...
Author of slotmap here.

There is absolutely no need for unsafe in slotmap. I chose to use unsafe (wrapped in a safe API) to reduce memory usage using intrusive linked freelists. If done using safe Rust this would involve `enum`s that would take up extra space.
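For readers wondering what the safe version would look like, here is a rough sketch of an enum-based freelist. This is not slotmap's code: the type and method names are invented, and it leaves out the versioned keys slotmap uses, so a stale index can end up pointing at a reused slot.

```rust
/// One slot: either a value or a link in the list of free slots.
/// The enum discriminant is the extra space an intrusive (union-style)
/// freelist would avoid.
enum Slot<T> {
    Occupied(T),
    Vacant { next_free: Option<usize> },
}

/// Hypothetical minimal slab using only safe Rust.
struct EnumSlab<T> {
    slots: Vec<Slot<T>>,
    free_head: Option<usize>,
}

impl<T> EnumSlab<T> {
    fn new() -> Self {
        EnumSlab { slots: Vec::new(), free_head: None }
    }

    /// Stores `value`, reusing a free slot if one exists, and returns its index.
    fn insert(&mut self, value: T) -> usize {
        match self.free_head {
            Some(index) => {
                // Pop the head of the freelist.
                if let Slot::Vacant { next_free } = &self.slots[index] {
                    self.free_head = *next_free;
                }
                self.slots[index] = Slot::Occupied(value);
                index
            }
            None => {
                self.slots.push(Slot::Occupied(value));
                self.slots.len() - 1
            }
        }
    }

    /// Returns the value at `index`, if that slot is occupied.
    fn get(&self, index: usize) -> Option<&T> {
        match self.slots.get(index) {
            Some(Slot::Occupied(value)) => Some(value),
            _ => None,
        }
    }

    /// Removes the value at `index` and pushes the slot onto the freelist.
    fn remove(&mut self, index: usize) -> Option<T> {
        if self.get(index).is_none() {
            return None; // out of range or already vacant
        }
        let vacated = Slot::Vacant { next_free: self.free_head };
        let old = std::mem::replace(&mut self.slots[index], vacated);
        self.free_head = Some(index);
        match old {
            Slot::Occupied(value) => Some(value),
            Slot::Vacant { .. } => None,
        }
    }
}
```

Every access goes through a `match`, so a double remove or a read of a vacant slot comes back as `None` rather than as memory corruption. The cost is the discriminant stored alongside every slot.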

Slotmap is one of my favorite crates, by the way. Thank you for putting it out there!
Thus not adequate for performance requirements.
.... that has nothing to do with the allocator though
A downside for sure, but one that, at least in this specific example, has limited impact. If you can button it up into a safe abstraction that you can share with others, then I don't really see what the huge problem is. The fact that you might need to write `unsafe` inside of a well optimized data structure isn't a weakness of Rust, it's the entire point: you use it to encapsulate an unsafe core within a safe interface. The standard library is full of these things.

Now if you're trying to do something that you can't button up into a safe abstraction for others to use, then that's a different story.

> "unsafe" just means "safe but the compiler cannot verify it".

"unsafe" means "safe"?

I would say "unsafe" means "only safe if used in a manner that cannot be checked by the compiler".

There are two things here. The `unsafe` in an `unsafe { ... }` block is referring to the contents of the block. From the outside it is indeed safe to use as if it were safe code. No special requirements necessary. So, yes, from a certain point of view `safe` would have been a better name (albeit confusing in a different way).

An `unsafe fn` however does need to be used correctly (and should document those requirements). However, these can only be called within `unsafe` blocks, so see above.
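A sketch of both halves, with hypothetical function names: an `unsafe fn` that documents its requirement, and a safe wrapper whose `unsafe` block is where that requirement gets discharged. The only standard library calls are `is_ascii` and `std::str::from_utf8_unchecked`.

```rust
/// Reinterprets bytes as a string slice without validating them.
///
/// # Safety
/// `bytes` must be valid UTF-8.
unsafe fn as_str_unchecked(bytes: &[u8]) -> &str {
    // SAFETY: the caller promised that `bytes` is valid UTF-8.
    unsafe { std::str::from_utf8_unchecked(bytes) }
}

/// Safe to call with any input: the check happens before the unsafe call.
fn ascii_as_str(bytes: &[u8]) -> Option<&str> {
    if bytes.is_ascii() {
        // SAFETY: every ASCII byte sequence is valid UTF-8.
        Some(unsafe { as_str_unchecked(bytes) })
    } else {
        None
    }
}
```

Callers of `ascii_as_str` need no `unsafe` and cannot trigger UB through it, whatever bytes they pass in.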

Not entirely correct: Rust’s “unsafe” marker doesn’t pollute only its own scope, it actually pollutes the whole module. You need to make sure that the invariants the unsafe code relies on are upheld by the safe code as well. (An explanation of this in the Rustonomicon: https://doc.rust-lang.org/nomicon/working-with-unsafe.html)
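A small sketch in the spirit of the Rustonomicon's `Vec` example, with an invented `FixedStack` type. The only `unsafe` block is in `top`, but it is sound only because every safe method in the module keeps `len` within bounds.

```rust
/// Hypothetical fixed-capacity stack. Invariant: `len <= items.len()`.
pub struct FixedStack {
    items: [u8; 4],
    len: usize,
}

impl FixedStack {
    pub fn new() -> Self {
        FixedStack { items: [0; 4], len: 0 }
    }

    /// Entirely safe code, yet `top` depends on it keeping `len` in range.
    pub fn push(&mut self, value: u8) -> bool {
        if self.len == self.items.len() {
            return false; // full
        }
        self.items[self.len] = value;
        self.len += 1;
        true
    }

    pub fn top(&self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: relies on `len <= items.len()`, which is maintained by the
        // safe methods above, not by anything inside this block.
        Some(unsafe { *self.items.get_unchecked(self.len - 1) })
    }
}
```

Adding a safe `pub fn set_len(&mut self, n: usize) { self.len = n; }` to this `impl` would make `top` unsound without a single new `unsafe` keyword. Field privacy is what limits the audit to this module rather than the whole program.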