by woodruffw 1203 days ago
> Unsafe Rust is hard. A lot harder than C, this is because unsafe Rust has a lot of nuanced rules about undefined behaviour (UB) — thanks to the borrow checker — that make it easy to perniciously break things and introduce bugs.

I don't think this is correct: Rust makes writing unsafe Rust correctly more onerous than writing C, but the actual rules for undefined behavior are virtually the same as in C: if you alias where you must not, or mutate where you must not, etc., you're in exactly the same boat.

In other words: Rust makes it hard to write unsafe Rust correctly, but no harder than writing well-defined C. The only difference is that Rust raises the safety expectations by default, making unsafe Rust look more difficult than C.

4 comments

I don't agree that the rules for UB are virtually the same as in C. One example: if your unsafe Rust code modifies any memory address for which there exists a reference elsewhere, that is instantly UB. In C, that is not necessarily the case. https://www.youtube.com/watch?v=DG-VLezRkYQ has some good details on this.

Similarly, in Rust you have to be careful to never instantiate a value that is out-of-range for a given type (e.g. a bool with value > 1), even if you will never read or access that value before it is changed to something valid. In C this same concern does not exist since it is not insta-UB in the same way.
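
A minimal sketch of the usual way around this, assuming the value lives in a `MaybeUninit<bool>` until it is valid. The function name `overwrite_invalid_bool` is made up for illustration; the point is only that the invalid byte never exists as a plain `bool`.

```rust
use std::mem::MaybeUninit;

// Hypothetical helper: temporarily stores a byte that is not a valid `bool`,
// then overwrites it before the slot is ever treated as a `bool`.
fn overwrite_invalid_bool() -> bool {
    // `MaybeUninit<bool>` has no validity requirement on its contents.
    let mut slot = MaybeUninit::<bool>::uninit();
    unsafe {
        // Store the byte 2 through a `u8` pointer; no `bool` value is created here.
        slot.as_mut_ptr().cast::<u8>().write(2);
        // Replace it with a valid value before asserting initialization.
        slot.as_mut_ptr().write(false);
        slot.assume_init()
    }
}

fn main() {
    println!("{}", overwrite_invalid_bool());
}
```

Doing the same with a plain `let b: bool` and a transmute from `2u8` would be the insta-UB case described above.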

This is not correct. Here's a really good video that goes into the differences: https://youtu.be/DG-VLezRkYQ
Rust's undefined behaviour rules are stricter than C's: mutating through a non-mutable reference is UB, for example. Non mutable references don't exist in C.
> Non mutable references don't exist in C.

Sure they do: C has a well-defined notion of const-correctness. If you mutate through a `const`, you're invoking undefined behavior.

Both C and C++ allow you to strip `const` from a const-qualified value or reference, but only under the condition that you don't actually modify that value.

Edit: which, in case it isn't clear, means that Rust's UB is exactly the same as C's in this case.

Actually in C++ you are allowed to strip const and modify the value as long as the original object (not necessarily object in an OO sense) isn't const [1].

[1] https://en.cppreference.com/w/cpp/language/const_cast

Yes: the implication was that the original object was `const`. If you both add and remove const, that's well-defined.

(I've yet to see a C or C++ codebase where object provenance actually guarantees this; I've seen a lot of C and C++ codebases with const-stripping induced UB.)

In Rust, stripping const is UB, even if the original location is mut.
Neither C nor Rust is uniformly stricter than the other: each allows some things the other disallows. It's not that simple.
Rust has raw pointers and UnsafeCell, but they're quite unidiomatic compared to the C/C++ equivalent. A lot of Rust library code only takes safe references so it's hard to use from an unsafe context.
I would think that modifying a const object through a pointer, e.g.

    const int x = 123;
    const int *px = &x;
    (*(int*)px) = 456;
is very, very UB in C (and will most likely crash on most platforms).

    int x = 123;
    const int *px = &x;
    (*(int*)px) = 456;
is legal in C. The Rust equivalent using & and &mut is UB. Writing this in Rust using raw pointers requires unsafe blocks everywhere, loses method syntax, has no -> operator, etc.
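
For comparison, a rough sketch of how the second C snippet might look with raw pointers in Rust. The function name `write_through_const_ptr` is made up for illustration, and this assumes the pointer is derived from a mutable place via `addr_of_mut!` rather than from a `&` reference, which as far as I understand is what keeps the write allowed.

```rust
// Hypothetical Rust counterpart of the C snippet that writes through a
// pointer-to-const whose target was not declared const.
fn write_through_const_ptr() -> i32 {
    let mut x: i32 = 123;
    // Take a raw pointer to the mutable place, then view it as `*const`.
    // No `&` or `&mut` reference to `x` is created along the way.
    let px: *const i32 = std::ptr::addr_of_mut!(x) as *const i32;
    unsafe {
        // Cast the constness away again and write, like `(*(int*)px) = 456;`.
        *(px as *mut i32) = 456;
    }
    x
}

fn main() {
    println!("{}", write_through_const_ptr());
}
```

If `px` were instead created from `&x`, writing through it would be the UB case mentioned above.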
> the actual rules for undefined behavior are virtually the same as in C

1. Creating a mutable reference when there are other references to the same memory around, even if you don't use/deref that mutable reference, is considered UB in Rust. References there have the `dereferenceable` LLVM attribute, so the compiler is allowed to insert use/derefs at will to facilitate optimizations [0]. C's pointers are more like Rust's raw pointers: they only have to be valid upon use, not at creation.

2. References in Rust are transitive (as noted in the blogpost), so holding a mut ref to T semantically means you also hold a mut ref to all its fields/subfields. If you're doing intrusive or self-referential data structures, it often requires having UnsafeCell fields to soundly create isolated mut refs from top-level shared refs. The problem is that core, language-level traits in Rust like Iterator and Future (generated by async blocks) take mut refs, so implementing them (which is practically useful) on types with intrusive fields that are potentially being used elsewhere is UB [1]. This doesn't exist in C, which has no `dereferenceable` and only opt-in `restrict`. It's still an unresolved issue in Rust though [2], where they had to disable LLVM annotations on problematic types/traits to avoid miscompilations [3]. Some of these footguns can be avoided by not using references and the core language traits (like the blogpost did), but they found that to not be a great programming experience.

3. Because of `dereferenceable` (again), instances of a type must be valid in-memory representations at all times, even when unused [4]. If you want invalid/uninit representations, you wrap the type in `MaybeUninit`, which is fairly unergonomic. C doesn't have this issue, as it's only UB to deref invalid pointers or branch on invalid values (same case in Rust), not to have invalid values at all.
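
To make points 1 and 2 a bit more concrete, here is a small sketch, assuming single-threaded code. The names `aliasing_raw_pointers`, `Counter` and `bump` are made up for illustration and are not from the blogpost. It shows two aliasing raw pointers being used in turn, and an `UnsafeCell` field being mutated through a shared reference.

```rust
use std::cell::UnsafeCell;

// Point 1: raw pointers may alias; they only need to be valid when used.
fn aliasing_raw_pointers() -> i32 {
    let mut value: i32 = 1;
    let p1: *mut i32 = std::ptr::addr_of_mut!(value);
    let p2: *mut i32 = p1; // a second pointer to the same memory
    unsafe {
        *p1 += 1;
        *p2 += 1;
        *p1
    }
}

// Point 2: a field wrapped in `UnsafeCell` can be mutated via `&Counter`.
struct Counter {
    hits: UnsafeCell<u32>,
}

fn bump(counter: &Counter) {
    // Sound only while no reference to the inner `u32` is live elsewhere
    // and the value is not shared across threads.
    unsafe {
        *counter.hits.get() += 1;
    }
}

fn main() {
    println!("{}", aliasing_raw_pointers());
    let counter = Counter { hits: UnsafeCell::new(0) };
    bump(&counter);
    println!("{}", counter.hits.into_inner());
}
```

Swapping `p1` and `p2` for two `&mut i32` to the same value would not compile in safe code, and forcing it through unsafe would be the UB from point 1.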

[0]: https://github.com/rust-lang/rust/issues/94133

[1]: https://gist.github.com/Darksonn/1567538f56af1a8038ecc3c664a...

[2]: https://github.com/rust-lang/rust/issues/63818

[3]: https://github.com/rust-lang/rust/pull/106180

[4]: https://doc.rust-lang.org/std/primitive.reference.html