Hacker News new | ask | show | jobs
by wallacoloo 2311 days ago
> Rust interops with C seamlessly, doesn't it?

From someone who works in a mixed C + Rust codebase daily (Something like 2-3M lines of C and 100k lines of Rust), yes and no. They're pretty much ABI compatible, so it's trivial to make calls across the FFI boundary. But each language has its own set of different guarantees it provides and assumes, so it's easy to violate one of those guarantees when crossing a FFI boundary and triggering UB which can stay hidden for months.

One of them is mutability: in C we have some objects which are internally synchronized. If you call an operation on them, either it operates atomically, or it takes a lock, does the operation, and then releases the lock. In Rust, this is termed "interior mutability" and as such these operations would take non-mutable references. But when you actually try that, and make a non-mutable variable in Rust which holds onto this C type, and start calling C methods on it, you run into UB even though it seems like you're using the "right" mutability concepts in each language. On the rust side, you need to encase the C struct inside of a UnsafeCell before calling any methods on it, which becomes not really possible if that synchronized C struct is a member of another C struct. [1]

Another one, although it depends on how exactly you've chosen to implement slices in C since they aren't native: in our C code we pass around buffer slices as (pointer, len) pairs. That looks just like a &[T] slice to Rust. So we convert those types when we cross the FFI boundary. Only, they offer different guarantees: on the C side, the guarantee is generally that it's safe to dereference anything within bounds of the slice. On the rust side, it's that, plus the pointer must point to a valid region of memory (non-null) even if the slice is empty. It's just similar enough that it's easy to overlook and trigger UB by creating an invalid Rust slice from a (NULL, 0) slice in C (which might be more common than you think because so many things are default-initialized. a vector type which isn't populated with data might naturally have cap=0, size=0, buf=NULL).

So yeah, in theory C + Rust get along well and in practice you're good 99+% of the time. But there are enough subtleties that if you're working on something mission critical you gotta be real careful when mixing the languages.

[1] https://www.reddit.com/r/rust/comments/f3ekb8/some_nuances_o...

1 comments

> On the rust side, it's that, plus the pointer must point to a valid region of memory (non-null) even if the slice is empty.

Do you have a citation for that, because it seems obviously wrong[0] (since the slice points to zero bytes of memory) and I'm having trouble coming up with any situation that would justify it (except possibly using a NULL pointer to indicate the Nothing case of a Maybe<Slice> datum)?

0: by which I mean that Rust is wrong to require that, not that you're wrong about what Rust requires.

Well the docs have this to say [1]:

`data` must be non-null and aligned even for zero-length slices. One reason for this is that enum layout optimizations may rely on references (including slices of any length) being aligned and non-null to distinguish them from other data. You can obtain a pointer that is usable as data for zero-length slices using NonNull::dangling().

So yes, this requirement allows optimizations like having Option<&[T]> be the same size as &[T] (I just tested and this is the case today: both are the same size).

I'm not convinced that it's "wrong", though. If you want to be able to support slices of zero elements (without using an option/maybe type) you have to put something in the pointer field. C generally chooses NULL, Rust happens to choose a different value. But they're both somewhat arbitrary values. It's not immediately obvious to me that one is a better choice than the other.

[1] https://doc.rust-lang.org/std/slice/fn.from_raw_parts.html

> [1] https://doc.rust-lang.org/std/slice/fn.from_raw_parts.html

Thanks.

> having Option<&[T]> be the same size as &[T]

That is literally what I mentioned as a possible reason ("except possibly ..."), but what I overlooked was that you could take a mutable reference to the &[T] inside a Option<&[T]>, then store a valid &[T] into it - if NULL is allowed, you effectively mutated the discriminant of a enum when you have active references to its fields, violating some aspect of type/memory safety, even I'm not sure which.

> C generally chooses NULL, Rust happens to choose a different value.

It's not about what pointer value the langauge chooses when it's asked to create a zero-length slice, it's about whether the language accepts a NULL pointer in a zero-length slice it finds lying around somewhere.