by dthul 1364 days ago
Smart pointers and other managed types are nice, but memory errors are still easy to make, e.g. pushing into a std::vector from several threads.

Well, yeah, but that's not really a memory error - it's an error of not using mt-safe data structures for mt programming. It's no different than using int vs std::atomic<int> in an mt context.

It's a shame that C++ doesn't provide mt-safe versions of STL types in the standard library, although it is trivial to wrap them yourself (1 line of code per method, using std::lock_guard).

Are all Rust data types mt-safe? Does unsafe mode provide faster unprotected versions if you need those?

> Are all Rust data types mt-safe?

Yes, but not in the sense you probably think, given your second sentence:

> Does unsafe mode provide faster unprotected versions if you need those?

Some data types can be safely shared between threads, and some cannot. Rust checks at compile time whether you are trying to use a non-thread-safe data structure from multiple threads, and if you are, it gives you an error. So in that sense, all of them are safe, yes.

You don't use unsafe to get access to non-thread-safe data structures; both kinds are available in safe code, and the compiler checks that you use them correctly.

Interesting. So I assume there are at least both basic and thread-safe versions of basic types like string/list/vector/map?

Is there any provision for building your own thread-safe types (e.g. a structure composed of other types) out of non-thread-safe types and mutexes, and if so, how does that work in terms of compile-time errors?

> I assume there are at least both basic and thread-safe versions of basic types like string/list/vector/map?

To my knowledge, no. If you want to push to a vector from multiple threads, you "wrap" the vector in a `Mutex`. The difference between C++'s std::mutex (used with std::lock_guard) and Rust's `Mutex` is that Rust's `Mutex` owns the data it protects, so the compiler refuses to let you touch that data unless you hold the lock.
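A minimal sketch of that wrapping, using only the standard library. The function name `push_from_threads` is made up for illustration; `Arc` is added here so that several threads can share ownership of the one `Mutex`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: each of `n_threads` threads pushes its index
/// into one shared vector.
fn push_from_threads(n_threads: usize) -> Vec<usize> {
    // The Mutex owns the Vec; the Arc lets several threads co-own the Mutex.
    let shared = Arc::new(Mutex::new(Vec::<usize>::new()));

    let handles: Vec<_> = (0..n_threads)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // The only way to reach the Vec is through lock().
                let mut guard = shared.lock().unwrap();
                guard.push(i);
                // The lock is released when `guard` goes out of scope.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Threads may have pushed in any order, so sort before returning.
    let mut result = shared.lock().unwrap().clone();
    result.sort();
    result
}

fn main() {
    println!("{:?}", push_from_threads(4));
}
```

There is no way to write `shared.push(i)` directly here: the `Vec` is only reachable through the guard that `lock()` returns.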

> Is there any provision ...

Yes!

1. In Rust, types can implement "traits".

2. There are two traits that control thread safety: `Send` and `Sync`. Basically, any type that "implements the trait" `Send` is safe to send across threads, and a reference to any type that implements `Sync` is safe to send across threads.

3. `Send` and `Sync` are "auto traits", which means that if you make a data structure out of parts that all implement `Send`, your data structure will also implement `Send`; the same goes for `Sync`.

4. There are a bunch of thread-safety primitives that you can use (like `Arc` (an atomically reference-counted pointer), `Mutex`, etc.) to build thread-safe data structures.
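To make points 3 and 4 concrete, here is a small sketch. `Inventory`, `assert_send` and `assert_sync` are hypothetical names invented for this example; the two `assert_*` functions do nothing at run time and exist only so that the compiler has to check the trait bound.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical type composed only of Send + Sync parts, so the compiler
// marks it Send + Sync automatically.
struct Inventory {
    name: String,
    counts: Vec<u32>,
}

impl Inventory {
    fn summary(&self) -> String {
        format!("{}: {}", self.name, self.counts.iter().sum::<u32>())
    }
}

// Compile-time probes: a call only compiles if T implements the trait.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

fn main() {
    assert_send::<Inventory>();
    assert_sync::<Inventory>();

    // Wrapping in Arc<Mutex<..>> keeps both traits and adds shared mutation.
    assert_send::<Arc<Mutex<Inventory>>>();
    assert_sync::<Arc<Mutex<Inventory>>>();

    // assert_send::<std::rc::Rc<Inventory>>(); // would not compile: Rc is not Send

    let inv = Inventory { name: "widgets".to_string(), counts: vec![1, 2, 3] };
    println!("{}", inv.summary());
}
```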

> how does that work in terms of compile-time errors?

The compiler will not let you send a type across threads if it doesn't implement `Send`, and it won't let you send a reference to a type if the type doesn't implement `Sync`!

That way, if you avoid using the `unsafe` keyword, and the compiler agrees to compile your code, you can be sure you won't have data races!
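As an illustration of that check, the sketch below contrasts `Rc` (not `Send`) with `Arc` (which is). The function name is hypothetical, and the rejected line is left commented out so the example still compiles.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: measures a string's length on another thread.
fn shared_len_across_thread(text: &str) -> usize {
    // Rc uses a non-atomic reference count, so it does not implement Send.
    let local = Rc::new(text.to_string());
    // thread::spawn(move || local.len()); // rejected at compile time: Rc<String> is not Send

    // Arc uses an atomic reference count, so Arc<String> is Send.
    let shared = Arc::new(text.to_string());
    let for_thread = Arc::clone(&shared);
    let handle = thread::spawn(move || for_thread.len());
    let len_in_thread = handle.join().unwrap();

    // Both copies hold the same text, so the lengths agree.
    assert_eq!(local.len(), len_in_thread);
    len_in_thread
}

fn main() {
    println!("{}", shared_len_across_thread("hello"));
}
```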

There are mutexes & other synchronization primitives. The stdlib doesn't ship a thread-safe Vec or HashMap, but there are many on offer from the community. Rust's stdlib is small; lots of functionality you might expect is in community-provided libraries (at least presently - I imagine thread-safe collections like that will be added in the future, when the community comes to a consensus about how they should look). This works out better than you might expect, and avoids hazards that, say, Python encounters, where parts of the stdlib are so bad there is no safe use case (looking at you, `csv.Sniffer`).

In Rust, there are traits (you might know them as interfaces or protocols from other languages) called Send and Sync which tell the language whether or not something can be sent to a different thread, or shared between threads.

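Vec<T> is Send (as long as T is) - you can pass ownership of it to a different thread, surrendering your access to it in the process. It is also Sync when T is, but that only lets threads share read-only references to it; two threads can't both mutate it, because mutation requires an exclusive `&mut` reference. Almost everything is Send; the exceptions are things like Rc<T>, whose reference count is not atomic, and raw pointers.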
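The hand-off of ownership described above can be sketched like this (the function name `sum_on_other_thread` is made up for the example):

```rust
use std::thread;

/// Hypothetical helper: gives the vector away to a new thread, which sums it.
fn sum_on_other_thread(numbers: Vec<i32>) -> i32 {
    // `move` transfers ownership of `numbers` into the new thread.
    let handle = thread::spawn(move || numbers.iter().sum::<i32>());

    // numbers.len(); // would not compile: `numbers` was moved into the closure

    handle.join().unwrap()
}

fn main() {
    println!("{}", sum_on_other_thread(vec![1, 2, 3]));
}
```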

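Mutex<Vec<T>> is Sync (when T is Send) - you can share it between threads and still mutate the contents. Basically, if you take the lock, you get back a smart pointer (a guard) to your Vec<T>, and the lock is released when that guard is dropped.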
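A small sketch of that smart pointer, with the type written out explicitly. `append_and_len` is a hypothetical helper; `MutexGuard` is the standard library's guard type.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical helper: pushes a value under the lock and reports the new length.
fn append_and_len(shared: &Mutex<Vec<i32>>, value: i32) -> usize {
    // lock() hands back a guard that dereferences to the Vec inside.
    let mut guard: MutexGuard<'_, Vec<i32>> = shared.lock().unwrap();
    guard.push(value); // mutable access through the guard
    guard.len() // read access through the guard
    // The guard is dropped here, which unlocks the mutex.
}

fn main() {
    let shared = Mutex::new(vec![1, 2]);
    println!("{}", append_and_len(&shared, 3));
}
```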

These traits are generally implemented automatically; you can implement them on a type yourself, using unsafe, but you'd only do that if you were writing your own synchronization primitives or thread safe data structures.
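For a rough idea of what implementing one of these traits by hand looks like, here is a toy spin-lock counter. `SpinCounter` and everything in it are hypothetical and written only for this example; it is a sketch of the pattern, not something to use in place of `Mutex` (for one thing, a panic inside the closure would leave the lock held).

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical toy primitive: a counter guarded by a spin lock.
struct SpinCounter {
    locked: AtomicBool,
    value: UnsafeCell<u64>,
}

// UnsafeCell<u64> is not Sync, so SpinCounter is not Sync automatically.
// SAFETY (our own promise, unchecked by the compiler): `value` is only
// touched inside `with_lock`, while `locked` is held.
unsafe impl Sync for SpinCounter {}

impl SpinCounter {
    fn new() -> Self {
        SpinCounter { locked: AtomicBool::new(false), value: UnsafeCell::new(0) }
    }

    fn with_lock<R>(&self, f: impl FnOnce(&mut u64) -> R) -> R {
        // Spin until we flip `locked` from false to true.
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        // We hold the lock, so no other thread is accessing `value`.
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }

    fn increment(&self) {
        self.with_lock(|v| *v += 1);
    }

    fn get(&self) -> u64 {
        self.with_lock(|v| *v)
    }
}

fn count_with_threads(n_threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(SpinCounter::new());
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    counter.increment();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.get()
}

fn main() {
    println!("{}", count_with_threads(4, 1000));
}
```

Without the `unsafe impl Sync` line, `Arc<SpinCounter>` could not be moved into `thread::spawn`; with it, the responsibility for correctness shifts from the compiler to whoever wrote the `unsafe` block.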

So the compiler is able to infer which types are and are not thread safe, and what flavor of thread safety each one has. It's then able to use that to check for thread safety violations at compile time.

There's more to it than that; in particular, it's possible to share a read-only reference to a Vec<T> between threads (with a compile-time guarantee that no one has a mutable reference to it), but I'd refer you to the book[1] or to other articles on HN if you wanted the specifics.

[1] https://doc.rust-lang.org/stable/book/
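The read-only sharing mentioned above can be sketched with scoped threads (`std::thread::scope`, available since Rust 1.63). `parallel_sum` is a hypothetical name; the point is that two threads read the same `Vec` through a plain shared reference, with no `Mutex` or `Arc` involved.

```rust
use std::thread;

/// Hypothetical helper: two threads each sum half of the same vector.
fn parallel_sum(data: Vec<i32>) -> i32 {
    let mid = data.len() / 2;
    // A shared, read-only reference; nobody can hold a `&mut` while it is in use.
    let shared: &Vec<i32> = &data;

    // Scoped threads are guaranteed to finish before `scope` returns,
    // so they are allowed to borrow from the enclosing function.
    thread::scope(|s| {
        let left = s.spawn(|| shared[..mid].iter().sum::<i32>());
        let right = s.spawn(|| shared[mid..].iter().sum::<i32>());
        left.join().unwrap() + right.join().unwrap()
    })
}

fn main() {
    println!("{}", parallel_sum(vec![1, 2, 3, 4, 5]));
}
```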