Hacker News
by pcwalton 3489 days ago
Rayon is definitely the best parallelism library I've ever used. We recently switched Servo over to using it for parallel restyling and layout and saw small performance gains over our previous solution, along with a drastic reduction in code complexity (we also removed a whole pile of domain-specific unsafe code).

Being able to switch .iter() to .par_iter() and have things "just work" is a game changer.

The crucial thing about rayon is that sequential fallback is really fast, almost as fast as the sequential code you'd write anyway. This is important because, as paradoxical as it sounds, most CPU-bound programs work with small workloads most of the time, and so they don't want the overhead of parallelism for those cases. (It's the analogue of saving power by putting the CPU to sleep when it's not in use.) The occasional big workload that comes along is what you really want parallelism for, and the big trick is to handle that case without regressing the common sequential case. Rayon's work stealing approach based around scoped iterators is the ideal solution for this.
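To make the sequential-fallback point concrete, here is a minimal sketch using only the standard library. It is not Rayon's implementation: it spawns scoped OS threads rather than using a work-stealing pool, and the `SEQUENTIAL_CUTOFF` constant and `sum_of_squares` function are hypothetical names chosen for illustration. The idea it shows is just that small inputs take the plain sequential path and only large ones pay for parallelism.

```rust
use std::thread;

// Hypothetical cutoff (an assumption for this sketch): slices at or below
// this length are processed sequentially, with no threads spawned.
const SEQUENTIAL_CUTOFF: usize = 4096;

/// Sums the squares of `data`, splitting large inputs across scoped threads.
fn sum_of_squares(data: &[u64]) -> u64 {
    if data.len() <= SEQUENTIAL_CUTOFF {
        // Small workload: the same loop you would write without parallelism.
        return data.iter().map(|&x| x * x).sum();
    }

    // Large workload: split in half, run the left half on a new scoped
    // thread and the right half on the current thread, then combine.
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let left_handle = s.spawn(|| sum_of_squares(left));
        let right_sum = sum_of_squares(right);
        left_handle.join().expect("worker thread panicked") + right_sum
    })
}

fn main() {
    let small: Vec<u64> = (1..=10).collect();
    let large: Vec<u64> = (1..=100_000).collect();
    println!("small: {}", sum_of_squares(&small));
    println!("large: {}", sum_of_squares(&large));
}
```

A real work-stealing scheduler avoids the thread-spawn cost here by handing the split-off half to a worker that is already running, which is a large part of why the fallback can stay cheap.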

3 comments

> Being able to switch .iter() to .par_iter() and have things "just work" is a game changer.

It's called .parallel() in D, works the same way I guess. It turns a lazy computation chain into a parallel one.

I have played around a tiny bit with par_iter over blocking IO tasks and seen some sched_yield() loops burning CPU time instead of backing off to futex_wait. That seems suboptimal and not exactly "the best ever" I'd expect from a parallelism library.
They should be doing that for a few iterations before backing off. Otherwise you end up with bad scheduling leading to slow warmups, among other problems.

You shouldn't use rayon for blocking I/O; that's not what it's designed for. Rayon is a parallelism library, not a concurrency library.

> Rayon is definitely the best parallelism library I've ever used.

Whenever people say this, I ask the following question in order to gauge whether I want to try the library in question:

Have you used Twisted?

Twisted is, to me, the quintessential example of a high-quality open source project. If you have used it extensively and still recommend Rayon, I'll give it a try.

Assuming you're talking about the Python library, it is aimed more at asynchronous I/O and other "concurrency" concerns than at the data parallelism that rayon is designed for.
Yeah, twisted is closer to tokio than rayon.
Hey Steve. We met at TwilioCon 2011 - not sure if you remember that. How have you been?

Is there a good guide for someone in my position? i.e., to learn about tokio and rayon, having used Python (for, in this case, concurrency and data science respectively)?

Are you mostly using Rust these days?

Oh hey! That was a very long time ago, but I loved TwilioCon. Things are good. I'm actually working on Rust full-time, so yeah, I use it a lot. :)

I'm not sure there's a great guide yet, because a lot of this stuff is still shaking out. The Rust ecosystem in general is growing at a pretty steady clip, and new stuff pops up all the time: tokio is less than a year old, for example.

There are two different kinds of problems here: "I found a library, what does it do?" and "What libraries exist?" In the former case, you're at the mercy of the library author to give you a good description. With the latter, one of the better ways is to drop by #rust on IRC, or post to users.rust-lang.org, asking for an overview of what exists. https://crates.io/search is also helpful.

In this case, rayon is for "data parallelism", meaning "I have some data, I would like to do some work on it, and I'd like to make that parallel." Tokio is about asynchronous I/O.

Twisted is not for parallelism. Twisted is for concurrency.