by fyi1183 2916 days ago
I suspect that something like a heap implementation could use this. For concurrency, you want different cores to use different pools to avoid atomics. In practice, this means per-thread pools are used today, but this rseq feature seems like it would allow using per-core pools instead. That would save memory and probably be even better for cache locality when a core is shared by multiple threads.

I use higher-level APIs built on top of restartable sequences. Here's my understanding (could be wrong):

> I suspect that something like a heap implementation could use this.

Indeed. Let's say you want to have lots and lots and lots of threads, as described in the video schmichael linked. [0] Per-thread malloc pools become less attractive:

* too empty (lots of contention for the global pool), or
* too full (lots of wasted RAM, probably poor CPU cache utilization as well), or
* lots of sloshing

More generally, people sometimes do per-thread stuff to avoid lock contention. Some types of state might be reasonable to keep per-thread when the program is written in a thread-per-core / async style but might not be if it's written in a thread-per-request / sync style, where per-thread state might use too much RAM. And if you ever have to access _all_ the threads' state (say, if you are doing some counters for a monitoring system: increment just the current thread's state on write; sum them on read), that path might get ridiculous. So per-CPU might work better.

Per-CPU stuff doesn't require restartable sequences. You can just use the CPU number to decide which shard to access, then lock it or use atomics as you would with global state. You get less lock contention and cache-line bouncing. (Alternatively, you might get some of these benefits by picking a shard randomly, if the RNG is cheap enough. Or by using a counter.)
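
To make that concrete, here's a rough sketch of the monitoring-counter case sharded by CPU number, with no restartable sequences involved. It assumes Linux with glibc's sched_getcpu(); NSHARDS, the 64-byte cache line size, and the names counter_inc / counter_read are all made up for illustration. The thread can migrate between reading the CPU number and touching the shard, so the increment still has to be atomic:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shard count; real code might size this from the CPU count. */
#define NSHARDS 64

/* One counter per shard, padded to an assumed 64-byte cache line so that
 * different CPUs do not bounce the same line back and forth. */
struct shard {
    _Alignas(64) atomic_uint_fast64_t count;
};
static_assert(sizeof(struct shard) == 64, "one shard per cache line");

static struct shard shards[NSHARDS];

/* Write path: bump only the shard for the CPU we are (probably) on.
 * The CPU number is just a hint, since the thread may be migrated right
 * after sched_getcpu() returns, which is why the add is still atomic. */
static void counter_inc(void)
{
    int cpu = sched_getcpu();
    if (cpu < 0)
        cpu = 0; /* fall back to shard 0 if the call fails */
    atomic_fetch_add_explicit(&shards[(unsigned)cpu % NSHARDS].count, 1,
                              memory_order_relaxed);
}

/* Read path: sum every shard. The cost is bounded by NSHARDS rather than
 * by the number of threads. */
static uint64_t counter_read(void)
{
    uint64_t sum = 0;
    for (int i = 0; i < NSHARDS; i++)
        sum += atomic_load_explicit(&shards[i].count, memory_order_relaxed);
    return sum;
}
```

The atomic add is usually uncontended here, so it's cheaper than hammering one global counter, but it's still an atomic.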

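Restartable sequences let you entirely avoid atomic operations for per-CPU stuff: the kernel aborts and restarts a sequence that gets interrupted by preemption or migration, so plain loads and stores are enough.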

[0] https://www.youtube.com/watch?v=KXuZi9aeGTw