by viega 306 days ago

The kernel is responsible for maintaining the wait queues, and for making sure there is no race condition on the state check that should preclude queueing.

It does not care how you use the queue, at all. It doesn't have to be done with a locking primitive, whatsoever. You absolutely can use the exact same mechanism to implement a thread pool with a set of dormant threads, for instance.

The state check in the basic futex is only done to avoid a race condition. None of the logic of preventing threads from entering critical sections is in the purview of the kernel, either. That's all application-level.

And most importantly, no real lock uses a futex for the locking parts. As mentioned in the article, typically a mutex will directly try to acquire the lock with an atomic operation, like an atomic fetch-and-or, fetch-and-add, or even compare-and-swap.
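To make the fast path concrete, here is a minimal sketch in C11 of acquiring a lock with a single compare-and-swap and no system call. The names `sketch_lock`, `sketch_try_lock`, and `sketch_unlock` are hypothetical, and the encoding (0 = unlocked, 1 = locked) is an assumption for illustration, not taken from any particular mutex implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word. Assumed encoding: 0 = unlocked, 1 = locked. */
typedef struct {
    atomic_uint state;
} sketch_lock;

/* Fast path: one atomic compare-and-swap, no trip into the kernel.
 * Returns true if the caller now owns the lock. */
static bool sketch_try_lock(sketch_lock *l) {
    unsigned expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->state, &expected, 1u,
        memory_order_acquire,   /* success: later reads see the previous owner's writes */
        memory_order_relaxed);  /* failure: nothing was acquired */
}

/* Release the lock. A real mutex would also decide here whether
 * anyone needs waking; this sketch has no waiters at all. */
static void sketch_unlock(sketch_lock *l) {
    atomic_store_explicit(&l->state, 0u, memory_order_release);
}
```

Only when `sketch_try_lock` fails would a real mutex consider going to the kernel to wait; the decision about who holds the lock never leaves userspace.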

A single atomic op, even if you go for full sequential consistency (which comes with full pipeline stalls), is still a lot better than a trip into the kernel when you can avoid it.

Once again, I'm not saying you couldn't use the futex state check to decide what's locked and what's not. I'm saying nobody should, and it was never the intent.

The intent from the beginning was to separate out the locking from the waiting, and I think that's pretty clear in the original futex paper (linked to in my article).

2 comments

senderista 306 days ago

I like to think of a futex as the simplest possible condition variable, where the predicate is just the state of the memory word (note that a mutex guarding the predicate is unnecessary since the word can be read and written atomically). It turns out that this is simple enough to implement efficiently in the kernel, yet expressive enough to implement pretty much any userspace synchronization primitive over it.
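A rough sketch of that view, assuming Linux and the raw `futex` system call (glibc does not export a wrapper for it): the predicate is simply "the word is nonzero". The helper names `sketch_futex_wait`, `sketch_futex_wake`, `wait_until_set`, and `set_and_wake_all` are hypothetical, and the code assumes `_Atomic uint32_t` has the same size and alignment as the plain 32-bit word the kernel expects.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sleep only if *addr still equals `expected`; the kernel performs that
 * check atomically with queueing, which is what closes the race. */
static long sketch_futex_wait(_Atomic uint32_t *addr, uint32_t expected) {
    return syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
}

/* Wake up to `n` threads queued on addr; returns how many were woken. */
static long sketch_futex_wake(_Atomic uint32_t *addr, int n) {
    return syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, n, NULL, NULL, 0);
}

/* "Condition variable" whose predicate is: *word != 0.
 * The loop re-checks the word because wakeups can be spurious. */
static void wait_until_set(_Atomic uint32_t *word) {
    while (atomic_load_explicit(word, memory_order_acquire) == 0) {
        sketch_futex_wait(word, 0);
    }
}

/* Make the predicate true first, then wake everyone who queued earlier. */
static void set_and_wake_all(_Atomic uint32_t *word) {
    atomic_store_explicit(word, 1, memory_order_release);
    sketch_futex_wake(word, INT_MAX);
}
```

Nothing here is a lock: the same two calls could park idle worker threads until a flag says there is work to do.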


gpderetta 306 days ago

You are of course completely right. In fact, sometimes I wish that the kernel would do slightly more with the memory location, like optionally reserving a bit to show the empty/non-empty state of the queue: the kernel should be able to keep it up to date cheaply as part of the wait/wake operations, while it is more complicated for userspace.
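For a sense of the userspace bookkeeping involved, here is a sketch of the widely used three-state mutex, assuming Linux and the raw `futex` system call. The state value 2 is userspace's conservative guess that the queue may be non-empty. All names (`sketch_mutex`, `sketch_mutex_lock`, `sketch_mutex_unlock`, and the two wrappers) are hypothetical, and this is an illustration of the pattern rather than any library's actual code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed encoding: 0 = unlocked, 1 = locked with no waiters,
 * 2 = locked and there may be waiters. */
typedef struct {
    _Atomic uint32_t state;
} sketch_mutex;

static long sketch_futex_wait(_Atomic uint32_t *addr, uint32_t expected) {
    return syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected, NULL, NULL, 0);
}

static long sketch_futex_wake(_Atomic uint32_t *addr, int n) {
    return syscall(SYS_futex, addr, FUTEX_WAKE_PRIVATE, n, NULL, NULL, 0);
}

static void sketch_mutex_lock(sketch_mutex *m) {
    uint32_t c = 0;
    /* Fast path: uncontended 0 -> 1, no system call. */
    if (atomic_compare_exchange_strong_explicit(
            &m->state, &c, 1, memory_order_acquire, memory_order_relaxed)) {
        return;
    }
    /* Slow path: advertise possible waiters by forcing the state to 2.
     * If the exchange returns 0, the lock was free and we now own it. */
    if (c != 2) {
        c = atomic_exchange_explicit(&m->state, 2, memory_order_acquire);
    }
    while (c != 0) {
        sketch_futex_wait(&m->state, 2);  /* sleeps only while state is still 2 */
        c = atomic_exchange_explicit(&m->state, 2, memory_order_acquire);
    }
}

static void sketch_mutex_unlock(sketch_mutex *m) {
    /* Enter the kernel only if someone may be queued. */
    if (atomic_exchange_explicit(&m->state, 0, memory_order_release) == 2) {
        sketch_futex_wake(&m->state, 1);
    }
}
```

Because a woken thread cannot tell whether others are still queued, it always re-locks with 2, so the last contended unlock can issue a wake that finds nobody waiting. A kernel-maintained "queue non-empty" bit would make that guess exact.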
