Hacker News
by brandmeyer 2358 days ago
> Not sure but there may be two bugs in the code. The use of notify on the condition outside of the mutex lock.

That's not a bug. The notifier can either signal the condition variable with or without the corresponding lock held and both cases are race-free. In some implementations, it may be more performant to signal the condvar with the lock still held, while in others it is more performant to signal the condvar after releasing the lock.
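To make the two options concrete, here is a minimal C++ sketch, not the code from the article under discussion. The `Mailbox` struct and the functions `post_locked`, `post_unlocked`, `wait_ready`, and `run_once` are hypothetical names made up for illustration. The only thing both variants rely on is that the shared flag is written under the mutex and that the waiter re-checks it under the same mutex.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state, invented for this illustration.
struct Mailbox {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
};

// Variant A: signal while the lock is still held.
void post_locked(Mailbox& box) {
    std::lock_guard<std::mutex> lk(box.m);
    box.ready = true;
    box.cv.notify_one();
}

// Variant B: release the lock first, then signal.
void post_unlocked(Mailbox& box) {
    {
        std::lock_guard<std::mutex> lk(box.m);
        box.ready = true;  // the flag itself is still written under the lock
    }
    box.cv.notify_one();
}

// The waiter checks the predicate under the lock before and after every
// wakeup, so it cannot miss the update in either variant.
void wait_ready(Mailbox& box) {
    std::unique_lock<std::mutex> lk(box.m);
    box.cv.wait(lk, [&] { return box.ready; });
}

// Runs one waiter against the given poster and reports whether the
// waiter observed the flag and returned.
template <typename Post>
bool run_once(Post post) {
    Mailbox box;
    bool seen = false;
    std::thread waiter([&] {
        wait_ready(box);
        seen = true;
    });
    post(box);
    waiter.join();  // join orders the write to `seen` before the read below
    return seen;
}
```

Both `run_once(post_locked)` and `run_once(post_unlocked)` should complete and return `true`. Which variant is cheaper depends on the underlying system library, so the sketch says nothing about performance.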

In this discussion, "implementations" doesn't refer to Boost, libstdc++, or LLVM's glue library; it refers to the underlying system libraries.

Alas, most of the implementations aren't willing to tell you which one is better.

I understand the case where it's more performant to notify outside of the lock, so that the notified thread doesn't wake up and immediately block again waiting for the notifying thread to release the lock.

What would the case where it's more performant to notify while holding the lock look like? Is the underlying implementation somehow able to transfer ownership of the mutex?

I must admit that I could be mistaken. There was a Stack Overflow question about this very topic in which an NPTL implementer spoke up and claimed the following (any errors in this are the fault of my own recollection):

> There is at least one implementation that takes advantage of the requirement that a condition variable be associated with a unique lock to make fewer futex(2) calls, such that the signaling and woken thread make exactly one system call each. NPTL does this.

However, I cannot find that Q&A pairing anymore. It is certainly the case on the RTOS I'm using right now that it is more efficient to signal outside the lock and rely on the fact that uncontended locking and unlocking operations don't make passes through the scheduler.

You are thinking of the FUTEX_*_REQUEUE operations (see the futex(2) man page for details). They move (some of) the waiters from the condvar futex to the mutex futex. IIRC, the optimization is called wait morphing.

IIRC it is so hard to get it right in practice (the number of races and corner cases is staggering) that recent libc versions might have stopped doing it. I might be misremembering though.

I read somewhere that pthreads can transfer ownership.