> Not sure but there may be two bugs in the code. The use of notify on the condition outside of the mutex lock.
That's not a bug. The notifier can either signal the condition variable with or without the corresponding lock held and both cases are race-free. In some implementations, it may be more performant to signal the condvar with the lock still held, while in others it is more performant to signal the condvar after releasing the lock.
In this discussion "implementations" isn't referring to boost, libstdc++, or LLVM's glue library, "implementations" is referring to the underlying system libraries.
Alas, most of the implementations aren't willing to tell you which one is better.
I understand the case where it's more performant to notify outside of the lock, so that the notified thread doesn't wake up and immediately block again waiting for the notifying thread to release the lock.
What would the case where it's more performant to notify while holding the lock look like? Is the underlying implementation somehow able to transfer ownership of the mutex?
I must admit that I could be mistaken. There was a stack overflow question about this very topic wherein an NPTL implementer spoke up and claimed the following (any errors in this are the fault of my own recollection):
> There is at least one implementation that takes advantage of the requirement that a condition variable be associated with a unique lock to make fewer futex(2) calls, such that the signaling and woken thread make exactly one system call each. NPTL does this.
However, I cannot find that Q&A pairing any more. It is certainly the case on the RTOS I'm using right now that it is more optimal to signal outside the lock and rely on the fact that uncontended locking and unlocking operations don't make passes through the scheduler.
You are thinking of FUTEX_*_REQUEUE (see the man page for details). It will move (some of) the waiters from the condvar futex to the mutex futex. IIRC, the optimization is called wait morphing.
IIRC it is so hard to get it right in practice (the number of races and corner cases is staggering) that recent libc versions might have stopped doing it. I might be misremembering though.
Per [1] the lock does not need to be held when calling notify. In the 17 standard this is §33.5 but I can't quite parse out the real requirements there.
In this case I think it should be ok because !this->tasks.empty() is part of the predicate, and will be true after a new item is enqueued (and before notify is called). So if the producer is interrupted between releasing the mutex and calling notify, the consumer would not call wait.
But yeah, in general I agree with the advice.. I'm pretty sure I've made that mistake before.
It depends on if the predicate can (also) be changed without the lock being held, as otherwise the release/acquire ordering semantics are not guaranteed. So long as items are only every enqueued with the lock held, it should be ok.
"Even if the shared variable is atomic, it must be modified under the mutex in order to correctly publish the modification to the waiting thread."
That's not normative but I can't be bothered to check the c++ standard for exact wordings. Also SuSv4 doesn't have an explicit prohibition on modifying the predicate outside a critical section. But if you do, you are truly on your own.
It is because condition variables have no state, i.e. calling signal without waiters is equivalent to doing nothing and does not affect a future wait. The concern is that a thread is switched out after it checks the condition but before it calls wait, and in that moment the condition variable is signaled. The thread then calls wait and is not awoken (potentially never to be awoken). If the predicate is not altered without the mutex and the waiting thread has the mutex at the time it checks the predicate (this latter part is a must in all cases), then it is guaranteed the predicate will remain unmet at least until the thread has switched to a waiting state.
So simply using an atomic (even with a full release-acquire memory barrier) is not necessarily sufficient as that would only enforce the coherence but not prevent the scheduler from interrupting at the wrong moment.
> its usually a bug to not hold the lock before notify()
All correct non-realtime uses uses of condition variables should work fine with notify called outside of the critical section.
The only case it matters is when using strict realtime scheduling (SCHED_FIFO and SCHED_RR) and it is important that higher priority threads are woken up before lower priority ones.
The exception I guess is when paired with a condition variable, since CV waits are not guaranteed spurious and necessarily must be combined with another condition. The caveat being that the variable being tested upon wake must not be modified without the lock held (presumably to prevent reordering).
The C++ spec simplifies what is basically an abstraction over the posix layer. I dug into this very deeply a long time ago, there’s a lot nuance there and documentation is confusing: https://github.com/neosmart/pevents/issues/2
That's not a bug. The notifier can either signal the condition variable with or without the corresponding lock held and both cases are race-free. In some implementations, it may be more performant to signal the condvar with the lock still held, while in others it is more performant to signal the condvar after releasing the lock.
In this discussion "implementations" isn't referring to boost, libstdc++, or LLVM's glue library, "implementations" is referring to the underlying system libraries.
Alas, most of the implementations aren't willing to tell you which one is better.