| Great write-up! A few notes worth mentioning (IMO) below: Garbage collection:
This is only true in absence of garbage collection. If you're paying garbage collection cost to begin with, this is not an issue. Also, note things such as http://web.mit.edu/willtor/www/res/threadscan-spaa-15.pdf This only applies to dynamic lock-free data structures:
Or, data structures requiring memory allocation. If you're using bounded buffers and don't require memory allocation, this isn't an issue. Taxonomy:
Not all passive schemes are quiescent-state-based. In QSBR, quiescent points are a first-class entity while that is not the case in EBR (you have only 3 distinct states). Absent extensions you are unable to differentiate one quiescent point from another, which has real implications on being able to implement things like deferred / non-blocking reclamation efficiently. There are some advantages to this, a while ago Paul Khuong contributed a reference counting scheme to epoch reclamation allowing for overlapping protected sections (bounded epoch makes reference counting practical, something you can't do with unbounded quiescent points). It's pretty great and note, it doesn't require a strict notion of quiescence! You'll find this in Concurrency Kit. It is also worth noting the HP, etc... (pointer-based schemes) can be used to implement QSBR. For these reasons, I prefer to classify these techniques into two primary families (based on the semantics of the interface rather than the implementation): "passive schemes" and "active schemes". Passive schemes do not require cooperation from the algorithms requiring safe memory reclamation (EBR, QSBR, etc...) while "active schemes" do (HP, PTB, etc... in their textbook form require modification to the algorithm). On blocking for passive schemes:
It is worth noting that QSBR, EBR and other "passive schemes" do have "non-blocking" (informally) interfaces (rcu_call, text-book EBR utilizes limbo lists, etc...) such that writers do not have to block on the fast path, but of course, there is the cost of memory accumulating until a grace period is detected (so, not non-blocking in the formal sense but if you've the memory to spare and sufficient forward progress, becomes a non-issue). In my implementations, I typically use a dedicated thread for forward progress / epoch detection, ensuring the writer has forward progress. You have much higher granularity and the lock-free progress guarantee in hazard pointers, because it is pointer-based (tied to the forward progress guarantees of the underlying algorithm). Under-appreciated recent development:
An interesting thing to note here is there are schemes such as https://www2.seas.gwu.edu/~seotest/publications/eurosys16ps.... which do not necessitate the same heavy-weight barriers in the presence of invariant TSC+TSO and with a sufficiently high granularity TSC, can provide the same forward progress guarantees as hazard pointers. On hazard pointers being slow:
One interesting thing to note is hazard pointers can also be used to implement "passive" schemes such as proxy collection, to give similar performance properties as EBR and QSBR. On real world implementations:
It is worth mentioning http://concurrencykit.org :-P |