| > What I am wondering is what benefit does it bring in practice. Single-threaded program with shared-ptr's using atomics vs shared-ptr's using WORDs seem like a non-problem to me - e.g. I doubt it has a measurable performance impact.

I mean, the blog post basically starts with an example where the performance impact is noticeable:

> I found that my Rust port of an immutable RB tree insertion was significantly slower than the C++ one.

And:

> I just referenced pthread_create in the program and the reference count became atomic again.

> Although uninteresting to the topic of the blog post, after the modifications, both programs performed very similarly in the benchmarks.

So in principle an insert-heavy workload for that data structure could see a noticeable performance impact.

> Atomics are slowing down the program only when it comes to contention, and single-threaded programs can't have them.

Not entirely sure I'd agree? My impression is that while uncontended atomics are not too expensive, they aren't exactly free compared to the corresponding non-atomic instruction. For example, Agner Fog's instruction tables [0] state:

> Instructions with a LOCK prefix have a long latency that depends on cache organization and possibly RAM speed. If there are multiple processors or cores or direct memory access (DMA) devices, then all locked instructions will lock a cache line for exclusive access, which may involve RAM access. A LOCK prefix typically costs more than a hundred clock cycles, even on single-processor systems. This also applies to the XCHG instruction with a memory operand.

And there's this blog post [1], which compares the performance of various concurrency mechanisms/implementations, including uncontended atomics and "plain" code, and shows that uncontended atomics are still slower than non-atomic operations (~3.5x if I'm reading the raw data table correctly). So if the atomic instruction is in a hot loop, then I think it's quite plausible that it'll be noticeable.
[0]: https://www.agner.org/optimize/instruction_tables.pdf

[1]: https://travisdowns.github.io/blog/2020/07/06/concurrency-co... |
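If anyone wants to poke at the uncontended case themselves, here is a rough single-threaded sketch in Rust. It is not a rigorous benchmark and it is not the code from either blog post. It just clones and drops an `Rc` (plain increment/decrement) and an `Arc` (atomic increment/decrement) in a hot loop on one thread, so there is no contention at all. The names `churn_rc`, `churn_arc` and `ITERS` are made up for illustration, and the timings will depend on your CPU, compiler version and optimization flags.

```rust
use std::hint::black_box;
use std::rc::Rc;
use std::sync::Arc;
use std::time::{Duration, Instant};

// Hypothetical iteration count, chosen only so the loop takes a measurable time.
const ITERS: usize = 10_000_000;

/// Clone and immediately drop an `Rc` `iters` times.
/// Each clone/drop pair is a non-atomic increment and decrement of the count.
/// Returns the final strong count and the elapsed time.
fn churn_rc(iters: usize) -> (usize, Duration) {
    let shared = Rc::new(0u64);
    let start = Instant::now();
    for _ in 0..iters {
        // black_box is a hint that discourages the optimizer from removing the clone.
        let tmp = black_box(Rc::clone(&shared));
        drop(tmp);
    }
    let elapsed = start.elapsed();
    (Rc::strong_count(&shared), elapsed)
}

/// Same loop with `Arc`, whose clone/drop use atomic read-modify-write operations
/// even though only one thread ever touches the counter.
fn churn_arc(iters: usize) -> (usize, Duration) {
    let shared = Arc::new(0u64);
    let start = Instant::now();
    for _ in 0..iters {
        let tmp = black_box(Arc::clone(&shared));
        drop(tmp);
    }
    let elapsed = start.elapsed();
    (Arc::strong_count(&shared), elapsed)
}

fn main() {
    let (rc_count, rc_time) = churn_rc(ITERS);
    let (arc_count, arc_time) = churn_arc(ITERS);
    println!("Rc:  {:?} (final strong count {})", rc_time, rc_count);
    println!("Arc: {:?} (final strong count {})", arc_time, arc_count);
}
```

Build with `--release` if you try it, and treat the numbers as a sanity check only. A loop this tight says nothing about how much the refcount traffic matters inside a real workload like the RB tree insertion.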