|
> That is a good point: the x86 model is "stronger" than acquire-release. Which is probably why it took so long for acquire-release to become beneficial. Any x86 coder who codes in acquire-release will not see any performance benefit on x86, because x86 implements "stronger guarantees" at the hardware level. Yes, in a way it's a race to the bottom; code that works on TSO hw works on acquire-release hw, but not the other way around. There's only two ways to combat this race: education, and using concurrency libraries written by people who know what they're doing. > acquire-release is becoming far more popular and is the golden-standard that C++11 has more or less settled upon Hmm, how come? C++11 supports many different models, relaxed, acquire/release, and sequential consistency, with sequential consistency being the default for atomic variables. Now, acquire/release looks like a decent compromise between ease of hw implementation and programming complexity, but AFAICS it's not the anointed one true model. To some extent I think that's a failing of the C++11 model. Instead of choosing one (sane) model, they made people choose between an array of models with subtle semantics. That's what the recent formal Linux kernel model did, although that's not ideal either, with the requirement to not be too different from the previous informal description and boatloads of legacy code. See http://www0.cs.ucl.ac.uk/staff/j.alglave/papers/asplos18.pdf In general, it seems to me that progress is being made in formal memory models, and I hope that in some years time there will be some kind of synthesis giving us a model that is both reasonably easy to implement in hw with good performance, easy enough to reason about, as well as formally provable. We'll see. |
Well, nothing will ever be "officially" blessed as the one true model. As the saying goes: we programmers are like cats, we all will be moving off in our own direction, doing our own thing.
Overall, I just think that "programmer culture" is beginning to settle down on Acquire-release semantics. Its just a hunch... but more-and-more languages (C++, CUDA), and systems (ARM, POWER, NVidia GPUs, AMD GPUs) seem to be moving towards Acquire-release.
And in the next few years, we'll have cache-coherency over PCIe 4.0 or PCIe 5.0 in some form (CXL or other protocols on top of it). A unified memory model across CPU, DDR4 RAM, the PCIe-bus, and co-processors (GPUs, FPGAs, or Tensor cores), and high-speed storage (Optane and Flash SSDs over NVMe) is needed.
The community is just a few years out from having a unified memory model + coherent caches across the I/O fabric. Once this "defacto standard" is written, it will be very hard for it to change. That's why I think acquire-release is here to stay for the long term. Its the leading memory model right now.