by electricshampo1 1812 days ago
"Java and JavaScript have avoided introducing weak (acquire/release) synchronizing atomics, which seem tailored for x86."

This is not true for Java; see

http://gee.cs.oswego.edu/dl/html/j9mm.html

https://docs.oracle.com/en/java/javase/16/docs/api/java.base...


It's not true in general. x86 CANNOT have weak acquire/release semantics. x86 is "too strong": you get total store ordering (TSO) by default.

If you want to test out weaker acquire/release semantics, you need to buy an ARM or POWER9 processor.

ARMv7 or earlier, it appears. On ARMv8, which has direct hardware support for SC atomics, the SC atomics are the suggested implementation of acq/rel too. See the ARMv8 section of https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html.

As I mentioned in the post (https://research.swtch.com/plmm#sc), Herb Sutter claimed in 2017 that POWER was going to do something to make SC atomics cheaper. If it did, then that might end up being cheaper than the old sync-based acq/rel too, same as ARM, in which case we'd end up with SC = acq/rel on both ARM and POWER. It looks like that didn't happen, but I'd be very interested to know what did, if anything.

I would say that acquire/release map very well to x86 (where they are free). Technically x86 is slightly stronger, as it doesn't allow IRIW, but seq cst is too expensive to implement by default.

Conversely, acq/rel range from somewhat to very expensive to implement on ARM/POWER.
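To make the acquire/release pairing concrete, here is a minimal C++ sketch of the usual message-passing pattern. The function name `message_passing_once` is made up for illustration. The release store publishes `payload`, and the acquire load that observes it is guaranteed to see the payload too.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper for illustration: one producer publishes a payload,
// one consumer waits for it. Returns the value the consumer observed.
int message_passing_once() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // (1) write the data
        ready.store(true, std::memory_order_release);  // (2) publish it
    });

    std::thread consumer([&] {
        // Spin until the release store becomes visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // The acquire load synchronizes with the release store,
        // so the write in (1) is visible here.
        assert(payload == 42);
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

As far as I know, mainstream compilers emit plain `mov` instructions for both the release store and the acquire load on x86, while ARMv8 typically gets `stlr`/`ldar`. The ordering still constrains what the compiler itself may reorder, so it is not a no-op at the source level.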

Acq/rel are nonsense on x86, worse than a NOP. They compile down to nothing extra: just the plain load or store.

x86 cannot specify a load/store any more relaxed than total store ordering (which is even "stronger" than acquire/release).
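TSO is stronger than acquire/release but still weaker than sequential consistency, and the store-buffering litmus test is where the difference shows. A rough C++ sketch follows (helper names are made up for illustration). With `seq_cst` on every access, the outcome `r1 == 0 && r2 == 0` is forbidden. With release stores and acquire loads the C++ model permits it, and x86 hardware can actually produce it, because a store may still sit in the store buffer when the later load executes.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Hypothetical litmus-test helper: runs one round of the store-buffering
// test with seq_cst accesses and returns the two values read (r1, r2).
std::pair<int, int> store_buffering_round() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });

    t1.join();
    t2.join();
    return {r1, r2};
}

// Repeats the round and counts how often both threads read 0.
// With seq_cst this count is always 0; swapping in release/acquire
// would make a nonzero count legal.
int count_both_zero(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        auto [r1, r2] = store_buffering_round();
        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

This is the part that costs something on x86: the seq_cst store is typically compiled to `xchg` (or `mov` plus `mfence`) rather than a plain `mov`.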

ARM / POWER9 were originally "consume/release". But with C++11, the consensus was that consume/release was too complicated, and the acquire/release model was created instead.

Java was the granddaddy of modern memory models but focused on Seq-Cst (the strongest model: the one that makes "sense" to most programmers). C++ inherited Java's seq-cst, but recognized that low-level programmers wanted something faster, so it added both "fully relaxed" and acq/rel as two faster ways to load/store.
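For the "fully relaxed" end of that spectrum, a minimal C++ sketch (the function `relaxed_count` is a made-up name): a shared counter only needs atomicity, not ordering against other memory, so `memory_order_relaxed` is enough and no increment is lost.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper for illustration: several threads bump one counter
// using relaxed atomics. Returns the final count.
long relaxed_count(int threads, int iters) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;

    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iters] {
            for (int i = 0; i < iters; ++i) {
                // Atomic read-modify-write, but no ordering guarantees
                // with respect to any other memory location.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) {
        th.join();  // join() orders each thread's work before the final read
    }
    return counter.load(std::memory_order_relaxed);
}
```

Calling `relaxed_count(4, 10000)` should return 40000. Relaxed would be the wrong choice if the counter were used to publish other data, which is the case acquire/release exists for.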