Shared memory is way outside the scope of standard C or C++. It's implementation-defined. It's inconsistent to insist on the weakest definition of atomics allowed by the C and C++ standards while simultaneously invoking one of the weirdest implementation-defined mechanisms defined by POSIX. If your implementation provides shared memory of some kind, it's up to your implementation to define some sort of reasonable semantics for it.
In POSIX's case, it's up to POSIX operating systems to define reasonable semantics for that memory, using constructs like PTHREAD_PROCESS_SHARED and "robust" pthread mutexes.
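To make that concrete, here is a rough sketch of what setting up such a mutex can look like. It assumes a Linux/glibc-style system where MAP_ANONYMOUS and pthread_mutexattr_setrobust are available; SharedRegion, create_shared_region, lock_region and increment are made-up names for illustration, not anything from POSIX.

```cpp
#include <pthread.h>
#include <sys/mman.h>
#include <cassert>
#include <cerrno>

// Hypothetical layout of a region shared between processes.
struct SharedRegion {
    pthread_mutex_t lock;
    int counter;
};

// Maps an anonymous shared region (inherited across fork) and initializes a
// process-shared, robust mutex inside it. Returns nullptr on failure.
// Assumption: MAP_ANONYMOUS is available (Linux, most BSDs).
SharedRegion* create_shared_region() {
    void* mem = mmap(nullptr, sizeof(SharedRegion), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;
    auto* region = static_cast<SharedRegion*>(mem);

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(mem, sizeof(SharedRegion));
        return nullptr;
    }
    // Allow the mutex to be used by any process that maps this memory.
    int rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    // Make lock() report EOWNERDEAD if the owner dies while holding it.
    if (rc == 0) rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0) rc = pthread_mutex_init(&region->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        munmap(mem, sizeof(SharedRegion));
        return nullptr;
    }
    region->counter = 0;
    return region;
}

// Locks the region; if the previous owner died, marks the mutex consistent
// again. A real program would repair the protected data before doing so.
int lock_region(SharedRegion* region) {
    int rc = pthread_mutex_lock(&region->lock);
    if (rc == EOWNERDEAD) {
        rc = pthread_mutex_consistent(&region->lock);
    }
    return rc;
}

// Increments the shared counter under the lock; returns the new value,
// or -1 if the lock could not be taken.
int increment(SharedRegion* region) {
    assert(region != nullptr);
    if (lock_region(region) != 0) return -1;
    int value = ++region->counter;
    pthread_mutex_unlock(&region->lock);
    return value;
}
```

A process would call create_shared_region() before fork(), after which parent and child can both call increment() on the same pointer. On older glibc versions this needs -pthread at link time.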
> C++ atomics are no good here, because they are not guaranteed to be lock free or address free.
That's not right; you can still use std::memory_order to get the required memory barriers generated. The barriers themselves are obviously going to be lock free: they only deal with memory ordering, which is what you tried to handle with volatile, but for the general case.
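A minimal sketch of the kind of thing I mean, using two threads in one process since that is all the standard itself talks about. Channel, publish, consume, flag_is_lock_free and run_demo are hypothetical names; whether the same pattern holds across processes is, again, up to your implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical shared state: a plain payload guarded by an atomic flag.
struct Channel {
    int payload = 0;
    std::atomic<bool> ready{false};
};

// Reports whether the flag is implemented without a lock on this target.
bool flag_is_lock_free(const Channel& ch) {
    return ch.ready.is_lock_free();
}

// Writer: store the payload, then issue a release fence so the payload
// write cannot be reordered after the flag store that follows it.
void publish(Channel& ch, int value) {
    ch.payload = value;
    std::atomic_thread_fence(std::memory_order_release);
    ch.ready.store(true, std::memory_order_relaxed);
}

// Reader: spin until the flag is set, then issue an acquire fence so the
// payload read cannot be reordered before the flag load.
int consume(Channel& ch) {
    while (!ch.ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return ch.payload;
}

// Runs one writer and one reader and returns what the reader saw.
int run_demo() {
    Channel ch;
    int received = 0;
    std::thread consumer([&] { received = consume(ch); });
    publish(ch, 42);
    consumer.join();
    return received;
}
```

The fences are not atomic objects, so there is nothing for them to lock; the only thing that could fall back to a lock is the flag itself, and is_lock_free() tells you whether it does on your platform.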