by dragontamer
1555 days ago
> The basic problem is that for any physical core, there is no guarantee that any writes will ever be seen by any other core (unless the necessary extra magic is done to make this so).

They're using atomic reads/writes with sequential consistency. This means that the compiler will automatically put memory barriers in the appropriate locations to guarantee sequential consistency, but probably at a significant performance cost. The implementation is likely correct, but it's the slowest option available (seq-cst is the simplest / most braindead atomic ordering, but also the slowest, because you probably don't need all those memory barriers). I'd expect that acquire-release style barriers are possible for this use case, which would be faster on x86, POWER, and ARM systems.
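To make the difference concrete, here is a rough sketch in C11. It is not the code under discussion: `payload`, `ready`, and the four function names are made up for illustration, and the publish/consume pairs are assumed to be called from different threads. A plain read or write of an `_Atomic` object is already a sequentially consistent atomic operation, and the second pair shows what the acquire-release version of the same handoff might look like.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state, not taken from the code being discussed. */
static int payload;         /* ordinary, non-atomic data */
static _Atomic bool ready;  /* static storage, so it starts out false */

/* Variant 1: plain accesses to an _Atomic object.
 * These are seq_cst atomic operations, not ordinary reads/writes. */
void publish_seq_cst(int value) {
    payload = value;
    ready = true;           /* behaves like atomic_store(&ready, true) */
}

bool try_consume_seq_cst(int *out) {
    if (!ready)             /* behaves like atomic_load(&ready) */
        return false;
    *out = payload;
    return true;
}

/* Variant 2: the same handoff with explicit release/acquire ordering.
 * The release store makes the earlier write to payload visible to any
 * thread whose acquire load observes ready == true. */
void publish_release(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

bool try_consume_acquire(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

Whether the weaker ordering is actually enough depends on the rest of the algorithm; if several flags must be observed in one global order by all threads, seq-cst may still be needed.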
|
In the C11 code? I'm not up to speed with that version of the spec, but unless there is behaviour invoked by the use of _Atomic, the code looks to me to be performing normal, non-atomic reads and writes.
Another reply says there are full barriers being automatically used, but that doesn't address the problem I've described.