Hacker News
by avodonosov 300 days ago
Can anyone suggest a good explanation of memory barriers?
5 comments

Shameless plug: https://lwn.net/Articles/844224/ covers, in my opinion, exactly the parts that are missing from the book that the author criticizes. I focus on how threads synchronize, independent of the actual primitives you use, and then explain how that synchronization is usually realized.
Thank you very much.

I can't give you final feedback at the moment; I have only briefly looked through the articles for now.

The first ones are very accessible (given my prior knowledge of Lamport clocks and happens-before as in the Java memory model); I am not yet sure the later ones are very clear.

But they are easier than the docs I used when I first approached this topic in the past, like Documentation/memory-barriers.txt and Doug Lea's texts.

If you mean memory barriers in terms of concurrency, it's just a synchronization primitive: a counter that is decremented atomically once per participant (e.g. each thread in a group), after which each participant waits until the counter reaches zero before continuing. It's used to synchronize (i.e. put into lockstep) two or more concurrent processes so that they all wait at a given point and then continue more or less all at once, often as part of a larger process.
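To make that description concrete, here is a minimal sketch of such a count-down barrier in C++. The names `CountdownBarrier` and `all_threads_saw_everyone` are made up for illustration and come from no library; the barrier is one-shot and spins instead of blocking, which a real implementation would likely avoid.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical one-shot barrier; illustrative only, not a library type.
class CountdownBarrier {
public:
    explicit CountdownBarrier(int participants) : remaining_(participants) {}

    // Each participant calls this exactly once.
    void arrive_and_wait() {
        // Count down atomically, once per participant.
        remaining_.fetch_sub(1, std::memory_order_acq_rel);
        // Wait until every participant has counted down.
        while (remaining_.load(std::memory_order_acquire) != 0) {
            std::this_thread::yield();
        }
    }

private:
    std::atomic<int> remaining_;
};

// Starts n threads. Each records its arrival, waits at the barrier, and then
// checks whether it can see all n arrivals. Returns true if every thread did.
bool all_threads_saw_everyone(int n) {
    assert(n > 0);
    CountdownBarrier barrier(n);
    std::atomic<int> arrived{0};
    std::atomic<int> saw_all{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([&] {
            arrived.fetch_add(1, std::memory_order_relaxed);
            barrier.arrive_and_wait();
            // Past the barrier, all arrivals must be visible.
            if (arrived.load(std::memory_order_relaxed) == n) {
                saw_all.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& t : threads) t.join();
    return saw_all.load() == n;
}
```

Calling `all_threads_saw_everyone(4)` should return true: no thread gets past `arrive_and_wait()` until all four have counted down.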

If you mean a barrier in terms of a memory "fence", that's an event on CPUs whereby any pending memory instructions that have been pipelined, and thus not yet committed, are forced to commit and complete before execution continues. A fence is issued on a single core, but it's used to make sure that other cores see the memory values your pending writes would produce (or, conversely, to make sure your own core's reads see other cores' writes as fresh as possible before the actual read op).
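As a rough illustration of the fence case, here is a sketch using the standard C++ `std::atomic_thread_fence`. The function name `publish_with_fences` and the value 42 are made up for the example. The flag itself uses only relaxed operations, so all of the ordering comes from the two fences.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Illustrative only: returns what the consumer read from `payload`.
int publish_with_fences() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag, accessed with relaxed ops only
    int observed = -1;

    std::thread producer([&] {
        payload = 42;
        // Release fence: the write above may not be reordered past
        // the flag store below.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: the read below may not be reordered before
        // the flag load above.
        std::atomic_thread_fence(std::memory_order_acquire);
        observed = payload;
    });

    producer.join();
    consumer.join();
    assert(observed == 42);
    return observed;
}
```

Without the two fences, the consumer's read of `payload` would be a data race, and it could see the flag set while still reading a stale `payload`.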

Thank you for the comment. I mean fences.

Haven't ever heard of barriers as a counter-like primitive (sounds like a semaphore or CountDownLatch)

Paul McKenney's "Memory Barriers: a Hardware View for Software Hackers" is excellent, with Preshing's blog (preshing.com) offering more approachable explanations for beginners.
Thank you
"Memory Barriers: a Hardware View for Software Hackers"

https://www.researchgate.net/publication/228824849_Memory_Ba...

Thank you
tl;dr:

In a multi-threaded context, memory reads and writes can be reordered by hardware. It gets more complicated with shared cache. Imagine that you have core 1 writing to some address at (nearly) the same time that core 2 reads from it. Does core 2 read the old value or the new one? Especially if they don't share the same cache: core 1 might "write" to a given address, but the value only gets written to core 1's cache and is then "scheduled" to be written out to main memory. Meanwhile, core 2 later tries to read that address; it's not in its cache, so it pulls from main memory before core 1's cache has been flushed. As far as core 2 is concerned, the write happened after it read from the address, even though physically the write finished in core 1 before core 2's read instruction might have even started.

A memory barrier tells the hardware to ensure that a read ordered before (or after) a given write to the same address also "happens-before" (or after) it. It's often (but not always) a cache and memory synchronization across cores.
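A small sketch of that happens-before relationship in C++, with the ordering attached to the atomic operations themselves instead of a standalone fence. The function name and the message text are invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <thread>

// Illustrative only: the writer publishes `message`, the reader picks it up.
std::string publish_with_release_acquire() {
    std::string message;             // plain, non-atomic data
    std::atomic<bool> ready{false};
    std::string observed;

    std::thread writer([&] {
        message = "hello from writer";
        // Release store: everything written above is visible to a thread
        // that sees `true` through an acquire load.
        ready.store(true, std::memory_order_release);
    });

    std::thread reader([&] {
        // Acquire load: once this sees `true`, the writer's earlier write
        // to `message` happens-before the read below.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = message;
    });

    writer.join();
    reader.join();
    assert(!observed.empty());
    return observed;
}
```

On x86 the release store and acquire load usually compile to plain moves, while weaker architectures may need explicit barrier instructions, so the same source can cost different amounts on different hardware.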

I found Fedor Pikus's CppCon 2017 presentation [1] to be informative, and Michael Wong's 2015 presentation [0] filled in some of the gaps.

C++, being a generic language for many hardware implementations, provides much more detailed concepts for memory ordering [2], which is important for hardware that has more granularity in barrier types than what most people are used to with x86-derived memory models.

[0]: https://www.youtube.com/watch?v=DS2m7T6NKZQ

[1]: https://www.youtube.com/watch?v=ZQFzMfHIxng

[2]: https://en.cppreference.com/w/cpp/atomic/memory_order.html
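To give a feel for that granularity, here is a sketch of the weakest ordering listed in [2], `std::memory_order_relaxed`, which guarantees atomicity but no ordering with respect to other memory. The function `count_events` and its parameters are hypothetical names chosen for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Illustrative only: a shared event counter needs atomic increments,
// but nothing else is published through it, so relaxed is enough.
long count_events(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic read-modify-write with no ordering constraints.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with each thread's completion, so the final
    // load below sees every increment.
    for (auto& w : workers) w.join();
    return counter.load(std::memory_order_relaxed);
}
```

`count_events(4, 10000)` should return 40000. If the counter were also used to signal that other data is ready, relaxed would not be enough and release/acquire (or stronger) would be needed.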

Thank you.