Hacker News new | ask | show | jobs
by dataangel 1701 days ago
I think their point is you only need compiler barriers not actual barrier instructions on x86. volatile in practice has been the de facto way to get the effect of a compiler memory barrier for a long time even though it's not the best way to do it nowadays. The original purpose of it is literally preventing the compiler from getting rid of loads and stores and reordering them which is exactly what is needed when implementing a lockless FIFO. As long as all the stores and loads (including the actual FIFO payload) are volatile, it will work (volatile loads and stores are guaranteed to not be reordered with each other). After that the x86 guarantees about not reordering are very strong. Really the best argument against volatile for this kind of thing is actually the opposite of your point, volatile is too strong. It prevents more reordering than you actually want. Acquire/release semantics are less strong and give the compiler more flexibility.
1 comments

volatile does not work as a memory barrier neither in theory nor in practice. On gcc for example you need an explicit additional compiler barrier before a volatile store and after a volatile load to implement the expected release/acquire semantics on x86.

See [1] for example the implementation of smp_store_release and smp_load_acquire in the linux kernel (barrier() is just a compiler barrier and {READ,WRITE}_ONCE are a cast to volatile).

Volatile only prevents reordering of volatile statements (and IO), not all load and stores.

[1] https://elixir.bootlin.com/linux/latest/source/tools/arch/x8...