by cafxx 1093 days ago
A point that AFAICT is not articulated in the article is why the two cached fields should each be in their own dedicated cache line (e.g. why readIdxCached_ cannot share the cache line with writeIdx_).
2 comments

They should be in the same cache line, as you suggest. The writer will always touch both writeIdx and readIdxCached, so it makes sense for them to be together. Splitting them won't affect synthetic benchmarks, but it just wastes space in a larger application.
There might be an additional optimization in having the writer also cache its write index on the cache line together with the read index cache. This way the writer would only do writes to the write index cache line. The hardware might be able to optimize this. I wonder how it interacts with UMWAIT on the latest cores.
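To put a number on the "wasting space" point, here is a rough sketch comparing the two layouts. It assumes 64-byte cache lines and an 8-byte std::size_t; the struct names SplitIndices and GroupedIndices and the constant kCacheLine are made up for illustration and are not taken from the article.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64-byte cache lines, as on most current x86-64 parts.
constexpr std::size_t kCacheLine = 64;

// Every index on its own cache line: four lines in total.
struct SplitIndices {
    alignas(kCacheLine) std::atomic<std::size_t> writeIdx_{0};
    alignas(kCacheLine) std::size_t readIdxCached_{0};
    alignas(kCacheLine) std::atomic<std::size_t> readIdx_{0};
    alignas(kCacheLine) std::size_t writeIdxCached_{0};
};

// Indices grouped by the thread that writes them: two lines in total.
struct GroupedIndices {
    // Producer-owned line.
    alignas(kCacheLine) std::atomic<std::size_t> writeIdx_{0};
    std::size_t readIdxCached_{0};
    // Consumer-owned line.
    alignas(kCacheLine) std::atomic<std::size_t> readIdx_{0};
    std::size_t writeIdxCached_{0};
};

static_assert(sizeof(SplitIndices) == 4 * kCacheLine, "four cache lines");
static_assert(sizeof(GroupedIndices) == 2 * kCacheLine, "two cache lines");
```

Under those assumptions the grouped layout halves the footprint of the index block, while the producer-written and consumer-written fields still never share a line.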
Reading its own values is never a problem: the writeIndex cache line will switch from E (or M) to S (or O), but it is never evicted, so reads from it never miss. For frequent writes the read will be fulfilled by the store buffer anyway.
I think it will be invalidated due to RFO when the reader reads the write index. Only when multiple readers read the same cache line without any intervening write will the RFO heuristic be disabled.
False sharing perhaps? At least that is my shoot-from-the-hip response.

https://stackoverflow.com/questions/22766191/what-is-false-s...

I don’t get it. Both readIdx_ and writeIdxCached_ are “owned” by the consumer. (That being the point of a “cached” copy.) The producer might read readIdx_ occasionally, whenever we go through the whole buffer, but otherwise only the consumers access these. So why should we avoid putting them on the same cache line?
Yeap. The linked cpp ref doc has a nice example:

https://en.cppreference.com/w/cpp/thread/hardware_destructiv...

In the example you linked it's comparing two threads reading from the same or separate cache lines, no? If so, that's not really the point I was referring to (as the two variables I mentioned as example are accessed by a single thread, not by two threads).
False sharing is the reason for splitting readIdx and writeIdx. But there is no reason to split readIdx and writeIdxCached.
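For anyone following along, here is a minimal sketch of how the cached indices are used in a single-producer/single-consumer ring, with the fields grouped per owning thread as discussed in this thread. This is not the article's code: the class name SpscRing, the constant kLine, the int payload and the 64-byte line size are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kLine = 64;  // assumed cache-line size

class SpscRing {
public:
    // One slot is kept empty to tell "full" apart from "empty".
    explicit SpscRing(std::size_t capacity) : slots_(capacity + 1) {
        assert(capacity > 0);
    }

    // Producer thread only.
    bool push(int v) {
        const std::size_t w = writeIdx_.load(std::memory_order_relaxed);
        const std::size_t next = (w + 1 == slots_.size()) ? 0 : w + 1;
        if (next == readIdxCached_) {
            // Looks full: only now touch the consumer's cache line.
            readIdxCached_ = readIdx_.load(std::memory_order_acquire);
            if (next == readIdxCached_) return false;  // really full
        }
        slots_[w] = v;
        writeIdx_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer thread only.
    bool pop(int& out) {
        const std::size_t r = readIdx_.load(std::memory_order_relaxed);
        if (r == writeIdxCached_) {
            // Looks empty: only now touch the producer's cache line.
            writeIdxCached_ = writeIdx_.load(std::memory_order_acquire);
            if (r == writeIdxCached_) return false;  // really empty
        }
        out = slots_[r];
        const std::size_t next = (r + 1 == slots_.size()) ? 0 : r + 1;
        readIdx_.store(next, std::memory_order_release);
        return true;
    }

private:
    std::vector<int> slots_;
    // Producer-owned line: written only by push().
    alignas(kLine) std::atomic<std::size_t> writeIdx_{0};
    std::size_t readIdxCached_{0};
    // Consumer-owned line: written only by pop().
    alignas(kLine) std::atomic<std::size_t> readIdx_{0};
    std::size_t writeIdxCached_{0};
};
```

In this sketch push() writes only to the producer line and pop() writes only to the consumer line. Each side reads the other side's line only when its cached copy says the ring looks full or empty, which is why the cached copy can sit next to the index its owner writes.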