Hacker News new | ask | show | jobs
by crazytalk 1370 days ago
It's mentioned in the readme - this is measuring the latency of cache coherence. Depending on architecture, some sets of cores will be organized with shared L2/L3 cache. In order to acquire exclusive access to a cache line (memory range of 64-128ish bytes), caches belonging to other sets of cores need to be waited on to release their own exclusive access, or to be informed they need to invalidate their caches. This is observable as a small number of cycles additional memory access latency that is heavily dependent on hardware cache design, which is what is being measured

Cross-cache communication may simply happen by reading or writing to memory touched by another thread that most recently ran on another core

Check out https://en.wikipedia.org/wiki/MOESI_protocol for starters, although I think modern CPUs implement protocols more advanced than this (I think MOESI is decades old at this point)

1 comments

AMD processors also use a hierarchical coherence directory, where the global coherence directory on the IO die enforces coherence across chiplets and a local coherence directory on each chiplet enforces coherence on-die http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/www/le...