Hacker News new | ask | show | jobs
by twoodfin 818 days ago
Right, so does Intel in at least their high end chips. But a count of last-level misses is just one factor in the cost formula for memory access.

I appreciate it’s a complicated and subjective measurement: Hyperthreading, superscalar, out-of-order all mean that a core can be operating at some fraction of its peak (and what does that mean, exactly?) due to memory stalls, vs. being completely idle. And reads meet the instruction pipeline in a totally different way than writes do.

But a synthesized approximation that could serve as the memory stall equivalent of -e cycles for perf would be a huge boon to performance analysis & optimization.