by scott_s 4762 days ago
It's not that remote accesses are good; it's that trying to force local accesses instead can harm cache usage elsewhere. If the author at High Scalability will allow me another quibble, I'd say that memory locality is still King. It's just that we have to be very careful about trying to improve it: if you improve locality in one place (say, by inducing local accesses from a socket to main memory), you may end up harming it somewhere else (a greater total number of accesses to main memory, because now the cache is thrashing).

The NUMA bit comes in where you said "scheduling related threads on the same cpu" and "threads spread over many cpus". If you schedule related threads on the same socket (cpu), you're more likely to get local accesses. If your threads share data, then that's two good things: local memory accesses and good cache usage. But if your threads use different data, the local memory accesses may not matter, because the threads interfere with each other in the cache and you may take many more cache misses.

A simpler way to think about it: a shorter trip to main memory does not help you if you end up making many more trips in total.
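To put rough numbers on that, here is a back-of-the-envelope sketch. The miss counts and latencies are made up purely for illustration (they are not measurements from the article or from any real machine), and the function name is hypothetical. It only models time spent waiting on main memory as misses times per-miss latency.

```python
def memory_wait_ns(cache_misses: int, dram_latency_ns: int) -> int:
    """Time spent waiting on main memory: each cache miss pays one DRAM access."""
    return cache_misses * dram_latency_ns


# Hypothetical numbers, for illustration only.
# Threads spread across sockets: every miss goes to remote memory (slower),
# but each thread has a cache to itself, so there are fewer misses.
spread = memory_wait_ns(cache_misses=20_000, dram_latency_ns=150)

# Threads packed on one socket: every miss goes to local memory (faster),
# but unrelated threads evict each other's data, so there are more misses.
packed = memory_wait_ns(cache_misses=50_000, dram_latency_ns=100)

print(f"spread (remote, few misses): {spread} ns")
print(f"packed (local, many misses): {packed} ns")
```

With these assumed inputs the packed case spends more time waiting on memory even though each individual access is cheaper. Change the miss counts so the packed threads share data and the conclusion flips, which is the "two good things" case above.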

1 comment

Do the bigtable performance characteristics look kind of like cache line ping-ponging? My intuition for why scenario 3 outperforms scenario 2 (100% remote vs 50% local + 50% remote) is that scenario 2 has more mutations of data, and therefore needs more interconnect traffic to maintain coherency across sockets.