by mtdewcmu
4762 days ago
I'm having a little trouble making sense of this:

"For example, bigtable benefits from cache sharing and would prefer 100 % remote accesses to 50% remote. Search-frontend prefers spreading the threads to multiple caches to reduce cache contention and thus also prefers 100 % remote accesses to 50% remote."

Let me see if I've got this straight:

* bigtable benefits from scheduling related threads on the same cpu so they can share a cache, I'm guessing because multiple threads work on the same data simultaneously

* search benefits from having its threads spread over many cpus, probably because the threads are unrelated to each other and not sharing data, so they like to have their own caches

I'm not sure I understand how this relates to NUMA, or why remote accesses are ever a good thing. Maybe it requires a more sophisticated understanding of computer architecture than what I have.
The NUMA bit comes in when you said "scheduling related threads on the same cpu" and "threads spread over many cpus". If you schedule related threads on the same socket (cpu), you're more likely to get local accesses. If your threads share data, then that's two good things: local memory accesses, and good cache usage. But if your threads use different data, then the local memory accesses may not matter: the threads interfere with each other in the shared cache, so you may end up with many more cache misses.
A simpler way to think about it: shorter access to main memory does not help you if you end up doing many more total accesses.
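
To put rough numbers on that, here is a back-of-the-envelope sketch. Everything in it is made up for illustration: the function name `memory_stall_ns`, the 100 ns local and 150 ns remote latencies, and the miss counts are assumptions, not figures from the paper. It only shows that stall time is (number of misses) x (average latency), so fewer misses can outweigh a slower path to memory.

```python
def memory_stall_ns(cache_misses, remote_fraction, local_ns=100, remote_ns=150):
    """Total time spent waiting on main memory, in nanoseconds.

    local_ns and remote_ns are hypothetical latencies, not measured values.
    """
    avg_latency = (1 - remote_fraction) * local_ns + remote_fraction * remote_ns
    return cache_misses * avg_latency


# Hypothetical case 1: unrelated threads packed on one socket.
# Every access is local, but the threads evict each other's cache lines.
packed = memory_stall_ns(cache_misses=3_000_000, remote_fraction=0.0)

# Hypothetical case 2: the same threads spread across sockets.
# Every access is remote, but each thread keeps its own cache contents.
spread = memory_stall_ns(cache_misses=1_000_000, remote_fraction=1.0)

print(packed, spread)
```

With these invented numbers the spread placement stalls for half as long as the packed one, even though every one of its accesses is remote. Flip the miss counts (threads that share data and miss less when packed together) and the conclusion flips too.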