by 50lo
114 days ago
With packages like this (lots of cores, multi-chip packaging, lots of memory channels), the architecture is increasingly a small cluster on a package rather than a monolithic CPU. I wonder whether the next bottleneck becomes software scheduling rather than silicon: OSes and runtimes weren’t really designed with hundreds of cores and complex interconnect topologies in mind.
|
We had a massive performance issue a few years ago that we fixed by mapping our processes to the NUMA topology. By default, our software effectively routed all memory accesses to the same NUMA node, and performance went down the drain.
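
For anyone wondering what that kind of mapping can look like, here is a minimal sketch, assuming Linux and one worker process per NUMA node. It is not our actual code; the helper names (`parse_cpulist`, `node_cpus`, `pin_to_node`) are made up for illustration. It reads the node's CPU list from sysfs and restricts the process to those CPUs with `os.sched_setaffinity`.

```python
import os
from pathlib import Path


def parse_cpulist(text: str) -> set[int]:
    """Parse a sysfs cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input, e.g. a node with no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def node_cpus(node: int) -> set[int]:
    """Return the CPUs that belong to a NUMA node (Linux sysfs layout)."""
    path = Path(f"/sys/devices/system/node/node{node}/cpulist")
    return parse_cpulist(path.read_text())


def pin_to_node(node: int, pid: int = 0) -> set[int]:
    """Restrict a process (pid 0 means the caller) to the CPUs of one node."""
    cpus = node_cpus(node)
    if not cpus:
        raise ValueError(f"NUMA node {node} has no CPUs")
    os.sched_setaffinity(pid, cpus)  # Linux-only call
    return cpus
```

Each worker would call something like `pin_to_node(worker_index % number_of_nodes)` at startup, before it allocates its working set. This only sets CPU affinity, not an explicit memory policy. It relies on Linux's default first-touch behaviour, where a page is normally allocated on the node of the CPU that first touches it. For hard guarantees you would likely want `numactl --membind` or libnuma instead.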