|
|
|
|
|
by thyristan
266 days ago
|
|
The only valid benchmark for that is to use the application you intend to use as a benchmark. Even embarassingly parallel problems can have different characteristics depending on their use of memory and caches and the thermal characteristics of the CPU. Something that uses only L1 cache and registers will probably scale almost linearly in the number of cores, except for thermal influences. Something that uses L2, L3 caches or even main memory will be sublinear. |
|
Suppose I run many different kinds of applications and am just looking for an overall score to provide a general idea of how two machines compare with one another. That's supposed to be the purpose of these benchmarks, isn't it? But this one seems to be unusually useless at distinguishing between various machines with more than a small number of cores.
Your analysis is also incorrect for many of these systems. Each core may have its own L2 cache and each core complex may have its own L3, so systems with more core complexes don't inherently have more contention for caches because they also have more caches. Likewise, systems with more cores often also have more memory bandwidth, so the amount of bandwidth per core isn't inherently less than it is in systems with fewer cores, and in some cases it's actually more, e.g. a HEDT processor may have twice as many cores but four times as many memory channels.