| HN Mirror



	by zamadatix 590 days ago
	Geekbench multi falls off a cliff after ~16 cores. E.g. the Epyc 9654 with 96 cores benches lower than the Ryzen 7950X with 16 cores of the same generation.

5 comments

tiffanyh 590 days ago

That's not an accurate comparison.

  Base Freq  Chip
  ---------  ----  
  2.4GHz     EPYC 9654
  4.7GHz     Ryzen 7950X

As a result, the SINGLE core performance difference is ~2x

  Geekbench Single Core - Score
  -----------------------------
  1,827      EPYC 9654
  2,986      Ryzen 7950X

Having such a large difference in single-core performance, will negate the sizable difference in total core count (96 vs 16).

https://browser.geekbench.com/processors/amd-epyc-9654

https://browser.geekbench.com/processors/amd-ryzen-9-7950x

zamadatix 590 days ago

A 1.6x single core performance difference won't negate a 6x core count advantage in peak multi core performance. The problem above is not that the cores are slower, it's that Geekbench will literally not utilize the additional cores in the first place. This compounds with what you're saying - the few cores that do get used have high clocks on the low core count optimized part but lower clocks on the high core count optimized part.

Compare this to a multithreaded benchmark that does scale to all of the cores and you'll find the higher core count CPUs are able to push significantly higher scores despite the single thread difference e.g. https://www.cpubenchmark.net/high_end_cpus.html has them at 62,711 vs 117,317 in the opposite ranking direction. That should feel about right, otherwise AMD would only make the 16 core high frequency CPUs instead of 128 core low frequency monsters.

That's not to say the Geekbench score is bad or useless. It represents a specific type of workload... just not "peak multi core performance". It's more indicative of "mixed workload performance", where the Ultra's extra 2x cores are more obviously going to be irrelevant.

echoangle 590 days ago

> Having such a large difference in single-core performance, will negate the sizable difference in total core count (96 vs 16).

But why? Wouldn’t total score be approximately corecount*corescore? Of course it’s not exactly that because not all cores run full speed at the same time, but how are the cores weighted that 16 cores are better than 96 cores with half the speed each?
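
To put rough numbers on that intuition, here is a minimal sketch of the naive estimate (core count times single-core score), using the Geekbench single-core scores quoted upthread. The function name is made up for illustration, and perfect linear scaling is an assumption, not how Geekbench computes its multi-core score.

```python
def naive_multi_score(cores: int, single_core_score: float) -> float:
    """Idealized estimate: every core contributes its full single-core score."""
    return cores * single_core_score


if __name__ == "__main__":
    epyc = naive_multi_score(96, 1827)   # EPYC 9654, single-core score from the thread
    ryzen = naive_multi_score(16, 2986)  # Ryzen 7950X, single-core score from the thread
    print(f"EPYC 9654 naive estimate:   {epyc:,.0f}")
    print(f"Ryzen 7950X naive estimate: {ryzen:,.0f}")
    print(f"ratio: {epyc / ryzen:.2f}x")
```

Under that idealized model the 96-core part would come out well ahead, which is why the actual ranking looks odd at first glance.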

wtallis 590 days ago

Geekbench 5 and earlier constructed the multi-core test as essentially running N independent copies of the single-core test. This effectively pretends that every subtest is embarrassingly parallel. Geekbench 6 switched to having the multi-core test actually operate like real multi-threaded software: a fixed-size problem is broken up to be divided among available cores, with a non-zero amount of coordination between threads and potential for less than perfectly linear scaling because Amdahl's Law isn't being ignored.
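
As a rough illustration of the Amdahl's Law point, here is a small sketch that scales the single-core scores quoted upthread by the Amdahl speedup for a given parallel fraction. The function names and the parallel fractions are assumptions picked for illustration; this is a toy model, not Geekbench's actual scoring formula.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's Law: speedup = 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def estimated_multi_score(single_score: float, cores: int, parallel_fraction: float) -> float:
    """Toy estimate: single-core score scaled by the Amdahl speedup."""
    return single_score * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Single-core scores are the ones quoted in the thread.
    for p in (0.50, 0.90, 0.99, 1.00):
        epyc = estimated_multi_score(1827, 96, p)
        ryzen = estimated_multi_score(2986, 16, p)
        winner = "EPYC 9654" if epyc > ryzen else "Ryzen 7950X"
        print(f"p={p:.2f}  EPYC={epyc:>9,.0f}  Ryzen={ryzen:>9,.0f}  -> {winner}")
```

In this model the 16-core part stays ahead until the workload is well over 90% parallel, so a fixed-size problem with even a modest serial portion can favor fewer, faster cores.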

echoangle 590 days ago

But that’s a very specific thing to adjust a benchmark for, what if I want to host 50 VMs on one server for example? Then the 100 core server would be much better than the 16 core server, even though it has a lower benchmark value.

wtallis 590 days ago

It's not a "very specific" thing to adjust a benchmark for. It's the default case for practically all consumer workloads, and Geekbench is a consumer-focused benchmark.

tiffanyh 590 days ago

Because multicore CPUs don't scale linearly due to NUMA.

https://en.wikipedia.org/wiki/Non-uniform_memory_access

michaelt 590 days ago

In addition to the single-core-clock-speed difference, the different tasks in the multicore benchmark seem to all have different performance characteristics: https://browser.geekbench.com/v6/cpu/compare/8423876?baselin...

For some reason the Epyc is radically faster for "Ray Tracer" and "Horizon Detection" but worse for "PDF Renderer" and "Background Blur".

danudey 590 days ago

I played around a bit lately with finding ways to dramatically multithread code in golang, mostly for fun. What I found was that there was a threshold where the overhead of spinning up all the threads at the start and synchronizing them at the end overwhelmed the time savings from actually performing the work in multiple threads.

It wouldn't surprise me if PDF renderer and background blur were fast enough tasks that spinning up 96 threads to split rendering across all those cores was a waste of time compared to how fast the actual task was to complete. It was akin to trying to hammer in 50 nails by getting 50 people and handing out 50 hammers and assigning each person one nail, then telling them "okay, start!", then inspecting everyone's work afterwards; at some point, it's faster just to break it into two or three tasks.
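
A toy cost model makes that threshold concrete. This is only a sketch under loud assumptions: the work divides evenly, and each extra thread adds a fixed startup/synchronization cost that is paid serially. The function names and the numbers are hypothetical, not measurements of Geekbench or of my Go experiments.

```python
def parallel_time(work_s: float, threads: int, per_thread_overhead_s: float) -> float:
    """Assumed model: evenly divided work plus a serial per-thread overhead."""
    return work_s / threads + per_thread_overhead_s * threads


def best_thread_count(work_s: float, per_thread_overhead_s: float, max_threads: int) -> int:
    """Brute-force search for the thread count with the lowest modeled time."""
    return min(
        range(1, max_threads + 1),
        key=lambda n: parallel_time(work_s, n, per_thread_overhead_s),
    )


if __name__ == "__main__":
    overhead = 0.0001  # hypothetical 0.1 ms per thread
    for work in (0.010, 1.0, 10.0):  # hypothetical task sizes in seconds
        n = best_thread_count(work, overhead, max_threads=96)
        print(f"work={work:>6.3f}s  best threads={n:>2}  "
              f"time at best={parallel_time(work, n, overhead):.4f}s  "
              f"time at 96={parallel_time(work, 96, overhead):.4f}s")
```

In this model a short task is fastest with only a handful of threads, while a long enough task keeps benefiting all the way up to 96.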

dan353hehe 590 days ago

This was a surprise for me as well. I have two EPYC 9754 in a dual socket server, so 256 cores, and the test did not perform as well as I expected it to. It didn’t even load up all the cores, which is what I needed it to do.

I ended up using something else to generate the load I needed, but I can’t remember exactly what. I think it might have been a Monero benchmarking tool?

olliej 590 days ago

I read this a few times and just wanted to confirm - are you saying the benchmark itself can’t handle the higher core count?

zamadatix 590 days ago

That's one way to view it. Another is the benchmark doesn't intend to measure the "peak multi-core CPU performance" the article assumes multi-core score is meant to measure. It's really measuring something more like mixed workload performance.

olliej 590 days ago

Oh interesting.

As a performance metric that does seem like it would be more valuable for lots of use cases, so measuring that seems good.

Maybe they need to add an additional “cpu bound multiprocessing perf”, and make it easier for professional tech reporters to understand complex concepts like benchmark numbers :D (in fairness to the reporters it does sound like the benchmark name legitimately implies that it’s a max parallel throughput benchmark, but if this is your job you should really know what your benchmarks are actually measuring).

Honestly, a benchmark I would like to see - which is more of a software/kernel/os/scheduling one - is “how responsive is this machine under heavy load”.

A non zero part of wanting that as part of a benchmark is that popular benchmarks often seem to be the only way to get companies to fix “uncommon” issues.

pier25 590 days ago

Cinebench is better for measuring raw performance.
