by arp242 742 days ago
Now you're just asserting things unencumbered by even the slightest evidence.

On Passmark Apple CPUs are pretty far down the list.

On Geekbench I gave up after scrolling a few pages.

And "run faster on Apple Silicon than Zen4" means nothing. On the low end you have fairly cheap Ryzen 3 laptop chips, and on the high end you have Threadripper behemoths.


Passmark is a pretty bad CPU benchmark. It generally has poor correlation.

I would stick to SPEC and Geekbench.

Even Cinebench 2024 isn't too bad nowadays, though R23 correlated quite poorly.

In general, not only are Apple Silicon CPUs faster than AMD consumer CPUs, but they're 2-4x more power efficient as well.

The problem with Geekbench is it's trying to average the scores from many different benchmarks, but then if some of them are outliers (e.g. one CPU has hardware acceleration or some other unusual aptitude for that specific workload), it gets an outsized score which is then averaged in and skews the result even if it doesn't generalize.

What you want to do is look at the benchmarks for the thing you're actually using it for.
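To put rough numbers on the outlier effect, here is a minimal sketch with made-up subtest scores (not real Geekbench data). Which mean a given suite uses varies, so both an arithmetic and a geometric mean are shown; the function names are just for illustration.

```python
from math import prod

def arithmetic_mean(scores):
    return sum(scores) / len(scores)

def geometric_mean(scores):
    return prod(scores) ** (1 / len(scores))

# Hypothetical subtest scores: CPU B matches CPU A on nine subtests and is
# 3x faster on one (say, thanks to a special instruction).
cpu_a = [1000] * 10
cpu_b = [1000] * 9 + [3000]

arith_gain = arithmetic_mean(cpu_b) / arithmetic_mean(cpu_a) - 1
geo_gain = geometric_mean(cpu_b) / geometric_mean(cpu_a) - 1

print(f"arithmetic mean: +{arith_gain:.1%}")
print(f"geometric mean:  +{geo_gain:.1%}")
```

Nine of the ten workloads are identical, yet the composite moves by double digits under either mean. A geometric mean dampens the outlier but does not remove it.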

> they're 2-4x more power efficient as well.

This is generally untrue. People come to this conclusion by comparing mobile CPUs with desktop CPUs. CPU power consumption is non-linear with performance, so a large power budget lets you eke out a tiny bit more margin. For example, compare the 65W 5700X with the 105W 5800X. The 40 extra watts buy you around 2% more single-thread performance, not because the 5700X has a more efficient design -- they're the exact same CPU with a different power cap. It's because turning up the clock speed a tiny bit uses a lot more power, but desktop CPUs do it anyway, because they don't have any such thing as battery life and people want that extra tiny bit. Or the CPU simply won't clock any higher and doesn't even hit the rated TDP on single-threaded workloads.
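As a back-of-the-envelope check on that example, here is a small sketch using only the figures quoted above. It treats rated TDP as a stand-in for actual draw, which is an assumption: real single-thread draw is lower on both chips, so read the result as an illustration of the shape of the trade-off rather than a measurement.

```python
# Figures from the comment above: rated TDPs and "around 2%" more
# single-thread performance for the higher-power part.
# Assumption: TDP approximates power draw (it usually overstates it).
chips = {
    "5700X": {"tdp_w": 65, "relative_perf": 1.00},
    "5800X": {"tdp_w": 105, "relative_perf": 1.02},
}

def perf_per_watt(chip):
    return chip["relative_perf"] / chip["tdp_w"]

extra_power = chips["5800X"]["tdp_w"] / chips["5700X"]["tdp_w"] - 1
efficiency_ratio = perf_per_watt(chips["5700X"]) / perf_per_watt(chips["5800X"])

print(f"extra power budget: +{extra_power:.0%}")
print(f"5700X perf/W advantage on paper: {efficiency_ratio:.2f}x")
```

Same silicon, yet the lower-capped part looks roughly 1.6x "more efficient" on paper, which is the kind of gap that gets attributed to architecture when a laptop chip is compared against a desktop one.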

The extra power will buy you a lot more on multi-threaded workloads, because then you get linear performance improvement with more power by adding more cores. But that's where the high core count CPUs will mop the floor with everything else -- while achieving higher performance per watt, because the individual cores are clocked lower and use less power.

> The problem with Geekbench is it's trying to average the scores from many different benchmarks, but then if some of them are outliers (e.g. one CPU has hardware acceleration or some other unusual aptitude for that specific workload), it gets an outsized score which is then averaged in and skews the result even if it doesn't generalize.

Geekbench CPU benchmark does not optimize for accelerators. It optimizes for instruction sets only.
It's not just about coprocessors. If one CPU has a set of SIMD instructions that double performance on that benchmark or more, that creates a large outlier that significantly changes the average.

Apple Silicon also has more memory bandwidth, the primary purpose of which is to feed the GPU, because most CPU workloads don't care about that. But if you average in the occasional ones that do, you get more outliers.

Which is why the thing that matters is how it performs on the thing you actually want to run on it, not how it performs in aggregate on a bunch of other applications you don't use.
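One way to act on that is to re-weight subtest scores by your own workload mix instead of taking the suite's aggregate. A minimal sketch, where the CPUs, subtest names, scores, and weights are all hypothetical:

```python
# Hypothetical subtest scores for two CPUs (illustrative, not measured).
scores = {
    "cpu_x": {"compile": 100, "photo_filter": 180, "text_processing": 105},
    "cpu_y": {"compile": 115, "photo_filter": 100, "text_processing": 110},
}

def weighted_score(subtests, weights):
    """Weighted arithmetic mean over only the workloads listed in weights."""
    total_weight = sum(weights.values())
    return sum(subtests[name] * w for name, w in weights.items()) / total_weight

equal_mix = {"compile": 1, "photo_filter": 1, "text_processing": 1}
developer_mix = {"compile": 4, "text_processing": 1}  # no photo work at all

for label, mix in [("equal mix", equal_mix), ("developer mix", developer_mix)]:
    x = weighted_score(scores["cpu_x"], mix)
    y = weighted_score(scores["cpu_y"], mix)
    print(f"{label}: cpu_x={x:.1f} cpu_y={y:.1f}")
```

With equal weights cpu_x wins on the strength of one subtest; with the developer mix the ranking flips.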

The max power on Intel/AMD CPUs is only there to get the CPU "performance crown". As you've said, you spend a large amount of additional power for very minor gains (to be at the top of fancy Youtube review charts).

It looks as if AMD/Intel feel threatened by Snapdragon, though - we'll see what AMD Strix / Halo brings for the first meaningful x86 mobile processor in years (or Lunar Lake).

> The max power on Intel/AMD CPUs is only there to get the CPU "performance crown".

It's mostly not. Its real purpose is to improve performance on threaded workloads.

Multi-core CPUs work like this: At the max boost a single core might use, say, 50 watts. So if you have 8 cores and wanted to run them all full out, you'd need a 400 watt power budget, which is a little nuts. It's not even worth it. Because you only have to clock them a little lower, say 4GHz instead of 5, to cut the power consumption more than in half, and then you get a TDP of e.g. 100W. Still not nothing but much more reasonable. You can also cut the clock speed even more and get the power consumption all the way down to 15W, but then you're down to 2GHz on threaded workloads and sacrificing quite a bit of multi-thread performance.

So they're not just trying to eke out a couple of percent, even though that's all you get from single thread improvement, because a single core was already near or at its limit. Whereas 8 cores at 4GHz will be legitimately twice as fast as the same cores at 2GHz. But they'll also use more than twice as much power. Which matters in a laptop but not so much in a desktop.
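To see how steep the curve in that example is, here is a rough sketch that takes the three illustrative 8-core operating points above (400W at 5GHz, 100W at 4GHz, 15W at 2GHz) and works out the exponent n in "power scales as frequency**n" between neighbouring points. The numbers are the comment's round figures, not measurements, and `implied_exponent` is just a helper name for this sketch.

```python
from math import log

def implied_exponent(power_hi, freq_hi, power_lo, freq_lo):
    """Exponent n such that power scales as frequency**n between two points."""
    return log(power_hi / power_lo) / log(freq_hi / freq_lo)

# (GHz, watts) for all 8 cores, taken from the example in the comment above.
points = [(5.0, 400.0), (4.0, 100.0), (2.0, 15.0)]

exponents = [
    implied_exponent(p_hi, f_hi, p_lo, f_lo)
    for (f_hi, p_hi), (f_lo, p_lo) in zip(points, points[1:])
]

for ((f_hi, _), (f_lo, _)), n in zip(zip(points, points[1:]), exponents):
    print(f"{f_hi:.0f} GHz -> {f_lo:.0f} GHz: power ~ f^{n:.1f}")

# 4 GHz vs 2 GHz: 2x the clock for this many times the power.
print(f"power ratio for 2x clock: {100.0 / 15.0:.1f}x")
```

The exponent is far above 1 everywhere and climbs near the top of the range, which is why the last few hundred MHz are so expensive and why backing off a little saves so much.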

Of course, the thing that works even better is to have 16 cores or more that are clocked a little lower, which improves performance and performance per watt. The performance per watt of the 96-core Threadrippers is astonishingly good -- even though they're 360W. But that also requires more silicon, so those ones are the expensive ones.
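A toy model of why more, slower cores win at a fixed power budget. It assumes per-core power grows with the cube of frequency and that the workload scales perfectly across cores; both are simplifications (uncore and static power are ignored), and the constant `k` and the 100W budget are arbitrary.

```python
def throughput(cores, power_budget_w, k=1.0, exponent=3.0):
    """Aggregate core-GHz under a fixed package power budget.

    Assumes each core draws k * f**exponent watts and that work scales
    perfectly with cores * frequency. Both are modelling assumptions.
    """
    per_core_w = power_budget_w / cores
    freq = (per_core_w / k) ** (1 / exponent)
    return cores * freq

budget = 100.0
eight = throughput(8, budget)
sixteen = throughput(16, budget)
print(f"16 cores vs 8 cores at {budget:.0f} W: {sixteen / eight:.2f}x throughput")
```

Under these assumptions, doubling the core count at the same power gives 2^(2/3), or about 1.59x, the throughput, since each core only has to drop its clock by about 21%. The price is twice the silicon.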

Apple's M3 Max CPU cores peak out at 55W for 16 cores (12P+4E). AMD's 8-core U-series chips peak out at 65-70W on CPU-only workloads and still lose out massively in pretty much every category.

If you downclock that AMD chip, it does get more efficient, but also loses by even larger margins.

Because you're comparing a 16-core CPU to an 8-core CPU on threaded workloads, which, as mentioned, favor the one with more cores on both performance and performance per watt. But why not compare it to the Ryzen that also has 16 cores, like the 7945HX3D? Because then the Ryzen is generally faster on threaded workloads, even though the TDP is still 55W -- and even though it's on TSMC 5nm instead of 3nm.
Who said I was comparing multithreaded loads? AMD will hit that on just one core with those boost clocks.
The U series doesn't even hit that power consumption when all cores are in use -- its power consumption is between 10 and 30W. The higher power mobile chips are the H series, but even there the one that uses "65-70W" is the 16-core 7945HX3D, which is a 5nm/6nm chip you're comparing to Apple's 3nm one, and even then it's almost as fast. Most of even the H series uses less power than that, partially because some of them are 4nm but mostly because they have fewer cores.

But you can't really expect an older CPU on a previous generation process node with lower power consumption to be faster.

> I would stick to SPEC and Geekbench.

I will repeat:

"On Geekbench I gave up after scrolling a few pages."

SPEC doesn't seem to have easily browsable results, but we can find the Cinebench 2024 ones easily, and guess what? Apple isn't at the top. Not even close: https://www.cgdirector.com/cinebench-2024-scores/

Geekbench has a separate page for each "instruction set".

For Apple you need to go to https://browser.geekbench.com/mac-benchmarks

Then compare numbers by hand I assume.

Though what I would love is compile-time vs. $ (as mentioned, I'm a software developer). The 7950x is $500 and a very fast SSD is $400, fast 64gb is $200, very good board is $400 so I get a very fast dev machine for ~$1700.

I compiled a few previously. Sorry for the formatting:

    Machine                             Processor                       Memory  cargo build  cargo build --release
    ASUS ROG Zephyrus G16 (2024)        Intel Core Ultra 9 185          32GB    31.85 s      1 min 4 s
    ASUS ROG Zephyrus G14 (2024)        AMD Ryzen 8945HS / Radeon 780M  32GB    29.48 s      34.78 s
    ASUS ROG Strix Scar 18 (2024)       Intel Core i9 14900HX           64GB    21.27 s      28.69 s
    Apple MacBook Pro (M3 Pro 11 core)  M3 Pro 11 core                  -       13.70 s      21.65 s
    Apple MacBook Pro 16 (M3 Max)       M3 Max                          -       12.70 s      15.90 s
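For anyone who wants ratios rather than raw seconds, a quick sketch that normalizes the Cargo numbers above to the fastest machine in each column. The times are copied from the listing ("1 minute 4 seconds" entered as 64.0) and the labels are shortened. The project being built and the number of runs aren't stated, so treat the ratios as rough.

```python
# (debug build seconds, release build seconds), copied from the listing above.
builds = {
    "Core Ultra 9 185": (31.85, 64.0),
    "Ryzen 8945HS": (29.48, 34.78),
    "Core i9 14900HX": (21.27, 28.69),
    "M3 Pro 11 core": (13.70, 21.65),
    "M3 Max": (12.70, 15.90),
}

fastest_debug = min(debug for debug, _ in builds.values())
fastest_release = min(release for _, release in builds.values())

# Ratio > 1 means that many times slower than the fastest machine.
for name, (debug, release) in builds.items():
    print(f"{name:18s} debug {debug / fastest_debug:4.2f}x  "
          f"release {release / fastest_release:4.2f}x")
```

On these figures the 14900HX takes about 1.7x (debug) and 1.8x (release) as long as the M3 Max.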

Firefox Mobile build:

M1 Air: 95 seconds; AMD 5900hx: 138 seconds. Source: https://youtu.be/QSPFx9R99-o?si=oG_nuV4oiMxjv4F-&t=505

Javascript builds

Here, Alex compares the M1 Air running Parallels emulating Linux vs native Linux on AMD Zen2 mobile. The M1 is still significantly faster. https://youtu.be/tgS1P5bP7dA?si=Xz2JQmgoYp3IQGCX&t=183

Docker builds

Here, Alex runs Docker ARM64 vs AMD x86 images and the M1 Air built the image 2x faster than an AMD Zen2 mobile. https://youtu.be/sWav0WuNMNs?si=IgxeMoJqpQaZv2nc&t=366

Anyways, Alex has a ton more videos on coding performance between Apple, Intel and AMD.

Lastly, this one is not M1 vs Zen2; it's M2 vs Zen4.

LLVM build test

M2 Max: 377 seconds; Ryzen 9 7940S: 826 seconds

@aurareturn really appreciate the comparison and your effort (upvoted). As I no longer use a laptop (don't need it, too expensive, breaks, no upgrades, but that is just me), browsing through that channel looks like he focuses on laptops.

Would love to see a 7950x/64gb/SSD5 comparison (see https://www.octobench.com/ for SSD impact on Go compilation); perhaps he will create one in the future (channel bookmarked). But if I still needed a laptop, I would probably switch back to Apple (I have an iMac Pro standing on the shelf as decoration; it was my last Apple dev machine).

The $5000 16 Pro looks great as a machine. When still working at eBay, the nice thing was one always got the max specced machine as a developer back in the days - so that would probably be it. Real nice one.

[Edit]

Someone suggested looking at Geekbench Clang, which brought some insights for my desktop usage:

(it looks like top CPUs are more or less the same, ~15% difference)

"Randomly" picking

    M2 Ultra   233.9 Klines/sec
    7950x      230.3 Klines/sec  
    14900K     215.3 Klines/sec  
    M3 Max     196.5 Klines/sec
Ah right; that's confusing! Seems that the AMD and Intel chips are much faster though, consistent with other benchmarks.

Speed vs. $ is of course a different story than pure speed; kinda hard to capture in a number I guess.

Most Go projects compile more than fast enough even on my 7 year old i5, although there are exceptions (mostly crummy hyper-overengineered projects).

I think it depends on what you do. If you write a database in Go, there is no problem with 5min compile time. If you write a web app, 10 sec compile times are already annoying.
A 5 minute compile time is annoying no matter what project you are trying to compile. Thankfully almost every golang package I compile takes less than 2-3 seconds.
"go test" compiles your package. Would not want a 5 minute feedback loop!
Here's GB6: https://browser.geekbench.com/v6/cpu/compare/6339005?baselin...

Note: the M3 Max is a 40W CPU maximum, while the 7950x is a 230W CPU maximum. The 170W max stated by AMD is usually deceptive.

Source for 7950x power consumption: https://www.anandtech.com/show/17641/lighter-touch-cpu-power....

Note that the M3 Max leads in ST in Cinebench 2024 and is 2-3x better in perf/watt. It does lose in MT in Cinebench 2024 but wins in GB6 MT.

Cinebench is usually x86 favored as it favors AVX over NEON as well as having extremely long dependency chains, bottlenecked by caches and partly memory. This is why you get a huge SMT yield from it and why it scales very highly if you throw lots of "weak" cores at it.

This is why Cinebench is a poor CPU benchmark in general as the vast majority of applications do not behave like Cinebench.

Geekbench and SPEC are more predictive of CPU speed.

In the end, what matters is real-world performance, and different workloads have different bottlenecks. For people who use Cinema 4D, Cinebench is the most accurate measurement of hardware capabilities they can get. It's very hard to generalize what will matter for the vast majority of people. I find it's best to look at benchmarks for the same applications or similar workloads to what you'll be doing. Single-score benchmarks like Geekbench are a fun and quick way to get a general idea of CPU capabilities, but most of the time they don't match the specifics of real-world workloads.

Here's a content creation benchmark (note that for some tasks a GPU is also used):

https://www.pugetsystems.com/labs/articles/mac-vs-pc-for-con...

Cinebench is being used like a general purpose CPU benchmark when, like you said, it should only be used to judge the performance of Cinema 4D. Cinema 4D is a niche software in a niche. Why is a niche in a niche software application being used to judge overall CPU performance? It doesn't make sense.

Meanwhile, Geekbench does run real world workloads using real world libraries. You can look at subtest scores to see "real world" results.

Puget Systems benchmarks are pretty good. They show how Apple SoCs punch above their weight in real world applications compared with benchmarks.

Regardless, they are comparing desktop machines using as much as 1500 watts vs a laptop that maxes out at 80 watts and Apple is still competing well. The wins in the PC world are usually due to beefy Nvidia GPUs that applications have historically been optimized for.

That's why I originally said ARM is leading AMD - specifically Apple ARM chips.

Some Geekbench 6 workloads and libraries:

- Dijkstra's algorithm: not used by vast majority of applications.

- Google Gumbo: unmaintained since 2016.

- litehtml: not used by any major browser.

- Clang: common on HN, but niche for general population.

- 3D texture encoding: very niche.

- Ray Tracer: a custom ray tracer using Intel Embree lib. That's worse than Cinebench.

- Structure from Motion: generates 3D geometry from multiple 2D images.

It also uses some more commonly used libraries, but there's enough niche stuff in Geekbench that I can't say it's a good representation of real-world workloads.

> Regardless, they are comparing desktop machines using as much as 1500 watts vs a laptop that maxes out at 80 watts and Apple is still competing well. The wins in the PC world are usually due to beefy Nvidia GPUs that applications have historically been optimized for.

They included a laptop, which is also competing rather well with Apple offerings. And it's not PC's fault you can't add a custom GPU to Apple offerings.