| HN Mirror

https://opendata.blender.org/benchmarks/query/?compute_type=...

[If you're a laptop user, scroll down the thread for laptop Rust compile times, M3 Pro looks great]

You're misguided.

Apple has excellent notebook CPUs. Apple has great IPC. But AMD and Intel easily have faster CPUs.

Blender Benchmark

      AMD Ryzen 9 7950X (16 core)         560.8
      Apple M2 Ultra (24 cores)           501.82
      Apple M3 Max (12 cores)             408.27
      Apple M3 Pro                        226.46
      Apple M3                            160.58

It depends on what you're doing.

I'm a software developer using a compiler that 100%s all cores. I like fast multicore.

      Apple Mac Pro, 64gb, M2 Ultra, $7000
      Apple Mac mini, 32gb, M2 Pro, 2TB SSD, $2600

[Edit2] Compare to: 7950x is $500 and a very fast SSD is $400, fast 64gb is $200, very good board is $400 so I get a very fast dev machine for ~$1700 (0.329 p/$ vs. mini 0.077 p/$)
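The points-per-dollar figures can be reproduced with a few lines of Python. This is only a sketch using numbers quoted in this thread: the 7950X Blender score from the table above, the ~$1700 build and the $2600 mini. The mini's Blender score is not stated anywhere here, so the code back-computes the score that 0.077 p/$ would imply instead of assuming one. The note about case, PSU and cooler is a guess, not something the comment says.

```python
# Points per dollar, using only figures quoted in this thread.
def points_per_dollar(score: float, price_usd: float) -> float:
    return score / price_usd

# 7950X Blender score (table above) and the ~$1700 self-built machine.
ryzen_ppd = points_per_dollar(560.8, 1700)

# The four listed parts (CPU, SSD, RAM, board) add up to less than $1700;
# the remainder is presumably case, PSU and cooler (assumption, not stated).
listed_parts_usd = 500 + 400 + 200 + 400

# The mini's Blender score is not given; this is what 0.077 p/$ would imply.
implied_mini_score = 0.077 * 2600

print(f"7950X build: {ryzen_ppd:.3f} p/$")
print(f"listed parts: ${listed_parts_usd}")
print(f"implied mini score: {implied_mini_score:.1f}")
```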

[Edit] Made a c&p mistake, the mini has no ultra.

That seems wrong.

Though Blender may be optimized for AVX-512 but not for SME or NEON.

But the vast majority will use GPUs to do rendering for Blender.

Try SPEC or its close consumer counterpart, Geekbench.

As an anecdote, all my Python and Node.js applications run faster on Apple Silicon than Zen4. Even my multithread Go apps seem to run better on Apple Silicon.

arp242 743 days ago

Now you're just asserting things unencumbered by even the slightest evidence.

On Passmark Apple CPUs are pretty far down the list.

On Geekbench I gave up after scrolling a few pages.

And "run faster on Apple Silicon than Zen4" means nothing. On the low end you have fairly cheap Ryzen 3 laptop chips, and on the high end you have Threadripper behemoths.

Passmark is a pretty bad CPU benchmark. It generally has poor correlation.

I would stick to SPEC and Geekbench.

Even Cinebench 2024 isn't too bad nowadays though R23 was quite poor in correlation.

In general, not only are Apple Silicon CPUs faster than AMD consumer CPUs, but they're 2-4x more power efficient as well.

AnthonyMouse 743 days ago

The problem with Geekbench is it's trying to average the scores from many different benchmarks, but then if some of them are outliers (e.g. one CPU has hardware acceleration or some other unusual aptitude for that specific workload), it gets an outsized score which is then averaged in and skews the result even if it doesn't generalize.

What you want to do is look at the benchmarks for the thing you're actually using it for.
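To illustrate the averaging concern with made-up numbers: the sub-scores below are hypothetical, not Geekbench data, and neither mean is claimed to be Geekbench's actual scoring formula. One accelerated outlier moves a composite score a lot even though four of the five workloads are identical.

```python
from statistics import fmean, geometric_mean

# Hypothetical sub-benchmark scores, relative to a baseline of 100.
# CPU B matches CPU A except for one hardware-accelerated workload.
cpu_a = [100, 100, 100, 100, 100]
cpu_b = [100, 100, 100, 100, 400]

for name, scores in (("A", cpu_a), ("B", cpu_b)):
    arithmetic = fmean(scores)
    geometric = geometric_mean(scores)
    print(f"CPU {name}: arithmetic mean {arithmetic:.1f}, geometric mean {geometric:.1f}")
```

A geometric mean dampens the outlier compared to an arithmetic mean, but either composite still rates CPU B above CPU A for a user whose workload is one of the four unaccelerated ones.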

> they're 2-4x more power efficient as well.

This is generally untrue; people come to this conclusion by comparing mobile CPUs with desktop CPUs. CPU power consumption is non-linear with performance, so a large power budget lets you eke out a tiny bit more margin. For example, compare the 65W 5700X with the 105W 5800X. The 40 extra watts buys you around 2% more single thread performance, not because the 5700X has a more efficient design -- they're the exact same CPU with a different power cap. It's because turning up the clock speed a tiny bit uses a lot more power, but desktop CPUs do it anyway, because they don't have any such thing as battery life and people want the extra tiny bit more. Or the CPU simply won't clock any higher and doesn't even hit the rated TDP on single-threaded workloads.

The extra power will buy you a lot more on multi-threaded workloads, because then you get linear performance improvement with more power by adding more cores. But that's where the high core count CPUs will mop the floor with everything else -- while achieving higher performance per watt, because the individual cores are clocked lower and use less power.
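The 5700X/5800X comparison works out as follows. This is a back-of-the-envelope sketch: it treats the 65W and 105W ratings as actual power draw (an assumption; as noted above, single-threaded loads often don't reach the rated TDP) and uses the roughly 2% single-thread gain quoted in the comment.

```python
# Perf-per-watt from the figures in the comment.
# Assumption: the TDP rating is taken as the actual power draw.
perf_5700x, watts_5700x = 1.00, 65
perf_5800x, watts_5800x = 1.02, 105   # ~2% faster single-threaded

eff_5700x = perf_5700x / watts_5700x
eff_5800x = perf_5800x / watts_5800x

extra_power = watts_5800x / watts_5700x - 1   # fraction of additional power
eff_ratio = eff_5700x / eff_5800x             # perf/W advantage of the 65W cap

print(f"{extra_power:.0%} more power for about 2% more performance")
print(f"the 65W part is {eff_ratio:.2f}x more efficient on this workload")
```

The same silicon looks far more "efficient" at the lower cap, which is why comparing a mobile chip against a desktop chip at its maximum power says little about the underlying design.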

> The problem with Geekbench is it's trying to average the scores from many different benchmarks, but then if some of them are outliers (e.g. one CPU has hardware acceleration or some other unusual aptitude for that specific workload), it gets an outsized score which is then averaged in and skews the result even if it doesn't generalize.

Geekbench CPU benchmark does not optimize for accelerators. It optimizes for instruction sets only.

The max power on Intel/AMD CPUs is only there to get the CPU "performance crown". As you've said, you spend a large amount of additional power for very minor gains (to be at the top of fancy Youtube review charts).

It looks as if AMD/Intel feel threatened by Snapdragon, though. We'll see what AMD Strix / Halo (or Lunar Lake) brings as the first meaningful x86 mobile processor in years.

hajile 743 days ago

Apple's M3 Max CPU cores peak out at 55w for 16 cores (12p+4e). AMD's 8-core U-series chips peak out at 65-70w on CPU-only workloads and still lose out massively in pretty much every category.

If you downclock that AMD chip, it does get more efficient, but also loses by even larger margins.

arp242 743 days ago

> I would stick to SPEC and Geekbench.

I will repeat:

"On Geekbench I gave up after scrolling a few pages."

SPEC doesn't seem to have easily browsable results, but we can find the Cinebench 2024 ones easily and guess what? Apple isn't at the top. Not even close: https://www.cgdirector.com/cinebench-2024-scores/

Geekbench has a separate page for each "instruction set".

For Apple you need to go to https://browser.geekbench.com/mac-benchmarks

Then compare numbers by hand I assume.

Though what I would love is compile-time vs. $ (as mentioned, I'm a software developer). The 7950x is $500 and a very fast SSD is $400, fast 64gb is $200, very good board is $400 so I get a very fast dev machine for ~$1700.

Here's GB6: https://browser.geekbench.com/v6/cpu/compare/6339005?baselin...

Note: M3 Max is a 40w CPU maximum, while 7950x is a 230w CPU maximum. The stated 170w max is usually deceptive from AMD.

Source for 7950x power consumption: https://www.anandtech.com/show/17641/lighter-touch-cpu-power....

Note that the M3 Max leads in ST in Cinebench 2024 and is 2-3x better in perf/watt. It does lose in MT in Cinebench 2024 but wins in GB6 MT.

Cinebench is usually x86 favored as it favors AVX over NEON as well as having extremely long dependency chains, bottlenecked by caches and partly memory. This is why you get a huge SMT yield from it and why it scales very highly if you throw lots of "weak" cores at it.

This is why Cinebench is a poor CPU benchmark in general as the vast majority of applications do not behave like Cinebench.

Geekbench and SPEC are more predictive of CPU speed.

"But the vast majority will use GPUs to do rendering for Blender."

And the argument is, you can't use Blender to compare CPU performance because of that?

"Even my multithread Go apps seem to run better on Apple Silicon."

As a Go developer, I'd love to hear your story: How much faster does your Apple Silicon compile compared to a Zen4 (e.g. the 7950x)? For example, 100k lines of Go code.

I might switch back to Apple again (used Apple for 20+ years), if it's faster at compilation speed.

hajile 743 days ago

M4 is looking pretty interesting. Near 10% IPC uplift and they bumped the e-core count on the base M4, so we're probably looking at the same 12 p-cores for the M4 Max, but likely going from 4 to 12 e-cores (two of the 6-core complexes).

In multithreaded workloads, 2 of their current e-cores are roughly equivalent to 1 p-core, so that would represent the equivalent of 4 extra p-cores.
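The arithmetic behind that estimate, as a small sketch. The 12 e-core M4 Max layout is the speculation from this comment, not a confirmed spec, and the 2:1 e-core to p-core ratio is the rough rule of thumb stated above.

```python
# Speculated M4 Max layout from the comment above (not a confirmed spec).
p_cores = 12
e_cores_now, e_cores_speculated = 4, 12
e_cores_per_p_core = 2   # "2 e-cores are roughly equivalent to 1 p-core"

# Extra e-cores expressed as p-core equivalents.
extra_p_core_equiv = (e_cores_speculated - e_cores_now) / e_cores_per_p_core

# Total multithreaded capacity in p-core equivalents, before any IPC uplift.
total_now = p_cores + e_cores_now / e_cores_per_p_core
total_speculated = p_cores + e_cores_speculated / e_cores_per_p_core

print(f"extra p-core equivalents: {extra_p_core_equiv:.0f}")
print(f"p-core equivalents: {total_now:.0f} -> {total_speculated:.0f}")
```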

resource_waste 743 days ago

> How much faster does your Apple Silicon compile compared to a Zen4 (e.g. the 7950x)?

Good ol' "compare a $400 piece of equipment with a $3000 piece of equipment". I wonder what will win. (Unironically, most of the time, the $3000 piece of equipment doesn't win.)

What is this $400 piece of equipment?

tsimionescu 743 days ago

On Geekbench, though they are segregated in separate pages so I'm not sure if the comparison is fully correct, the M2 Ultra is behind the top 3 PC processors (2 Intel and 1 AMD) for multi-core, and it is indeed the best at single core.

Yes, see the GB6 benchmarks for compilation:

    M2 Ultra   233.9 Klines/sec
    7950x      230.3 Klines/sec  
    14900K     215.3 Klines/sec  
    M3 Max     196.5 Klines/sec

are nearly the same.
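How close these figures are can be checked by normalizing against the fastest entry. A quick sketch using only the four Klines/sec values quoted above:

```python
# Geekbench 6 Clang throughput as quoted above (Klines/sec).
klines_per_sec = {
    "M2 Ultra": 233.9,
    "7950x": 230.3,
    "14900K": 215.3,
    "M3 Max": 196.5,
}

best = max(klines_per_sec.values())
relative = {cpu: rate / best for cpu, rate in klines_per_sec.items()}

# Print each CPU as a fraction of the fastest one, fastest first.
for cpu, frac in sorted(relative.items(), key=lambda kv: -kv[1]):
    print(f"{cpu:9s} {frac:6.1%} of the fastest")
```

On these numbers the M2 Ultra and 7950x are within about 2% of each other, while the M3 Max trails the leader by roughly 16%.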

rubin55 743 days ago

Would that be comparing to Windows on Zen4 or to Linux on Zen4? On Windows I've noticed that process creation (forking) in particular takes a big hit, which makes many dynamic-language workflows that repeatedly invoke a runtime binary 100s of times slower on Windows (tried with bash and python).
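A rough way to measure this on your own machines is to time how long it takes to launch a trivial child process. The sketch below is an illustration, not an established benchmark: `mean_spawn_seconds` is a made-up helper name, and it measures process creation plus interpreter startup together, not `fork` in isolation.

```python
import subprocess
import sys
import time

def mean_spawn_seconds(runs: int = 20) -> float:
    """Average wall-clock time to start and exit a trivial child interpreter."""
    start = time.perf_counter()
    for _ in range(runs):
        # Launch the same Python interpreter with a no-op program.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs

if __name__ == "__main__":
    print(f"{mean_spawn_seconds() * 1000:.1f} ms per process launch")
```

Running the same script on Windows and Linux on the same hardware gives a like-for-like number for the OS overhead being described.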

hajile 743 days ago

There's something wrong with your M3 Max stuff. I believe it comes in 14 and 16-core variants while the M3 Pro comes in 11 and 12-core variants.

In any case, M3 Max uses less than 55w of power in CPU-only workloads while a desktop 7950x peaked out at 332w of power according to Guru3D (without an OC).

The fact that M2 Ultra hits so close while peaking out at only around 100w of CPU power is pretty crazy (M2 Ultra doesn't even hit 300w with all CPU and GPU cores maxed out).

Yes, you're right, the M3 Max has 14/16 and the M3 Pro 11/12 cores.

wtallis 743 days ago

> Blender Benchmark

Maybe use a benchmark that actually makes sense for CPUs, rather than something that's always much faster on a GPU (e.g. the M3 Pro as any sane user would use it for Blender is 2.7x the performance of a Ryzen 7950X, not 0.4x).

> Apple Mac mini, 32gb, M2 Ultra, 2TB SSD, $2600

Not a real thing. You meant M2 Pro, because the Max and Ultra chips aren't available in the Mac mini.

I would love to quote a compiler comparison, but I don't know a good and accepted compiler benchmark. What would you use as a compiler benchmark? (Preferably Go, but I assume Rust would be better, as it is much slower, so the differences are bigger)

(corrected my c&p mistake with the mini, thanks)

phonon 743 days ago

You can look at the Geekbench 6 component.

https://www.geekbench.com/doc/geekbench6-benchmark-internals... (page 18)

Thanks, Clang looks good, now I need to check how to sort CPUs/systems by the Clang benchmark, no success for now.

"Randomly" picking

    14900K     215.3 Klines/sec  
    7950x      230.3 Klines/sec  
    M2 Ultra   233.9 Klines/sec
    M3 Max     196.5 Klines/sec

Single thread:

    M3 Max     3898
    7950x      2951

The ST advantage of Apple Silicon is real. 7950x does do better in highly parallel tasks.
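Putting numbers on both halves of that statement, using only figures quoted in this thread. Treating the Clang Klines/sec figures as the "highly parallel" case is an assumption; the thread does not label them as multi-core.

```python
# Figures as quoted in this thread; the single-thread values are unitless scores.
single_thread = {"M3 Max": 3898, "7950x": 2951}
clang_klines = {"M3 Max": 196.5, "7950x": 230.3}   # assumed multi-core

# Relative lead of the M3 Max in single thread.
st_advantage = single_thread["M3 Max"] / single_thread["7950x"] - 1
# Relative lead of the 7950x in the Clang throughput figures.
mt_advantage = clang_klines["7950x"] / clang_klines["M3 Max"] - 1

print(f"M3 Max single-thread lead: {st_advantage:.0%}")
print(f"7950x Clang throughput lead: {mt_advantage:.0%}")
```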

To me, Apple Silicon is clearly leading AMD/Intel in client. Hence my original reason for why AMD's announcement isn't "exciting": Apple Silicon is so far ahead in client.

Of course, AMD can crank up the core count via Epyc/Threadripper and Apple has no answer. For that, you'd need to look into ARM chips from Ampere/Amazon for a competitor.

JonChesterfield 743 days ago

Building clang using itself is a reasonable approximation to a compiler benchmark, speaking as someone who spends a depressing fraction of his life doing that over and over for permutations of the source code. That's somewhere in the five to ten minutes range on a decent single socket system.
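If you want to collect such numbers yourself, a minimal timing harness is enough. This is a sketch: `time_command` is a made-up helper, and the commented-out `ninja` invocation assumes an LLVM build directory named `build` that has already been configured with CMake and the Ninja generator, which is not something described in this thread.

```python
import subprocess
import sys
import time

def time_command(cmd: list[str]) -> float:
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

# Hypothetical usage for a clang self-build (assumes a configured ./build dir):
# elapsed = time_command(["ninja", "-C", "build", "clang"])

if __name__ == "__main__":
    # Harmless demo command so the script runs anywhere.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"{elapsed:.3f} s")
```

For a fair comparison across machines, start from a clean build directory each time and use the same source revision and compiler version.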

Do you know of a benchmark site that compares clang compilations for different systems (CPU/RAM/SSD)?

codewithcheese 743 days ago

Think chromium compile is widely used

Can you point me to a comparison site? Didn't find a M3/M2/7950/... comparison site for chromium compile times :-(

(Even Phoronix results are scarce and mostly focus on laptops - I have no laptop)

pquki4 743 days ago

There probably isn't a site that just compares chromium compilation time, but you can find the number in many YouTube and text reviews.

bee_rider 743 days ago

Is blender 100% GPU now? Last time I used it, there were multiple renderers available, and it wasn’t a 100% win to switch to GPU. IIRC the cpu did better in ray tracing(?). This was a couple years ago though so things may have changed or I might not be recalling correctly.

wtallis 742 days ago

I think GPU rendering was always faster as long as you had a supported GPU. Now that the Cycles renderer has support for all the major GPU APIs/vendors, the only reasons to render on the CPU are if you don't have a half-decent GPU, or if your scene doesn't fit in your GPU's memory. Neither of those are a concern on Apple systems.

At least on NVIDIA hardware, Blender can use the GPU's raytracing capabilities rather than just the general-purpose GPU compute capabilities. Which means it doesn't take a very expensive GPU at all to outperform high-end CPUs.

krasin 743 days ago

Until ARM has proper UEFI support, it's not a practical desktop/server platform, with a few notable exceptions (Mac, Raspberry Pi), and those work only because there's so much support from the respective vendors.

I know that there's some work happening about UEFI+ARM (https://developer.arm.com/Architectures/Unified%20Extensible...), but its support is very rare. The only example I can recall is Ampere Altra: https://www.jeffgeerling.com/blog/2023/ampere-altra-max-wind...

walterbell 743 days ago

Thanks to ex-Apple Nuvia/Oryon ("Qualcomm Snapdragon X Elite"), Arm laptops will launch in the next few months from Microsoft, Dell, HP, Lenovo, Asus and other OEMs, with UEFI support for Windows and in-progress support for Linux, https://news.ycombinator.com/item?id=40422286

bitwize 743 days ago

It's like I keep saying: the first Chinese manufacturer to churn out cheap SBCs with ServerReady support will make a killing as a true Pi killer. Anyone? Anyone? Pine64? Pine64?

esskay 743 days ago

We don't really need a Pi killer anymore. They've done a fine job of killing it themselves. Their community has shrunk massively due to low cost mini PCs being leaps and bounds better value than a Pi now. Their two fingers being put up at the hacker/tinkerer hobbyist market over the last few years combined with the IPO and shift to B2B has made it very clear where their priorities lie.

Unless you need the GPIO there's zero reason to overpay for a Pi 5 for example when you can pick up decent second hand mini PCs on eBay for a lower price.

Case in point, a couple of months ago I was able to nab two brand new still in box Dell Optiplex 3050s (Core i7 6700T 4 cores, 2.8GHz, 16GB RAM, Win 10 hardware license, 256GB SSD, with mouse & keyboard) for £55 each delivered. The base 4GB model Pi 5 comes in at £80-£100 once you add power, storage and a case.

Sure, it's not ARM but you're not likely to be doing anything that _needs_ ARM.

viraptor 743 days ago

> Unless you need the GPIO there's zero reason to overpay for a Pi 5

Even then, both USB-to-GPIO adapters and mini PCs with GPIO exist. Unless you want something really small, in which case there's still the Pi Zero and Arduino.

ZiiS 743 days ago

Try matching the size, power draw, and price of the Zero.

esskay 743 days ago

That's about the only area where the Pi is superior at this point. For the use cases where a Zero is all you need it's a no-brainer, but many are using them as home servers and such. Even a basic wireless camera feed can be a struggle for the Zero so its use cases are certainly limited, but its power to performance is great for its price.

ZiiS 743 days ago

Yes, the Zero 2 W is well worth it for wireless streaming. With MediaMTX's WebRTC you get great quality and really low latency.

moooo99 743 days ago

The Zero is a cool device and I found some uses for it, but realistically it’s not at all comparable to the Raspberry Pi people (used to) love

ZiiS 743 days ago

Cheap, low power, same GPIO, same great software support? I rarely use non-Zeros now; what is missing?

craftkiller 743 days ago

ARM needs more than just proper UEFI support: Microsoft needs to lift the Secure Boot restrictions on ARM.

x86: Microsoft requires that end-users are allowed to disable Secure Boot and control which keys are used.

ARM: Microsoft requires that end-users are not allowed to disable Secure Boot.

This isn't a hardware issue, but simply a policy issue that Microsoft could solve with a stroke of a pen, but since Microsoft is such a behemoth in the laptop space, their policies control the non-apple market.

source: https://mjg59.dreamwidth.org/23817.html