It's because x86 chips are no longer leading in the client. ARM chips are. Specifically, Apple chips. Though Qualcomm has huge potential to leapfrog AMD/Intel chips in a few generations too.
AMD Ryzen 9 7950X (16 core) 560.8
Apple M2 Ultra (24 cores) 501.82
Apple M3 Max (12 cores) 408.27
Apple M3 Pro 226.46
Apple M3 160.58
It depends on what you're doing.
I'm a software developer using a compiler that 100%s all cores. I like fast multicore.
Apple Mac Pro, 64gb, M2 Ultra, $7000
Apple Mac mini, 32gb, M2 Pro, 2TB SSD, $2600
[Edit2] Compare to: the 7950X is $500, a very fast SSD is $400, fast 64GB RAM is $200, and a very good board is $400, so I get a very fast dev machine for ~$1700 (0.329 points/$ vs. 0.077 points/$ for the mini).
Blender may have an optimization for AVX-512 but not for SME or NEON, though.
But the vast majority will use GPUs to do rendering for Blender.
Try SPEC or its close consumer counterpart, Geekbench.
As an anecdote, all my Python and Node.js applications run faster on Apple Silicon than Zen4. Even my multithread Go apps seem to run better on Apple Silicon.
Now you're just asserting things unencumbered by even the slightest evidence.
On Passmark Apple CPUs are pretty far down the list.
On Geekbench I gave up after scrolling a few pages.
And "run faster on Apple Silicon than Zen4" means nothing. On the low end you have fairly cheap Ryzen 3 laptop chips, and on the high end you have Threadripper behemoths.
The problem with Geekbench is it's trying to average the scores from many different benchmarks, but then if some of them are outliers (e.g. one CPU has hardware acceleration or some other unusual aptitude for that specific workload), it gets an outsized score which is then averaged in and skews the result even if it doesn't generalize.
What you want to do is look at the benchmarks for the thing you're actually using it for.
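To make the averaging concern concrete, here is a small sketch with made-up numbers. The scores are hypothetical and this does not model how Geekbench actually aggregates its subtests. It only shows how a single outlier workload moves different kinds of averages.

```python
from statistics import mean, median, geometric_mean

# Hypothetical per-workload scores, normalised so the baseline CPU is 1.0.
# CPU B matches the baseline everywhere except one hardware-accelerated test.
cpu_a = [1.0, 1.0, 1.0, 1.0, 1.0]
cpu_b = [1.0, 1.0, 1.0, 1.0, 6.0]

def summarize(scores):
    """Return (arithmetic mean, geometric mean, median) of the scores."""
    return mean(scores), geometric_mean(scores), median(scores)

for name, scores in (("A", cpu_a), ("B", cpu_b)):
    am, gm, med = summarize(scores)
    print(f"CPU {name}: arithmetic={am:.2f} geometric={gm:.2f} median={med:.2f}")
```

With these numbers the arithmetic mean says CPU B is twice as fast, the geometric mean is pulled up less, and the median shows no difference at all. That is why looking at the individual subtest for your own workload is the safer move.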
> they're 2-4x more power efficient as well.
This is generally untrue. People come to this conclusion by comparing mobile CPUs with desktop CPUs. CPU power consumption is non-linear with performance, so a large power budget lets you eke out a tiny bit more margin. For example, compare the 65W 5700X with the 105W 5800X. The 40 extra watts buy you around 2% more single-thread performance, not because the 5700X has a more efficient design -- they're the exact same CPU with a different power cap. It's because turning up the clock speed a tiny bit uses a lot more power, but desktop CPUs do it anyway, because they don't have any such thing as battery life and people want that extra tiny bit. Or the CPU simply won't clock any higher and doesn't even hit the rated TDP on single-threaded workloads.
The extra power will buy you a lot more on multi-threaded workloads, because then you get linear performance improvement with more power by adding more cores. But that's where the high core count CPUs will mop the floor with everything else -- while achieving higher performance per watt, because the individual cores are clocked lower and use less power.
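The 5700X/5800X comparison above can be turned into a quick perf-per-watt calculation. This is only a sketch under stated assumptions: the "around 2%" gain is taken as exactly 2%, and both chips are assumed to draw their full rated power, which (as noted above) may not happen on single-threaded work.

```python
# Figures from the comment above: same silicon, different power caps.
# Single-thread performance is normalised to the 65 W part.
# Assumptions: exactly 2% gain, and each chip draws its full rated power.
chips = {
    "5700X": {"watts": 65, "perf": 1.00},
    "5800X": {"watts": 105, "perf": 1.02},
}

def perf_per_watt(chip):
    """Normalised performance divided by rated power."""
    return chip["perf"] / chip["watts"]

base = perf_per_watt(chips["5700X"])
for name, chip in chips.items():
    rel = perf_per_watt(chip) / base
    print(f"{name}: {chip['watts']} W, relative perf/W = {rel:.2f}")
```

Under those assumptions the higher-capped part lands at roughly 0.63x the perf/W of the lower-capped one, even though the core design is identical.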
> The problem with Geekbench is it's trying to average the scores from many different benchmarks, but then if some of them are outliers (e.g. one CPU has hardware acceleration or some other unusual aptitude for that specific workload), it gets an outsized score which is then averaged in and skews the result even if it doesn't generalize.
Geekbench CPU benchmark does not optimize for accelerators. It optimizes for instruction sets only.
The max power on Intel/AMD CPUs is only there to get the CPU "performance crown". As you've said, you spend a large amount of additional power for very minor gains (to be at the top of fancy YouTube review charts).
It does look as if AMD/Intel feel threatened by Snapdragon, though - we'll see what AMD Strix / Halo (or Lunar Lake) brings as the first meaningful x86 mobile processor in years.
Apple's M3 Max CPU cores peak out at 55w for 16 cores (12p+4e). AMD's 8-core U-series chips peak out at 65-70w on CPU-only workloads and still lose out massively in pretty much every category.
If you downclock that AMD chip, it does get more efficient, but also loses by even larger margins.
"On Geekbench I gave up after scrolling a few pages."
SPEC doesn't seem to have easily browsable results, but we can find the Cinebench 2024 ones easily, and guess what? Apple isn't at the top. Not even close: https://www.cgdirector.com/cinebench-2024-scores/
Though what I would love is compile-time vs. $ (as mentioned, I'm a software developer). The 7950x is $500 and a very fast SSD is $400, fast 64gb is $200, very good board is $400 so I get a very fast dev machine for ~$1700.
Note that the M3 Max leads in ST in Cinebench 2024 and is 2-3x better in perf/watt. It does lose in MT in Cinebench 2024 but wins in GB6 MT.
Cinebench usually favors x86, as it favors AVX over NEON and has extremely long dependency chains, bottlenecked by caches and partly by memory. This is why you get a huge SMT yield from it and why it scales so well if you throw lots of "weak" cores at it.
This is why Cinebench is a poor CPU benchmark in general as the vast majority of applications do not behave like Cinebench.
Geekbench and SPEC are more predictive of CPU speed.
"But the vast majority will use GPUs to do rendering for Blender."
And the argument is, you can't use Blender to compare CPU performance because of that?
"Even my multithread Go apps seem to run better on Apple Silicon."
As a Go developer, I'd love to hear your story: How much faster does your Apple Silicon compile compared to a Zen4 (e.g. the 7950x)? For example 100k lines of Go code.
I might switch back to Apple again (used Apple for 20+ years), if it's faster at compilation speed.
M4 is looking pretty interesting. Near 10% IPC uplift and they bumped the e-core on the base M4, so we're probably looking at the same 12 p-cores for the M4 Max, but likely going from 4 to 12 e-cores (two of the 6-core complexes).
In multithreaded workloads, 2 of their current e-cores are roughly equivalent to 1 p-core, so that would represent the equivalent of 4 extra p-cores.
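The rule of thumb above works out as follows. This is a back-of-the-envelope sketch: the 2-to-1 ratio is the commenter's estimate, the M4 Max core layout is speculation rather than a confirmed spec, and `p_core_equivalents` is a made-up helper name.

```python
# Rule of thumb from the comment above: in multithreaded work,
# 2 e-cores are roughly worth 1 p-core (an estimate, not a measurement).
E_CORES_PER_P_CORE = 2

def p_core_equivalents(p_cores: int, e_cores: int) -> float:
    """Express a p-core/e-core mix as an approximate number of p-cores."""
    return p_cores + e_cores / E_CORES_PER_P_CORE

m3_max = p_core_equivalents(12, 4)          # 12p + 4e
m4_max_guess = p_core_equivalents(12, 12)   # 12p + 12e (speculative layout)
print(m3_max, m4_max_guess, m4_max_guess - m3_max)
```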
> How much faster does your Apple Silicon compile compared to a Zen4 (e.g. the 7950x)?
Good ol' "compare a $400 piece of equipment with a $3000 piece of equipment". I wonder what will win. (Unironically, most of the time the $3000 piece of equipment doesn't win.)
On Geekbench, the M2 Ultra is behind the top 3 PC processors (2 Intel and 1 AMD) for multi-core, and it is indeed the best at single-core. The results are split across separate pages, though, so I'm not sure the comparison is fully correct.
Would that be comparing to Windows on Zen4 or to Linux on Zen4? On Windows I've noticed that forking performance in particular takes a big hit, which makes many dynamic-language workloads that repeatedly invoke a runtime binary 100s of times slower on Windows (tried with bash and Python).
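A rough way to check process-launch overhead on your own machine is to start a trivial child process in a loop and count launches per second. This is a minimal sketch, not a rigorous benchmark. It measures process creation plus interpreter startup rather than `fork()` in isolation, and `spawn_rate` is a name invented for this example.

```python
import subprocess
import sys
import time

def spawn_rate(runs: int = 20) -> float:
    """Launch a trivial child interpreter `runs` times; return launches per second."""
    cmd = [sys.executable, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    return runs / elapsed

if __name__ == "__main__":
    print(f"{spawn_rate():.1f} process launches per second")
```

Running the same script on Windows and on Linux on the same hardware would show how much of a difference comes from the OS rather than the CPU.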
There's something wrong with your M3 Max stuff. I believe it comes in 14 and 16-core variants while the M3 Pro comes in 11 and 12-core variants.
In any case, M3 Max uses less than 55w of power in CPU-only workloads while a desktop 7950x peaked out at 332w of power according to Guru3D (without an OC).
The fact that M2 Ultra hits so close while peaking out at only around 100w of CPU power is pretty crazy (M2 Ultra doesn't even hit 300w with all CPU and GPU cores maxed out).
Maybe use a benchmark that actually makes sense for CPUs, rather than something that's always much faster on a GPU (eg. M3 Pro as any sane user would use it for Blender is 2.7x the performance of a Ryzen 7950X, not 0.4x).
> Apple Mac mini, 32gb, M2 Ultra, 2TB SSD, $2600
Not a real thing. You meant M2 Pro, because the Max and Ultra chips aren't available in the Mac mini.
I would love to quote a compiler comparison, but I don't know a good and accepted compiler benchmark. What would you use as a compiler benchmark? (Preferably Go, but I assume Rust would be better, as it is much slower, so the differences are bigger)
The ST advantage of Apple Silicon is real. 7950x does do better in highly parallel tasks.
To me, Apple Silicon is clearly leading in client over AMD/Intel. Hence my original reason for why AMD's announcement isn't "exciting": Apple Silicon is so far ahead in client.
Of course, AMD can crank up the core count via Epyc/Threadripper and Apple has no answer. For that, you'd need to look into ARM chips from Ampere/Amazon for a competitor.
Building clang using itself is a reasonable approximation to a compiler benchmark, speaking as someone who spends a depressing fraction of his life doing that over and over for permutations of the source code. That's somewhere in the five to ten minutes range on a decent single socket system.
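If you want to compare machines on a build like that, a small timing harness is enough. The sketch below is one possible way to do it, not an accepted benchmark. The command shown is a placeholder so the script runs anywhere, and `time_command` is a hypothetical helper. Swap in your real build (for example `go build ./...` or a clang self-build) and clear any build cache between runs.

```python
import statistics
import subprocess
import sys
import time

def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations

if __name__ == "__main__":
    # Placeholder command (assumption): replace with your actual build,
    # e.g. ["go", "build", "./..."], and clean caches between runs.
    build_cmd = [sys.executable, "-c", "pass"]
    times = time_command(build_cmd)
    print(f"median {statistics.median(times):.3f}s over {len(times)} runs")
```

Reporting the median over a few runs keeps one slow outlier (cold disk cache, background task) from dominating the result.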
Is Blender 100% GPU now? Last time I used it, there were multiple renderers available, and it wasn’t a 100% win to switch to GPU. IIRC the CPU did better in ray tracing(?). This was a couple of years ago though, so things may have changed or I might not be recalling correctly.
I think GPU rendering was always faster as long as you had a supported GPU. Now that the Cycles renderer has support for all the major GPU APIs/vendors, the only reasons to render on the CPU are if you don't have a half-decent GPU, or if your scene doesn't fit in your GPU's memory. Neither of those is a concern on Apple systems.
At least on NVIDIA hardware, Blender can use the GPU's raytracing capabilities rather than just the general-purpose GPU compute capabilities. Which means it doesn't take a very expensive GPU at all to outperform high-end CPUs.
Until ARM has proper UEFI support, it's not a practical desktop/server platform, with a few notable exceptions (Mac, Raspberry Pi), and those only because there's so much support from the respective vendors.
Thanks to ex-Apple Nuvia/Oryon ("Qualcomm Snapdragon X Elite"), Arm laptops will launch in the next few months from Microsoft, Dell, HP, Lenovo, Asus and other OEMs, with UEFI support for Windows and in-progress support for Linux, https://news.ycombinator.com/item?id=40422286
It's like I keep saying: the first Chinese manufacturer to churn out cheap SBCs with ServerReady support will make a killing as a true Pi killer. Anyone? Anyone? Pine64? Pine64?
We don't really need a Pi killer anymore. They've done a fine job of killing it themselves. Their community has shrunk massively because low-cost mini PCs are now leaps and bounds better value than a Pi. Their two fingers up at the hacker/tinkerer hobbyist market over the last few years, combined with the IPO and the shift to B2B, have made it very clear where their priorities lie.
Unless you need the GPIO, there's zero reason to overpay for a Pi 5, for example, when you can pick up decent second-hand mini PCs on eBay for a lower price.
Case in point: a couple of months ago I was able to nab two brand-new, still-in-box Dell Optiplex 3050s (Core i7 6700T, 4 cores, 2.8GHz, 16GB RAM, Win 10 hardware license, 256GB SSD, with mouse & keyboard) for £55 each delivered. The base 4GB model Pi 5 comes in at £80-£100 once you add power, storage and a case.
Sure, it's not ARM, but you're not likely to be doing anything that _needs_ ARM.
That's about the only area where the Pi is superior at this point. For the use cases where a Zero is all you need it's a no-brainer, but many are using them as home servers and such. Even a basic wireless camera feed can be a struggle for the Zero, so its use cases are certainly limited, but its power-to-performance ratio is great for its price.
ARM needs more than just proper UEFI support: Microsoft needs to lift the Secure Boot restrictions on ARM.
x86: Microsoft requires that end-users are allowed to disable Secure Boot and control which keys are used.
ARM: Microsoft requires that end-users are not allowed to disable Secure Boot.
This isn't a hardware issue, just a policy issue that Microsoft could solve with the stroke of a pen. But since Microsoft is such a behemoth in the laptop space, their policies control the non-Apple market.
You're misguided.
Apple has excellent notebook CPUs. Apple has great IPC. But AMD and Intel easily have faster CPUs.
https://opendata.blender.org/benchmarks/query/?compute_type=...
Blender Benchmark
[Edit] Made a c&p mistake, the mini has no Ultra.