by icosahedron 1710 days ago
I'm not exactly proficient with GeekBenchery, but what I see here is that the M1 Max per core barely outperforms the M1?

https://browser.geekbench.com/v5/cpu/compare/10496766?baseli...

5 comments

I think this kinda makes sense to me — the M1 Max has the same cores as the M1, just more of them and more of the performant ones, if I understand it right. The fastest work on the fastest core, when only working on a single core, is probably very similar.
I'm a little surprised, though: presumably the thermal limits on a 16-inch laptop are less restrictive than on a 13-inch one, so a single core could be pushed to a higher frequency?
M1 uses TSMC high-density rather than high-performance. They get 40-60% better transistor density and less leakage (power consumption) at the expense of lower clockspeeds.

Also, a core is not necessarily just limited by power. There are often other considerations like pipeline length that affect final target clocks.

The fact is that at 3.2GHz, the M1 is very close to a 5800X in single-core performance. When that 5800X cranks up 8 cores, it dramatically slows down the clocks. Meanwhile the M1 should keep its max clockspeeds without any issue.

We know this because you can keep the 8 core M1 at max clocks for TEN MINUTES on passive cooling in the MacBook Air (you can keep max clocks indefinitely if you apply a little thermal pad on the inside of the case).

> When that 5800X cranks up 8 cores, it dramatically slows down the clocks

Not that dramatic, it drops from ~4.8GHz to ~4.4GHz: https://www.anandtech.com/show/16214/amd-zen-3-ryzen-deep-di...

The actual drop varies with power consumption and temperature, as Ryzen is more or less an entirely reactive system.
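To put the ~4.8 to ~4.4 figures in relative terms, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the inputs are only the approximate clocks quoted above, not measurements.

```python
def clock_drop_percent(single_core_ghz: float, all_core_ghz: float) -> float:
    """Return the all-core clock reduction as a percentage of the single-core clock."""
    return (single_core_ghz - all_core_ghz) / single_core_ghz * 100.0


# Approximate clocks quoted in the comment above (GHz).
drop = clock_drop_percent(4.8, 4.4)
print(f"All-core clock drop: {drop:.1f}%")
```

That works out to a drop of a bit over 8%, which is why "dramatically" looks like an overstatement.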

Thanks - some very good points. Presumably this opens the possibility of higher single core performance on a future desktop design unless limited by pipeline length etc?
> We know this because you can keep the 8 core M1 at max clocks for TEN MINUTES on passive cooling in the Macbook air (you can keep max clocks indefinitely if you apply a little thermal pad on the inside of the case).

Is there a guide on how to apply this thermal pad?

I would love for my Air to not downclock.

Also why on earth does this thermal pad not come factory installed?

That’s why the MacBook Pro M1 has fans. It’s designed for harder workloads where you’re maxing out the cores (multi-core compilation, video encoding, etc) for extended periods. This was well-documented and discussed previously. The tradeoff is increased heat, power consumption, and fan noise.

Realistically, those aren’t typical workloads for most people, especially for 10+ minutes (and especially on an ultralight and portable laptop). So I wouldn’t lose sleep over thermal pads.

Maybe this is a reflection of the 16-inch's higher thermal envelope?

https://www.macrumors.com/2021/10/21/new-macbook-pros-high-p...

> M1 uses TSMC high-density rather than high-performance.

Is that an actual product differentiation from TSMC? Or is it just an observation plus the fact that it's 5nm?

I'm actually curious from a chip manufacturing standpoint.

There are actually three major library versions on N3 (sorry, I don’t have time to look up the similar articles on N5/7, but I think wikichip has a couple).

https://www.techdesignforums.com/blog/2021/06/03/three-libra...

I remember seeing a picture of a partner list for a TSMC node with a couple dozen libraries that not only changed based on density, but also on type of chip being built.

Does the core speed of an M1 core change at all? I thought they used the low/high power cores at fixed-limit clock speeds.

It sounds crazy to consider, but maybe they’d rather not try to speed up the individual cores in M1 Max, so that they can keep their overhead competitively low. That certainly would simplify manufacturing and QA; removing an entire vector (clock speed) from the binning process makes pipelines and platforms easier to maintain.

For how long? Longer than it takes to run the benchmark?
I thought I remembered that in the presentation they had souped up the individual cores too. I must be misremembering.
They didn't. However the cores enjoy more main memory bandwidth.
they're the same cores as the original M1 architecturally, you just get an 8+2 on a Max instead of 4+4... and also an absolutely titanic iGPU, total transistor count is about twice that of a 3090 and almost all of it lives in the iGPU.

the only architectural difference between M1 and M1 Max in the CPU cores, besides the different combination of big/little cores, is that the M1 Max goes from quad channel to 16-channel DDR5 RAM (note a DDR5 channel is half the width of a DDR4 channel, but has longer burst and higher MT/s).

Apple's iGPU approach is real simple: unlike a console where you somewhat gimp the CPU by putting it on high-latency GDDR6 (to get enough bandwidth to feed the iGPU), they just put 16 channels of DDR5 on it, basically it's like an octochannel DDR4 server processor in terms of bandwidth. And the CPU also benefits from that as well, at least a little bit, but internally it's the same core design.
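For a rough sense of that comparison, peak theoretical bandwidth is channels times channel width times transfer rate. A minimal sketch follows. The function name is hypothetical, the 32-bit and 64-bit channel widths follow the "half the width" description above, and the 6400 MT/s and 3200 MT/s transfer rates are my own assumptions, not figures from this thread.

```python
def peak_bandwidth_gb_s(channels: int, channel_width_bits: int,
                        mega_transfers_per_s: float) -> float:
    """Peak theoretical memory bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = channels * channel_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


# 16 half-width (32-bit) channels; 6400 MT/s is an assumed transfer rate.
wide_laptop = peak_bandwidth_gb_s(16, 32, 6400)

# 8 full-width (64-bit) DDR4 channels; 3200 MT/s is an assumed transfer rate.
octo_ddr4 = peak_bandwidth_gb_s(8, 64, 3200)

print(f"16 x 32-bit: {wide_laptop:.1f} GB/s")
print(f" 8 x 64-bit: {octo_ddr4:.1f} GB/s")
```

Both configurations come to the same 512-bit total bus width, so under these assumptions any bandwidth difference comes purely from the transfer rate.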

It's an absolute meme of a design, Apple just does not give a single fuck about cost here, "server class memory configuration"? sure why not, and we'll stack it on the package so it can go in a laptop.

I believe it's LPDDR5, basically a memory chip for smartphones, not DDR5.
sure, but it's also running at JEDEC either way, it's actually irrelevant to performance, it's just a matter of packaging (obviously for laptops, stacking the packages is more convenient than DIMMs or soldered modules).
All I wanted was an M1 that could address/use more memory. The first one could only use 16GB of RAM, which was a dealbreaker for me.

Apple delivered that and so much more. I'm happy. I think there are many people like me.

Multi-display is another highly sought feature this M1 Pro/Max delivers (achievable on the M1 with a DisplayLink hub, but stability leaves something to be desired)
Tick - New cores; Tock - Scaling up the number of those same cores

I think most of the work went into the uncore portions of the SOC this time.

FYI it's the reverse. "Tock" is a new microarchitecture, and "tick" is a process shrink.

https://en.wikipedia.org/wiki/Tick–tock_model

Also, scaling the number of cores up and down for different tasks happens in both. Apple's recent announcement doesn't fit the tick-tock model at all.
In this case, both are using the same CPU core designs and are on the same node.

However, they are following the notion of "don't change everything at once".

uncore?
Parts of the SoC that are not the main CPUs e.g. power management controllers, display controllers, etc.
The A15 chip has core improvements. I suspect this is what we’ll see every year from now on: a 15-20% performance increase yearly for the next few years, assuming no issues with TSMC…
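As a quick illustration of what a 15-20% yearly increase would compound to, here is a minimal sketch. The function name and the three-year horizon are arbitrary choices for illustration, and this is a projection of the guess above, not a prediction.

```python
def compound_gain(yearly_rate: float, years: int) -> float:
    """Cumulative performance multiplier after compounding a fixed yearly gain."""
    return (1.0 + yearly_rate) ** years


# Low and high ends of the 15-20% range, compounded over three years.
for rate in (0.15, 0.20):
    print(f"{rate:.0%} per year for 3 years -> {compound_gain(rate, 3):.2f}x")
```

At those rates, three generations would land at roughly 1.5x to 1.7x the starting single-core performance.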