| HN Mirror

	by icosahedron 1710 days ago
	I'm not exactly proficient with GeekBenchery, but what I see here is that the M1 Max per core barely outperforms the M1? https://browser.geekbench.com/v5/cpu/compare/10496766?baseli...

5 comments

bichiliad 1710 days ago

I think this kinda makes sense to me — the M1 Max has the same cores as the M1, just more of them and more of the performant ones, if I understand it right. The fastest work on the fastest core, when only working on a single core, is probably very similar.

klelatti 1710 days ago

Maybe a little surprised - presumably the thermal limitations on a 16 inch laptop are potentially less limiting than on a 13 inch one so that single core could be pushed to a higher frequency?

hajile 1709 days ago

M1 uses TSMC high-density rather than high-performance. They get 40-60% better transistor density and less leakage (power consumption) at the expense of lower clockspeeds.

Also, a core is not necessarily just limited by power. There are often other considerations like pipeline length that affect final target clocks.

The fact is that at 3.2GHz, the M1 is very close to a 5800X in single-core performance. When that 5800X cranks up 8 cores, it dramatically slows down the clocks. Meanwhile the M1 should keep its max clockspeeds without any issue.

We know this because you can keep the 8 core M1 at max clocks for TEN MINUTES on passive cooling in the MacBook Air (you can keep max clocks indefinitely if you apply a little thermal pad on the inside of the case).

kllrnohj 1709 days ago

> When that 5800X cranks up 8 cores, it dramatically slows down the clocks

Not that dramatic, it drops from ~4.8GHz to ~4.4GHz: https://www.anandtech.com/show/16214/amd-zen-3-ryzen-deep-di...

The actual drop varies with power consumption & temperature, as Ryzen is more or less an entirely reactive system.
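
To put those two figures in perspective, here is a quick back-of-the-envelope check in Python. The function name is made up for illustration, and the inputs are only the approximate clocks quoted above, not measurements.

```python
def clock_drop_percent(single_core_ghz: float, all_core_ghz: float) -> float:
    """Relative clock reduction, in percent, from single-core to all-core boost."""
    return (single_core_ghz - all_core_ghz) / single_core_ghz * 100.0


# Approximate figures from the comment above (assumed, not measured here).
drop = clock_drop_percent(4.8, 4.4)
print(f"all-core clock is about {drop:.1f}% lower than single-core")
```

Under those numbers the all-core clock is roughly 8% below the single-core boost, and the real figure will move with power and temperature.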

klelatti 1709 days ago

Thanks - some very good points. Presumably this opens the possibility of higher single core performance on a future desktop design unless limited by pipeline length etc?

runeks 1707 days ago

> We know this because you can keep the 8 core M1 at max clocks for TEN MINUTES on passive cooling in the MacBook Air (you can keep max clocks indefinitely if you apply a little thermal pad on the inside of the case).

Is there a guide on how to apply this thermal pad?

I would love for my Air to not downclock.

Also why on earth does this thermal pad not come factory installed?

qubitcoder 1707 days ago

That’s why the MacBook Pro M1 has fans. It’s designed for harder workloads where you’re maxing out the cores (multi-core compilation, video encoding, etc) for extended periods. This was well-documented and discussed previously. The tradeoff is increased heat, power consumption, and fan noise.

Realistically, those aren’t typical workloads for most people, especially for 10+ minutes (and especially on an ultralight and portable laptop). So I wouldn’t lose sleep over thermal pads.

klelatti 1708 days ago

Maybe this is a reflection of the 16-inch model's higher thermal envelope?

https://www.macrumors.com/2021/10/21/new-macbook-pros-high-p...

O_H_E 1709 days ago

> M1 uses TSMC high-density rather than high-performance.

Is that an actual product differentiation from TSMC? Or just observational + the fact that it's 5nm.

I'm actually curious from a chip manufacturing standpoint.

hajile 1709 days ago

There are actually three major versions on N3 (sorry, I don’t have time to look up the similar articles on N5/7, but I think wikichip has a couple).

https://www.techdesignforums.com/blog/2021/06/03/three-libra...

I remember seeing a picture of a partner list for a TSMC node with a couple dozen libraries that not only changed based on density, but also on type of chip being built.

floatingatoll 1709 days ago

Does the core speed of an M1 core change at all? I thought they used the low/high power cores at fixed-limit clock speeds.

It sounds crazy to consider, but maybe they’d rather not try to speed up the individual cores in M1 Max, so that they can keep their overhead competitively low. That certainly would simplify manufacturing and QA; removing an entire vector (clock speed) from the binning process makes pipelines and platforms easier to maintain.

tedunangst 1709 days ago

For how long? Longer than it takes to run the benchmark?

icosahedron 1710 days ago

I thought I remembered that in the presentation they had souped up the individual cores too. Must be I'm misremembering.

sliken 1710 days ago

They didn't. However the cores enjoy more main memory bandwidth.

paulmd 1709 days ago

they're the same cores as the original M1 architecturally, you just get an 8+2 on a Max instead of 4+4... and also an absolutely titanic iGPU, total transistor count is about twice that of a 3090 and almost all of it lives in the iGPU.

the only architectural difference between M1 and M1 Max in the CPU cores, besides the different combination of big/little cores, is that the M1 Max goes from quad channel to hexadeca-channel (16-channel) DDR5 RAM (note a DDR5 channel is half the width of a DDR4 channel, but has longer burst and higher MT/s).

Apple's iGPU approach is real simple: unlike a console where you somewhat gimp the CPU by putting it on high-latency GDDR6 (to get enough bandwidth to feed the iGPU), they just put 16 channels of DDR5 on it, basically it's like an octochannel DDR4 server processor in terms of bandwidth. And the CPU also benefits from that as well, at least a little bit, but internally it's the same core design.

It's an absolute meme of a design, Apple just does not give a single fuck about cost here, "server class memory configuration"? sure why not, and we'll stack it on the package so it can go in a laptop.
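
The channel arithmetic in this comment can be sketched in a few lines of Python. This is only a rough illustration: the channel counts and relative widths come from the comment, while the 32-bit and 64-bit channel widths and both data rates (6400 MT/s and 3200 MT/s) are assumptions added here, and the helper names are made up.

```python
def bus_width_bits(channels: int, bits_per_channel: int) -> int:
    """Total data bus width across all channels."""
    return channels * bits_per_channel


def peak_bandwidth_gb_s(channels: int, bits_per_channel: int, mt_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits(channels, bits_per_channel) / 8
    return bytes_per_transfer * mt_per_s * 1e6 / 1e9


# 16 half-width channels (assumed 32 bits each) vs. 8 full-width channels (assumed 64 bits each).
print(bus_width_bits(16, 32), bus_width_bits(8, 64))  # same total width

# Data rates below are assumptions, not figures from the thread.
print(peak_bandwidth_gb_s(16, 32, 6400))  # 16-channel configuration at an assumed 6400 MT/s
print(peak_bandwidth_gb_s(8, 64, 3200))   # 8-channel DDR4 at an assumed 3200 MT/s
```

The two layouts end up with the same total bus width, so any bandwidth gap between them comes from the data rate rather than the channel count.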

vondro 1709 days ago

I believe it's LPDDR5, basically a memory chip for smartphones, not DDR5.

paulmd 1707 days ago

sure, but it's also running at JEDEC either way, it's actually irrelevant to performance, it's just a matter of packaging (obviously for laptops, stacking the packages is more convenient than DIMMs or soldered modules).

vmception 1709 days ago

All I wanted was an M1 that could address/use more memory. The first one could only use 16GB of RAM, which was a dealbreaker for me.

Apple delivered that and so much more. I'm happy. I think there are many people like me.

bdcravens 1709 days ago

Multi-display is another highly sought feature this M1 Pro/Max delivers (achievable on the M1 with a DisplayLink hub, but stability leaves something to be desired)

GeekyBear 1709 days ago

Tick - New cores; Tock - Scaling up the number of those same cores

I think most of the work went into the uncore portions of the SOC this time.

als0 1709 days ago

FYI it's the reverse. "Tock" is a new microarchitecture, and "tick" is a process shrink.

https://en.wikipedia.org/wiki/Tick–tock_model

lern_too_spel 1709 days ago

Also, scaling the number of cores up and down for different tasks happens in both. Apple's recent announcement doesn't fit the tick-tock model at all.

GeekyBear 1709 days ago

In this case, both are using the same CPU core designs and are on the same node.

However, they are following the notion of "don't change everything at once".

sydthrowaway 1709 days ago

uncore?

als0 1709 days ago

Parts of the SoC that are not the main CPUs e.g. power management controllers, display controllers, etc.

andy_ppp 1709 days ago

The A15 chip has core improvements. I suspect this is what we’ll see from now on: a 15-20% performance increase every year for the next few years, assuming no issues with TSMC…
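
For a sense of how a 15-20% yearly gain compounds, here is a small Python sketch. The three-year horizon is an arbitrary stand-in for "the next few years", and the function name is hypothetical.

```python
def cumulative_speedup(yearly_gain: float, years: int) -> float:
    """Compound a fixed yearly performance gain over a number of years."""
    return (1.0 + yearly_gain) ** years


YEARS = 3  # assumed horizon, not a figure from the thread
for gain in (0.15, 0.20):
    total = cumulative_speedup(gain, YEARS)
    print(f"{gain:.0%} per year for {YEARS} years -> {total:.2f}x overall")
```

On those assumptions, three years of such gains would land somewhere between about 1.5x and 1.7x the starting performance.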