by hajile 732 days ago
They don't need the entire Mac. Their cost per Max chip is probably $200-300 which beats the 4090 by a massive margin and each chip can do more than a 4090 because it also has a CPU onboard.

4090 peaks out at around 550w which means they can run 5+ of their Max chips in the same power budget.

A 4090 is $2000. Apple can probably get 5 chips on a custom motherboard for that cost. They'll use the same amount of power, but get a lot more raw compute.
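
For anyone who wants to check the arithmetic, here is a rough sketch. Every number is an estimate quoted in this comment rather than a measurement, and the variable names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Every number is an estimate from the comment, not a measurement.
RTX_4090_PEAK_WATTS = 550        # claimed 4090 peak draw
RTX_4090_PRICE_USD = 2000        # claimed 4090 price
MAX_CHIP_COST_USD = (200, 300)   # claimed Apple cost per Max chip (low, high)
CHIPS_PER_BOARD = 5              # chips claimed to fit in the same budget

# Power each chip may draw if 5 of them share the 4090's peak budget.
watts_per_chip = RTX_4090_PEAK_WATTS / CHIPS_PER_BOARD

# Money left for the custom motherboard, RAM, etc. after paying for 5 chips.
board_budget_best = RTX_4090_PRICE_USD - CHIPS_PER_BOARD * MAX_CHIP_COST_USD[0]
board_budget_worst = RTX_4090_PRICE_USD - CHIPS_PER_BOARD * MAX_CHIP_COST_USD[1]

print(f"Per-chip power allowance: {watts_per_chip:.0f} W")
print(f"Left over for the board: ${board_budget_worst}-${board_budget_best}")
```

Under those assumptions each chip gets about 110 W, and five chips leave somewhere between $500 and $1000 for the rest of the board.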

2 comments

> Their cost per Max chip is probably $200-300 which beats the 4090 by a massive margin...

That's true. I was talking about end user pricing.

> ...each chip can do more than a 4090 because it also has a CPU onboard.

That's a strange thing to say. It has a CPU, correct. That makes the chip more versatile, but for data center ML tasks it doesn't really matter. A 4090 also has much more ML-relevant compute per chip. So Apple's chips can't really "do more than a 4090" in any relevant way.

Of course Apple pays less for its in-house chips than for external products. That comparison doesn't seem relevant to the context, e.g. they're not going to challenge CUDA with internal chips.

They might get more compute per watt though. My guess is that Nvidia's datacenter chips are competitive in that space, but that's another story.

The GPU in the M-series is much slower than a 4090. 4060-4070ish performance at best, and it varies quite a bit.

You need to consider this in the context of the relevant task. Nvidia GPUs have extremely high peak performance for GEMM, but when working with LLMs, bandwidth (and RAM capacity) becomes the limiting factor. There is a reason why real ML-focused datacenter Nvidia GPUs use much wider RAM interfaces and a much higher price point. The M2 Ultra might not have the raw compute, but it has a lot of RAM and large caches.

If they can get 5 4070s for the price and power of one 4090, that's a win for them as they'll get more performance per dollar and per watt.

> and per watt

Part of the advantage of using "one 4090" is that the max TDP is only 450w, as opposed to 5 M2 Ultras running at ~150w each. When you scale up to Nvidia's latest Blackwell architecture, I genuinely don't know how Apple could beat them on performance-per-watt. Buying M2 Ultras wholesale is probably cheaper than an NVL72 cluster, but certainly not what you'd want to use for Linux or maximizing AI-based performance-per-watt.

You are missing the point. We're discussing if Apple can use their own chips more cheaply than buying Nvidia's chips.

The max TDP is not actual peak power consumption. Gamers Nexus recorded 500w peak and almost 670w overclocked. Most reviews I've looked at seem to put peak power consumption around 550w.

M2 Ultra wasn't even mentioned and it uses more than 150w. The correct question would be about M3 Max as we have solid numbers on it. M3 Max uses around 100w when both the GPU and CPU are heavily utilized and less than that when only the GPU is used.

This means that Apple could run 5 of their M3 Max chips in the same peak power as the 4090. But wait, there's more. 4090 doesn't run in a vacuum. It requires a separate CPU setup and a couple hundred more watts.

That means we could power 7 or so M3 Max chips with that same amount of power.
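
A quick sketch of that power math, for what it's worth. The 200 W host figure is my reading of "a couple hundred more watts", so treat it as an assumption; the other numbers are the ones quoted in this comment, and `chips_in_budget` is just an illustrative helper.

```python
# Rough power-budget comparison using the figures from this comment.
GPU_PEAK_WATTS = 550   # 4090 peak draw cited from reviews
HOST_WATTS = 200       # assumption: "a couple hundred" watts for the host CPU setup
M3_MAX_WATTS = 100     # quoted M3 Max draw with CPU and GPU both loaded

def chips_in_budget(budget_watts: int, chip_watts: int) -> int:
    """Whole number of chips that fit inside a power budget."""
    return budget_watts // chip_watts

gpu_only = chips_in_budget(GPU_PEAK_WATTS, M3_MAX_WATTS)
with_host = chips_in_budget(GPU_PEAK_WATTS + HOST_WATTS, M3_MAX_WATTS)

print(f"Chips in the 4090's budget alone: {gpu_only}")
print(f"Chips once the host system is counted: {with_host}")
```

With these inputs the count comes out to 5 chips against the GPU alone and 7 once the host system is included.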

Of course, this isn't the whole story. 4090 isn't a professional chip either (while Apple can bin and certify their own CPUs and know they're getting a server-grade chip) and the 4090 also doesn't have nearly enough RAM. H100 starts at $25,000 and goes up. Apple could buy 75-100 M3 Max chips for that kind of money. That's certainly a load more compute than H100 would offer. Blackwell will be even more expensive in comparison.
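
A sketch of the cost side, assuming the $200-300 per-chip estimate from the top of the thread and the $25,000 H100 starting price quoted here. Neither is a confirmed figure, and the names are illustrative.

```python
# How many Max chips one H100's price would cover, using thread estimates only.
H100_PRICE_USD = 25_000           # quoted H100 starting price
CHIP_COST_RANGE_USD = (200, 300)  # estimated Apple cost per Max chip (low, high)

# Chip counts at the expensive and cheap ends of the per-chip estimate.
fewest_chips = H100_PRICE_USD // CHIP_COST_RANGE_USD[1]
most_chips = H100_PRICE_USD // CHIP_COST_RANGE_USD[0]

# Per-chip cost implied by the "75-100 chips" figure.
implied_cost_high = H100_PRICE_USD / 75
implied_cost_low = H100_PRICE_USD / 100

print(f"Chips per H100 at $200-300 each: {fewest_chips}-{most_chips}")
print(f"Cost per chip implied by 75-100 chips: "
      f"${implied_cost_low:.0f}-${implied_cost_high:.0f}")
```

At $200-300 per chip the budget covers 83 to 125 chips, so 75-100 corresponds to a slightly more conservative per-chip cost of roughly $250-333.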