Hacker News
by dpoloncsak 107 days ago
>Apple is sitting out the AI race

Then why does my M4 run models at tok/s that similarly priced GPUs cannot?

2 comments

From TFA:

  For Private Cloud Compute specifically, the system is described as underpowered and perhaps more trouble than it’s worth. Updating the software is apparently trickier and takes time, and more fundamentally the chips (believed to comprise right now of modified M2 Ultra processors) are not powerful enough to run the latest frontier models like Gemini, which the new Siri will be based on.
> M2 Ultra processors ... are not powerful enough to run the latest frontier models

The local AI community would strongly disagree with that assessment. They may not be able to run them with low latency for interactive use, and that is most likely the real blocker, but they will have strong compute per watt compared to Nvidia GPUs.

You cropped the part of the quote that is relevant:

> like Gemini, which the new Siri will be based on.

The local AI community isn't evaluating the internal Gemini models. Apple's Private Cloud Compute hardware is specifically competing against Google's TPU hardware, and the outcome of that contest is a foregone conclusion if you've seen the inference economics. The money and electricity wasted on Mac inference at that scale isn't even attractive to Apple.

iPhones can run the Uber app, but nobody would claim Apple is in the ride-sharing business.

No, but they are in the "device that runs apps" business, right? Just like they're looking to corner the "device that runs models locally" business by focusing on onboard inference.

Gains in model performance aren't exactly cheap, and once one frontier model figures it out, the rest seem to copy it quickly. Let them figure out what works and what doesn't, then put the "Apple" touch on it, all while putting your devices in everyone's hands. That's been their business model for years.