I think the biggest news is that Apple built an entire private LLM cloud on Apple Silicon. That's a big deal and a big investment.
Almost makes you wonder if they're running inference on that hardware but not training the models there. They worded that part of the presentation very carefully.
Which also raises the question: how much money would Apple save by using Nvidia for everything? Probably not much, since they don't have to pay margins on Apple Silicon bought for themselves. But I suspect there is a literal monetary cost to brute-forcing an Nvidia-scale server network with weaker hardware.
Yeah, I’d guess inference. Training relies much more on high-speed interconnects between nodes, which I’m sure they could do, but it’s certainly another step up in complexity.
They said the models can scale to "private cloud compute" based on Apple Silicon, and that your device will ensure those servers run "publicly verifiable software" in order to guarantee no misuse of your data.
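For anyone wondering what "your device ensures the server runs publicly verifiable software" could look like in principle, here is a minimal sketch. To be clear, this is not Apple's actual protocol: the names `measure`, `device_will_send`, and `public_log` are hypothetical, and a real system would rely on hardware-signed attestation and a tamper-evident log rather than a plain hash lookup.

```python
import hashlib

def measure(software_image: bytes) -> str:
    """Hypothetical 'measurement': a SHA-256 digest of the server software image."""
    return hashlib.sha256(software_image).hexdigest()

def device_will_send(attested_measurement: str, public_log: set[str]) -> bool:
    """The device only sends data if the server's reported measurement
    matches a build that was published for public inspection."""
    return attested_measurement in public_log

# Assumption: the vendor publishes measurements of every released server build.
published_build = b"server-build-1.0"
public_log = {measure(published_build)}

print(device_will_send(measure(published_build), public_log))    # True
print(device_will_send(measure(b"tampered-build"), public_log))  # False
```

The point of the design is that the device, not the server, makes the decision: a build that was never published simply never receives your data.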
I wonder if their server-side code will be open source? That would be a pleasant surprise. Curious to see how this evolves.
Anyway, overall it looks really, really cool. If it works as marketed, then it will be an easy "shut up and take my money".
They need to provide a lot more information about how "Apple Intelligence" actually works. They can't just wave a magic wand and say "some" of the functionality runs in the "private" cloud. We need a single checkbox to say whether we want the cloud or not, and strong guarantees that things don't accidentally get cloud-ified.