Unless you also predict that Apple will release datacenter systems à la Grace and Instinct, I don't think they're even in the running. AMD is only competitive in the LLM market because they sell extremely cheap and fast compute hardware at the same scale Nvidia does. As of today, Apple doesn't sell any hardware that can go toe-to-toe with a DGX system. They also have a lot of software problems (VM limitations, poor GPU API support, limited integration with open-source, etc.) that would need to be fixed for parity with Nvidia or AMD.
Apple will definitely push for on-device AI, but even in 2030 I firmly believe that they won't be leading the industry in performance. I'd be surprised if they even supported anything other than their proprietary CoreML by then.
Ditto on this. I'd love not to have to buy an A100 for $20k, or even consumer GPUs, but the truth is that for LLM inference, where you run large models like LLaMa2 70b with INT4 quantization so they can fit, the numbers look like this:
A100: 1248 TOPS
MI250: 362.1 TOPS
M3 Max: 18 TOPS
Yes, 18. Unless Apple has accelerated INT4 workloads but just forgot to document it.
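To put those figures side by side, here's a rough back-of-the-envelope sketch. It only uses the TOPS numbers quoted above and assumes "70b" means about 70e9 parameters; the function names are made up for illustration, and the footprint covers the weights only (no KV cache, activations, or quantization metadata).

```python
# Peak throughput figures exactly as quoted in the comment above (TOPS).
quoted_tops = {"A100": 1248.0, "MI250": 362.1, "M3 Max": 18.0}


def ratio_to(baseline: str, tops: dict[str, float]) -> dict[str, float]:
    """Return each device's quoted throughput as a multiple of the baseline's."""
    base = tops[baseline]
    return {name: value / base for name, value in tops.items()}


def weight_footprint_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


ratios = ratio_to("M3 Max", quoted_tops)
for name, r in ratios.items():
    print(f"{name}: {r:.1f}x the M3 Max figure")

# Assumption: "70b" is taken as 70e9 parameters, 4 bits each.
llama2_70b_int4_gb = weight_footprint_gb(70e9, 4)
print(f"LLaMa2 70b at INT4: ~{llama2_70b_int4_gb:.0f} GB of weights")
```

By those quoted numbers the A100 comes out at roughly 69x and the MI250 at roughly 20x the M3 Max figure, and the INT4 weights alone are about 35 GB.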
Honestly, I’m an Apple fan, but when they go on stage and say “AI” they mean it can do speech recognition or tell a dog apart from a cat, or autofocus a camera. It can’t run ChatGPT-like things by a loooong mile.