by bbatsell 733 days ago
They say user data remains in the Secure Enclave at all times, which Nvidia GPUs would not be able to access. I am quite certain that their private cloud inference runs only on Apple silicon chips. (The pre-WWDC rumors were that they built custom clusters using M2 Ultras.)

> They say user data remains in the Secure Enclave at all times

No, they don't. They say that the Secure Enclave participates in the secure boot chain, and in generating non-exportable keys used for secured transport. It reads to me as though user devices will encrypt requests to keys held in the Secure Enclaves of a subset of PCC nodes. A PCC node that receives the encrypted request will use its Secure Enclave to decrypt the payload. At that point, the general-purpose Application Processor in the PCC node has a cleartext copy of the user request for running the inference, which _could_ be done on an Nvidia GPU, but appears to be done on general-purpose Apple Silicon.

There is no suggestion that the user request is processed entirely within the Secure Enclave. The Secure Enclave is a cryptographic coprocessor. It almost certainly doesn't have the grunt to do inference.
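
To make the flow I'm describing concrete, here is a toy sketch in Python. Everything in it is my own assumption for illustration: the names (`ToyEnclave`, `device_encrypt`, `application_processor_infer`) are hypothetical, and the crypto is a deliberately tiny Diffie-Hellman plus XOR keystream, not whatever scheme Apple actually uses. The only point is the shape: the private key stays inside the enclave object, while the decrypted request ends up in ordinary memory where the inference runs.

```python
import hashlib
import secrets

# Toy finite-field Diffie-Hellman parameters, for illustration only.
# These are far too small for real use and are NOT Apple's parameters.
P = 2**127 - 1
G = 3


def keystream(shared: int, n: int) -> bytes:
    """Derive n bytes of keystream from the shared secret with SHAKE-256."""
    return hashlib.shake_256(shared.to_bytes(16, "big")).digest(n)


class ToyEnclave:
    """Hypothetical stand-in for a PCC node's Secure Enclave.

    The private key is generated inside and is never handed out;
    only the public key and a decrypt operation are exposed.
    """

    def __init__(self) -> None:
        self._private = secrets.randbelow(P - 2) + 1
        self.public = pow(G, self._private, P)

    def decrypt(self, ephemeral_public: int, ciphertext: bytes) -> bytes:
        shared = pow(ephemeral_public, self._private, P)
        ks = keystream(shared, len(ciphertext))
        return bytes(c ^ k for c, k in zip(ciphertext, ks))


def device_encrypt(node_public: int, request: bytes) -> tuple[int, bytes]:
    """User device: encrypt a request to a node's public key."""
    ephemeral = secrets.randbelow(P - 2) + 1
    shared = pow(node_public, ephemeral, P)
    ks = keystream(shared, len(request))
    ciphertext = bytes(m ^ k for m, k in zip(request, ks))
    return pow(G, ephemeral, P), ciphertext


def application_processor_infer(cleartext: bytes) -> str:
    """Placeholder for inference; note that it operates on cleartext."""
    return f"inference over {len(cleartext)} cleartext bytes"


enclave = ToyEnclave()
request = b"summarize my notes"
ephemeral_public, ciphertext = device_encrypt(enclave.public, request)

# The enclave only does the decryption; the result is then handed to
# the general-purpose processor, which sees the request in the clear.
cleartext = enclave.decrypt(ephemeral_public, ciphertext)
result = application_processor_infer(cleartext)
```

Note that `application_processor_infer` never touches the enclave. That is the distinction I'm drawing: the enclave guards the keys, not the computation.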

Not that it matters anyway, since Apple refuses to sign Nvidia GPU drivers for macOS in the first place. So if they own any Nvidia hardware themselves, then they must also own more third-party hardware to host it.
Maybe this is way too science fiction, but what are the chances Apple's GPU/AI engine designs on Apple Silicon were a testbed for full-sized, dedicated GPU dies that could compete with Nvidia's power in their own data centers?
Very low? I guess anything is possible, but the M1 through M4 GPUs weren't really anything to write home about. They more closely resemble AMD's raster-focused GPU compute in my opinion, which is certainly not a bad thing for mobile hardware.

Nvidia's GPUs are complex. They have a lot of dedicated, multipurpose acceleration hardware inside of them, and then they use CUDA to tie all those pieces together. Apple's GPUs are kinda the opposite way; they're extremely simple and optimized for low-power raster compute. Which isn't bad at all, for mobile! It just gimps them design-wise when they go up against purpose-built accelerators.

If we see Apple do custom Apple Silicon for the datacenter, it will be a pretty radically new design. The first thing they need is good networking; a full-size Nvidia cluster will use Mellanox InfiniBand to connect dozens of servers at Tb/s speeds. So Apple would need a similar connectivity solution, at least to compete. The GPU would need to be bigger and probably higher-wattage, and the CPU should really emphasize core count over single-threaded performance. If they play their cards right there, they would have an Apple Silicon competitor to the Grace superchip and GB200 GPU.