by declaredapple
806 days ago
They've been designing their own chips for a while now, including ones with an NPU. Also, because of their unified memory design, they actually have insane memory bandwidth, which is incredibly useful for LLMs. IMO they may have a head start in that respect for on-device inference of large models (e.g. 1B+ params).
The bigger bottleneck, to me, seems to be memory capacity. iPhones have traditionally skimped on RAM, more so than even cheap and midrange Android counterparts. I can imagine running an LLM in the background on my S10 - it's a bit harder to envision iOS swapping everything smoothly on a similarly aged iPhone.
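For a rough sense of scale, here's a back-of-the-envelope sketch of why both RAM and bandwidth matter. It assumes the weights dominate memory use and that decoding is memory-bound, with every weight read once per generated token. That's a common simplification, not a measurement. The function names and the 50 GB/s bandwidth figure are hypothetical placeholders, not specs for any real phone.

```python
def weight_footprint_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    Ignores the KV cache, activations, and runtime overhead.
    """
    return n_params * bits_per_param / 8 / 1e9


def decode_tokens_per_sec_upper_bound(bandwidth_gb_s: float, footprint_gb: float) -> float:
    """Rough ceiling on tokens/sec, assuming memory-bound decoding.

    Assumes every weight is read from memory once per generated token.
    """
    return bandwidth_gb_s / footprint_gb


if __name__ == "__main__":
    n_params = 1e9                 # the "1B+ params" case from the thread
    assumed_bandwidth_gb_s = 50.0  # hypothetical figure, not a real device spec

    for bits in (16, 8, 4):
        size = weight_footprint_gb(n_params, bits)
        ceiling = decode_tokens_per_sec_upper_bound(assumed_bandwidth_gb_s, size)
        print(f"{bits:>2}-bit weights: {size:.2f} GB, at most ~{ceiling:.0f} tokens/s")
```

Under these assumptions, a 1B-param model needs about 2 GB for weights at 16 bits and about 0.5 GB at 4 bits, which is where a phone's RAM budget starts to bite, especially if the model has to stay resident in the background.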