by zozbot234
83 days ago
A similar approach was recently featured here: https://news.ycombinator.com/item?id=47476422 That said, the iPhone Pro has very limited RAM (12GB total), and you still need that RAM for the active part of the model. (Unless you want to use wearout-resistant storage like Intel Optane, but that was power-hungry and thus unsuitable for a mobile device.)
|
This is why mixture-of-experts (MoE) models are favored for these demos: only a portion of the weights is active for each token.
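
To put rough numbers on that, here is a back-of-envelope sketch. Only the 12GB RAM figure comes from the thread; the model size, active parameter count, and 4-bit quantization are made-up values for illustration, and `weight_bytes` is a hypothetical helper, not anything from a real library.

```python
# Back-of-envelope sketch; all model numbers below are hypothetical.

def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store num_params weights at the given precision."""
    return num_params * bits_per_weight // 8

GB = 10**9  # decimal gigabytes

RAM_BYTES = 12 * GB          # iPhone Pro total RAM, figure quoted in the thread
TOTAL_PARAMS = 100 * 10**9   # assumption: MoE model with 100B total parameters
ACTIVE_PARAMS = 10 * 10**9   # assumption: 10B parameters active per token
BITS = 4                     # assumption: 4-bit quantized weights

total = weight_bytes(TOTAL_PARAMS, BITS)
active = weight_bytes(ACTIVE_PARAMS, BITS)

print(f"all weights:    {total / GB:.0f} GB, fits in RAM: {total <= RAM_BYTES}")
print(f"active weights: {active / GB:.0f} GB, fits in RAM: {active <= RAM_BYTES}")
```

With these made-up numbers the full model needs 50 GB while the per-token active set needs 5 GB, so only the latter fits. The estimate ignores the KV cache, activations, and whatever the OS keeps for itself, and the set of active experts can change from token to token, so treat it as a lower bound rather than a sizing guide.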