by Aurornis
83 days ago
> Though iPhone Pro has very limited RAM (12GB total) which you still need for the active part of the model.

This is why mixture of experts (MoE) models are favored for these demos: only a portion of the weights are active for each token.
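
For a rough sense of the arithmetic, here is a back-of-the-envelope sketch. The model size, active parameter count, and 4-bit quantization are made-up illustrative numbers, not figures from any specific demo; only the 12GB RAM figure comes from the quoted comment.

```python
GB = 1e9  # decimal gigabytes


def weight_bytes(params: float, bits_per_weight: float) -> float:
    """Bytes needed to hold `params` weights at the given quantization."""
    return params * bits_per_weight / 8


# Hypothetical MoE model (assumed values, for illustration only).
TOTAL_PARAMS = 100e9   # all experts combined
ACTIVE_PARAMS = 10e9   # weights actually used for one token
BITS_PER_WEIGHT = 4    # assumed 4-bit quantization
DEVICE_RAM_GB = 12     # total RAM figure from the quoted comment

total_gb = weight_bytes(TOTAL_PARAMS, BITS_PER_WEIGHT) / GB
active_gb = weight_bytes(ACTIVE_PARAMS, BITS_PER_WEIGHT) / GB

print(f"all weights:    {total_gb:.1f} GB, fits in RAM: {total_gb <= DEVICE_RAM_GB}")
print(f"active weights: {active_gb:.1f} GB, fits in RAM: {active_gb <= DEVICE_RAM_GB}")
```

This only counts weights. It ignores the KV cache, activations, and whatever the OS itself is using, and the set of active experts changes from token to token, so treat it as an upper bound on what fits rather than a prediction.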