by leodavi
63 days ago
Probably not. As I understand MoE, the set of active parameters can change from token to token, so in the worst case (unlikely in a real scenario, but it frames the problem) you'd be streaming 49B parameters from SSD for every output token...
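For a rough sense of scale, here is a back-of-the-envelope sketch of that worst case. Only the 49B figure comes from the comment; the 4-bit weight size and the ~7 GB/s SSD read speed are assumptions picked for illustration, and `seconds_per_token` is a hypothetical helper, so swap in your own numbers.

```python
def seconds_per_token(params: float, bytes_per_param: float,
                      ssd_bytes_per_sec: float) -> float:
    """Time to read every streamed parameter once from SSD for one token."""
    return params * bytes_per_param / ssd_bytes_per_sec


PARAMS = 49e9            # worst-case parameters streamed per token (from the comment)

# Assumed values, not from the thread:
BYTES_PER_PARAM = 0.5    # 4-bit quantized weights
SSD_BANDWIDTH = 7e9      # about 7 GB/s sequential read

t = seconds_per_token(PARAMS, BYTES_PER_PARAM, SSD_BANDWIDTH)
print(f"{t:.1f} s/token, {1 / t:.2f} tokens/s")
# prints: 3.5 s/token, 0.29 tokens/s
```

Under those assumptions the SSD read alone costs several seconds per token, before any compute. Real workloads should do better, since consecutive tokens often reuse some experts and those can stay cached in RAM.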