by lambda
91 days ago
Yeah, I looked up some models I have actually run locally on my Strix Halo laptop, and it's saying I should get much lower performance than I actually see on the models I've tested. For MoE models, it should be using the active parameters in the memory bandwidth computation, not the total parameters.
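
A rough sketch of the difference, assuming decode speed is bounded purely by memory bandwidth (every weight touched for a token is read once per token). The function name and all of the numbers below are hypothetical placeholders for illustration, not measurements from any real machine or model:

```python
def decode_tokens_per_second(bandwidth_gb_s: float,
                             params_billion: float,
                             bytes_per_param: float) -> float:
    """Upper-bound estimate of decode speed when memory bandwidth is the limit.

    Assumes each counted parameter is read from memory once per generated
    token and ignores KV cache traffic, compute time, and other overheads.
    """
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return (bandwidth_gb_s * 1e9) / bytes_per_token


# Hypothetical MoE model: 100B total parameters, 10B active per token,
# quantized to roughly 4 bits (0.5 bytes) per parameter.
BANDWIDTH_GB_S = 256.0   # placeholder bandwidth figure, not a measured value
TOTAL_B = 100.0
ACTIVE_B = 10.0
BYTES_PER_PARAM = 0.5

# Using total parameters treats the MoE as if it were dense.
dense_style = decode_tokens_per_second(BANDWIDTH_GB_S, TOTAL_B, BYTES_PER_PARAM)

# Using active parameters counts only the experts actually read per token.
moe_style = decode_tokens_per_second(BANDWIDTH_GB_S, ACTIVE_B, BYTES_PER_PARAM)

print(f"total-params estimate:  {dense_style:.2f} tok/s")
print(f"active-params estimate: {moe_style:.2f} tok/s")
```

With these placeholder values the two estimates differ by the ratio of total to active parameters (10x here), which is the kind of gap you would expect if a calculator plugs in the total count for an MoE model.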