by mirekrusin
274 days ago
MoE is something different: it's a technique that activates only a small subset of a model's parameters for each token during inference. Whatever is good enough now can be much better for the same cost (time, computation, money). People will always choose better over worse.
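
To make the "small subset of parameters" point concrete, here is a rough sketch of top-k expert routing in plain Python. Everything in it is a toy assumption for illustration (the sizes, the random weights, and the helper names `matvec`, `make_expert`, and `moe_forward` are made up), not how any particular model implements it.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def make_expert(dim, rng):
    # Hypothetical expert: a single dim x dim weight matrix.
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]


def moe_forward(x, router, experts, k):
    # The router produces one score per expert for this input.
    logits = matvec(router, x)
    # Keep only the k highest-scoring experts; the rest are never evaluated.
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalize the scores of the selected experts into mixing weights.
    weights = softmax([logits[i] for i in top])
    out = [0.0] * len(x)
    for w, i in zip(weights, top):
        y = matvec(experts[i], x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top


rng = random.Random(0)
dim, num_experts, k = 4, 8, 2  # toy sizes, chosen arbitrarily
router = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]
experts = [make_expert(dim, rng) for _ in range(num_experts)]

x = [0.5, -1.0, 0.25, 2.0]
out, used = moe_forward(x, router, experts, k)

active = k * dim * dim            # expert parameters touched for this input
total = num_experts * dim * dim   # expert parameters stored in the model
print(f"experts used: {used}, active expert params: {active}/{total}")
```

With these toy numbers only 32 of the 128 expert parameters are used for the input, while all 128 still have to be stored. That is the trade-off: compute per token scales with `k`, capacity scales with the number of experts.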