|
|
|
|
|
by famouswaffles
1022 days ago
|
|
>There’s probably quite a lot of complexity on the backend in sourcing the right model for the right query. This isn't how Sparse MoE models work. There isn't really any complexity like that. And different models will or can pick each token. Sparse models aren't an ensemble of models. |
|