by anilgulecha 219 days ago
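It's a mixture-of-experts (MoE) model. Basically, N smaller model pieces ("experts") are put together, and when inference occurs, a router activates only a small subset of them for each token (often just 1 or 2) instead of running the whole model. Each piece would tend to be tuned for, or good at, one kind of input.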
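To make the routing idea concrete, here is a rough toy sketch in plain Python. It is not how any particular model implements it: the names `moe_forward`, `make_expert`, and `router_weights` are made up for illustration, the "experts" are just scaling functions, and the router is a simple dot product followed by a top-k pick.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    # Hypothetical stand-in for an expert network: it only scales its input.
    return lambda x: [scale * v for v in x]


def moe_forward(x, router_weights, experts, k=1):
    # Router: one score per expert (dot product of a weight row with the input).
    logits = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep only the k highest-scoring experts; the others are never evaluated.
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalise the scores of the chosen experts into mixing weights.
    gates = softmax([logits[i] for i in top])
    out = [0.0] * len(x)
    for gate, i in zip(gates, top):
        y = experts[i](x)
        out = [o + gate * v for o, v in zip(out, y)]
    return out, top


experts = [make_expert(s) for s in (1.0, 2.0, 3.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # assumed toy values
output, chosen = moe_forward([0.5, 2.0], router_weights, experts, k=1)
print(chosen, output)
```

With `k=1` only one expert runs per input, which matches the "only 1 active" picture; setting `k=2` mixes the two best-scoring experts. The point is that compute per input scales with k, not with N.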