Hacker News
by frde_me, 81 days ago
Aren't you describing why they use mixture of experts, where only a subset of the weights is activated depending on the query?
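To make the "subset of weights" idea concrete, here is a minimal sketch of top-k expert routing in plain Python. Everything in it (the `TinyMoE` class, the sizes, the random weights) is hypothetical and only for illustration; it is not any particular model's implementation. Note also that in most published mixture-of-experts layers the router picks experts per token rather than per whole query.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


class TinyMoE:
    """Hypothetical toy MoE layer: a linear router plus linear experts."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one score row per expert (illustrative random weights).
        self.router = [[rng.gauss(0, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Experts: each is its own dim x dim weight matrix.
        self.experts = [[[rng.gauss(0, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(num_experts)]

    def forward(self, x):
        # Score every expert for this input, then keep only the top_k.
        scores = matvec(self.router, x)
        chosen = sorted(range(len(scores)),
                        key=lambda i: scores[i], reverse=True)[:self.top_k]
        # Renormalise the gate weights over the chosen experts only.
        gates = softmax([scores[i] for i in chosen])
        out = [0.0] * len(x)
        for gate, idx in zip(gates, chosen):
            # Only the chosen experts' weights are touched for this input.
            y = matvec(self.experts[idx], x)
            out = [o + gate * yi for o, yi in zip(out, y)]
        return out, chosen


moe = TinyMoE(dim=4, num_experts=8, top_k=2)
output, used = moe.forward([1.0, 0.5, -0.3, 2.0])
print("experts used:", used)
```

With 8 experts and `top_k=2`, each input runs through only 2 of the 8 expert matrices, so the compute per input stays roughly constant while the total parameter count grows with the number of experts.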