by ttul 928 days ago
See my poorly educated answer above. I don’t think that’s how MoE actually works. A new mixture of experts is chosen for every new context.
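
A rough sketch of the routing idea, assuming a plain top-k softmax gate (an assumption for illustration, not something stated in this thread). In the commonly described sparse MoE layers the gate scores each token's input at each layer, so the selected experts change with the input rather than being fixed once. The names `route`, `moe_forward`, `gate_weights` and the toy scaling experts are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a small list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(x, gate_weights, k=2):
    """Return [(expert_index, weight), ...] for the top-k experts for input x."""
    # One gate logit per expert: dot product of the input with that expert's gate row.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Keep the k highest-scoring experts, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalise the gate scores over the selected experts only.
    weights = softmax([logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, gate_weights, experts, k=2):
    # Weighted sum of the outputs of the selected experts; the others never run.
    out = [0.0] * len(x)
    for idx, w in route(x, gate_weights, k):
        y = experts[idx](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out

# Toy setup (hypothetical): 4 "experts" that just scale the input differently.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

print(route([2.0, 1.0], gate_weights))    # one input vector ("token")
print(route([-1.0, -3.0], gate_weights))  # a different input
```

With these toy gate weights the two inputs are sent to different pairs of experts, which is the point: the mixture is decided by the gate from the input it sees, not chosen once up front.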