by ttul 928 days ago
See my poorly educated answer above. I don’t think that’s how MoE actually works. A new mixture of experts is chosen for every new context.
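To make that concrete, here is a rough sketch of the routing step, assuming a top-k softmax router of the kind used in common sparse MoE layers, where the choice is made per token at each layer and so shifts as the context changes. The names `route` and `router_weights`, the sizes, and the random weights are all made up for illustration and not taken from any particular model.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a small list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(token_vec, router_weights, k=2):
    """Pick the top-k experts for ONE token and return [(expert_index, gate_weight)].

    Hypothetical helper: router_weights has one row per expert.
    """
    # One score (logit) per expert: dot product of the expert's row with the token.
    logits = [sum(w * x for w, x in zip(row, token_vec)) for row in router_weights]
    # Keep only the k highest-scoring experts.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalise the gate weights over the chosen experts only.
    gates = softmax([logits[i] for i in top])
    return list(zip(top, gates))

random.seed(0)
NUM_EXPERTS, DIM = 8, 4  # illustrative sizes
router_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
tokens = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(3)]

# Each token gets its own routing decision; nothing is fixed for the whole prompt.
routes = [route(t, router_weights) for t in tokens]
for position, chosen in enumerate(routes):
    print(position, chosen)
```

The point is only that `route` runs once per token, so different tokens in the same prompt can land on different experts.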