by samus 773 days ago
MoE is mostly used to enable load balancing, since it makes it possible to put the experts on different GPUs. This isn't so easy to do with a monolithic but sparse layer.
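To make the placement idea concrete, here is a rough sketch in plain Python, not taken from any real MoE framework. It assumes top-1 routing and a round-robin mapping of experts to devices; the names `place_experts`, `route_top1`, and `device_load`, as well as the gate scores, are made up for illustration.

```python
from collections import Counter

def place_experts(num_experts, num_devices):
    """Hypothetical round-robin placement: expert e lives on device e % num_devices."""
    return {e: e % num_devices for e in range(num_experts)}

def route_top1(gate_scores):
    """Pick the highest-scoring expert for each token (top-1 routing, an assumption)."""
    return [max(range(len(scores)), key=scores.__getitem__) for scores in gate_scores]

def device_load(assignments, placement, num_devices):
    """Count how many tokens end up on each device."""
    counts = Counter(placement[e] for e in assignments)
    return [counts.get(d, 0) for d in range(num_devices)]

# Made-up gate scores: 4 tokens (rows) x 4 experts (columns).
gate_scores = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.2, 0.6],
]

placement = place_experts(num_experts=4, num_devices=2)
assignments = route_top1(gate_scores)
load = device_load(assignments, placement, num_devices=2)
print(placement, assignments, load)
```

Because each expert is a separate block of weights, the unit of placement is obvious: move an expert, and its tokens follow. A single sparse layer has no such natural boundary to split along.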