by stingraycharles 43 days ago
Do you know that MoE is a thing?

The experts in MoEs aren't specialized in any meaningful task sense. At the level of what we would think of as tasks, experts are selected essentially arbitrarily, per token and per block.
It’s unsupervised, yes, but “unspecialized in any meaningful task sense” is incorrect. Specialization is the whole point. It’s just not in the sense of “this is a legal expert, this is a software developer”.
Optimal expert separation depends on the goal and can be pretty arbitrary. For example, DeepSeek v4 separates them more or less by domain, if I remember correctly.
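
For anyone who wants to see what "selected per token and per block" means mechanically, here is a minimal sketch of top-k gating in plain Python. It is not any particular model's router: the names `route` and `make_gate`, the sizes, and the random untrained gate weights are all assumptions for illustration. It also feeds the same token vector to every block, whereas in a real transformer the hidden state changes between blocks.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(token_vec, gate_weights, k=2):
    """Return k (expert_index, weight) pairs for one token in one block."""
    # One logit per expert: dot product of the token with that expert's gate row.
    logits = [sum(w * x for w, x in zip(row, token_vec)) for row in gate_weights]
    probs = softmax(logits)
    # Keep the k most probable experts and renormalize their weights to sum to 1.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def make_gate(rng, n_experts, dim):
    # Hypothetical helper: a random (untrained) gate matrix, one row per expert.
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_experts)]

rng = random.Random(0)
DIM, N_EXPERTS, N_BLOCKS = 8, 4, 3  # illustrative sizes, not from any real model

# Each block has its own gate, so the same token can land on different experts.
gates = [make_gate(rng, N_EXPERTS, DIM) for _ in range(N_BLOCKS)]
tokens = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(5)]

for t, tok in enumerate(tokens):
    choices = [[i for i, _ in route(tok, g)] for g in gates]
    print(f"token {t}: experts per block = {choices}")
```

Nothing here assigns an expert to a task or a domain. Whatever specialization shows up in a trained model comes from the gate weights learned during training, which is the part both sides of this thread are arguing about.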