Hacker News
by selcuka 310 days ago
We are talking about accuracy, though. I don't see the point of MoE if a 120B MoE model is not as accurate as even a 32B model.