by zozbot234 68 days ago
MoE has made it vastly easier to increase total parameter counts (and recent open models are really quite large), but it's also hard to compare an MoE with an earlier dense model, since only a fraction of an MoE's parameters are active for any given token.
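To make the total-vs-active gap concrete, here is a rough back-of-the-envelope sketch. Every number in it is made up for illustration and does not describe any real model; `moe_param_counts` and its arguments are hypothetical names, and the accounting assumes a simple design where each layer routes a token to a fixed number of equally sized experts.

```python
def moe_param_counts(shared_params, params_per_expert, num_experts,
                     experts_per_token, num_layers):
    """Return (total, active-per-token) parameter counts for a toy MoE.

    Assumes every layer has the same number of equally sized experts and
    that each token is routed to exactly `experts_per_token` of them.
    `shared_params` covers everything that always runs (attention,
    embeddings, routers, and so on).
    """
    expert_total = num_layers * num_experts * params_per_expert
    expert_active = num_layers * experts_per_token * params_per_expert
    return shared_params + expert_total, shared_params + expert_active


# Hypothetical configuration, chosen only to make the arithmetic easy.
total, active = moe_param_counts(
    shared_params=2_000_000_000,
    params_per_expert=100_000_000,
    num_experts=64,
    experts_per_token=4,
    num_layers=30,
)

print(f"total:  {total / 1e9:.0f}B")   # total:  194B
print(f"active: {active / 1e9:.0f}B")  # active: 14B
```

So the same toy model is "194B" by total parameters and "14B" by per-token compute, and neither number lines up cleanly with the single parameter count of a dense model.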