Hacker News
by comp_raccoon 205 days ago
Olmo author here! Qwen models are in general amazing, but 30B is v fast cuz it’s an MoE. MoEs are very much on the roadmap for the next Olmo.