Hacker News
by AmazingTurtle 125 days ago
Doubling the speed could plausibly come from MoE optimizations, such as reducing the number of active parameters per token.
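
To put rough numbers on that, here is a back-of-the-envelope sketch. All sizes are made up for illustration (a hypothetical model with 2B shared parameters and 1B per expert), and it assumes forward-pass cost is about 2 FLOPs per active parameter per token. Real decoding speed also depends on memory bandwidth, batching, and routing overhead, so treat this as an estimate, not a measurement.

```python
def active_params(shared: int, per_expert: int, top_k: int) -> int:
    """Parameters touched per token: shared layers plus the top_k routed experts."""
    return shared + top_k * per_expert


def flops_per_token(active: int) -> int:
    """Rough forward-pass cost, assuming ~2 FLOPs per active parameter."""
    return 2 * active


# Hypothetical sizes, not taken from any real model.
shared, per_expert = 2_000_000_000, 1_000_000_000

before = flops_per_token(active_params(shared, per_expert, top_k=8))
after = flops_per_token(active_params(shared, per_expert, top_k=3))

speedup = before / after
print(f"{speedup:.2f}x")  # prints "2.00x"
```

With these assumed sizes, routing each token to 3 experts instead of 8 halves the active parameter count (10B to 5B), which is where the 2x in compute comes from.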