Hacker News
Every FLOP Counts: Scaling 300B MoE LLMs Without Premium GPUs [pdf] (github.com)
2 points by mountainview 468 days ago