Hacker News
ZeRO and DeepSpeed: system optimizations allow training models with 100B parameters (microsoft.com)
1 point by polymorph1sm 2092 days ago