by FL33TW00D 1919 days ago
Hugging Face has been working on integrating this into their library, and it has a pretty amazing effect on the size of models you can train on a basic Colab instance.

https://huggingface.co/blog/zero-deepspeed-fairscale