by jsheard
641 days ago
Does it actually work? AIUI the current consensus is that you need massive interconnect bandwidth to train big models efficiently, and the internet is nowhere near that. I'm sure the Nvidia DGX boxes have 10x400Gb NICs for a reason.
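To put rough numbers on that gap, here is a back-of-the-envelope sketch. Only the 10x400Gb figure comes from the comment above; the 1B-parameter model, 2-byte (bf16) gradients, and the 1 Gbit/s home link are assumptions picked for illustration, and `transfer_seconds` is a hypothetical helper. It only counts the time to push one full copy of the gradients through a link, ignoring latency, protocol overhead, and the extra traffic a real all-reduce would add.

```python
def transfer_seconds(num_params: float, bytes_per_param: float, link_gbps: float) -> float:
    """Time to send one full copy of the gradients over a link, in seconds.

    Naive lower bound: payload bits divided by link bandwidth.
    """
    payload_bits = num_params * bytes_per_param * 8
    return payload_bits / (link_gbps * 1e9)


# Assumed workload: 1B parameters, bf16 gradients (2 bytes each).
NUM_PARAMS = 1e9
BYTES_PER_PARAM = 2

# 10 x 400 Gbit/s NICs, as quoted in the comment above.
dgx_gbps = 10 * 400
# Assumed consumer internet uplink.
home_gbps = 1.0

dgx_time = transfer_seconds(NUM_PARAMS, BYTES_PER_PARAM, dgx_gbps)
home_time = transfer_seconds(NUM_PARAMS, BYTES_PER_PARAM, home_gbps)

print(f"datacenter NICs: {dgx_time * 1e3:.1f} ms per gradient exchange")
print(f"home link:       {home_time:.1f} s per gradient exchange")
print(f"slowdown:        {home_time / dgx_time:.0f}x")
```

Under these assumptions that is roughly 4 ms versus 16 s per exchange, so syncing every step over the internet would be dominated by communication unless the amount or frequency of data sent is cut drastically.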
1. https://arxiv.org/abs/2311.08105
2. https://github.com/NousResearch/DisTrO?tab=readme-ov-file