by metadat
521 days ago
Thanks for the links; there's some interesting discussion there. The second article you linked indicates that training will still have intense bandwidth requirements, since gradient differentials have to be shipped around. What has changed in the past year? Is this technique looking better, worse, or the same?