by luc4sdreyer
1061 days ago
They claim 1.1x to 7x, depending on what you're doing. The 10% to 50% is for the ~10k GPU LLM training, where the main bottleneck tends to be networking:

> DGX GH200 enables more efficient parallel mapping and alleviates the networking communication bottleneck. As a result, up to 1.5x faster training time can be achieved over a DGX H100-based solution for LLM training at scale.