by spott
699 days ago
I expect AMD also loses out once multi-GPU becomes a requirement, since their interconnect is just way slower. That point arguably arrives at much larger models than it does for the H100, but training a 70B-parameter model in bf16 is going to hit multi-GPU on memory requirements alone.
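A rough back-of-the-envelope sketch of the memory claim, in Python. The numbers here are assumptions, not from the comment: 192 GB per GPU assumes an MI300X, 80 GB assumes an H100 SXM, and the 16 bytes/parameter figure assumes mixed-precision Adam (bf16 weights and gradients plus fp32 master weights and two fp32 optimizer moments). Activations are ignored, so real requirements would be higher.

```python
import math

PARAMS = 70e9  # 70B-parameter model

# Assumed bytes per parameter (not from the comment):
BYTES_WEIGHTS_AND_GRADS = 2 + 2          # bf16 weights + bf16 gradients
BYTES_WITH_ADAM = 2 + 2 + 4 + 4 + 4      # plus fp32 master weights, Adam m and v

# Assumed per-GPU memory in GB (hypothetical parts, not named in the comment):
AMD_GPU_GB = 192     # e.g. MI300X
NVIDIA_GPU_GB = 80   # e.g. H100 SXM


def training_memory_gb(params, bytes_per_param):
    """Memory for parameter-related state only, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def gpus_needed(total_gb, gpu_gb):
    """Minimum GPU count if the state were sharded perfectly evenly."""
    return math.ceil(total_gb / gpu_gb)


if __name__ == "__main__":
    for label, bpp in [("weights+grads", BYTES_WEIGHTS_AND_GRADS),
                       ("with Adam state", BYTES_WITH_ADAM)]:
        total = training_memory_gb(PARAMS, bpp)
        print(label, total, "GB:",
              gpus_needed(total, AMD_GPU_GB), "x 192 GB vs",
              gpus_needed(total, NVIDIA_GPU_GB), "x 80 GB")
```

Under these assumptions even the bf16 weights and gradients alone (280 GB) don't fit on a single 192 GB card, so the interconnect matters for both vendors at this size; the larger card just needs fewer of them.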