|
|
|
|
|
by sashank_1509
948 days ago
|
|
I think your math is also slightly off, in the Google article, it claims
“that is capable of achieving 10 exa-FLOPs (16-bit).” , so you should be comparing with 16 bit operations from a H100. 989 is TF32 core, for 16 bit it is 1979, so I guess around 5000 H100’s in a single training job would be equivalent to the training job mentioned in this article. Either way I actually would not be surprised if OpenAI has launched a single job on more than 10k GPU’s, but I also am not very knowledgeable on practical scaling. Congrats on the feat! |
|