by 1R053
597 days ago
An H100 has 80 GB of memory, so at FP8 that would allow 3 of the 16+1 models per GPU (assuming around 26B parameters per model), requiring 9 H100s, which usually would not fit in one chassis, I guess. Once you have something with 192 GB it gets interesting: you could probably fit 7 per GPU at FP8. At FP16 it would probably fit only 3 per card, requiring 9 again. I'd say they slightly missed the sweet spot for the current memory layout of cards.
With slightly smaller models or one expert fewer, one should be able to run it on 8 H100s at FP8, on 2 B100s at FP8, or even on 4 B100s at FP16, if I calculated correctly.
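
For anyone who wants to redo the arithmetic, here is a rough Python sketch. It counts weights only (1 byte per parameter at FP8, 2 bytes at FP16), assumes each model sits entirely on one GPU, and ignores KV cache, activations and framework overhead, so the GPU counts it returns are lower bounds and a real deployment needs more headroom. The function names are made up for illustration, and the 24B figure for "slightly smaller models" is my own assumption.

```python
import math


def models_per_gpu(gpu_mem_gb, params_billion, bytes_per_param):
    """Number of whole models whose weights alone fit in one GPU's memory."""
    # 1B parameters at 1 byte each is roughly 1 GB of weights.
    weights_gb = params_billion * bytes_per_param
    return int(gpu_mem_gb // weights_gb)


def gpus_needed(n_models, gpu_mem_gb, params_billion, bytes_per_param):
    """Lower bound on GPUs needed to hold n_models (weights only)."""
    per_gpu = models_per_gpu(gpu_mem_gb, params_billion, bytes_per_param)
    if per_gpu == 0:
        raise ValueError("a single model does not fit on one GPU")
    return math.ceil(n_models / per_gpu)


if __name__ == "__main__":
    FP8, FP16 = 1, 2  # bytes per parameter

    # 16+1 models of ~26B parameters each (figure taken from the comment).
    for mem_gb in (80, 192):
        for name, nbytes in (("FP8", FP8), ("FP16", FP16)):
            per_gpu = models_per_gpu(mem_gb, 26, nbytes)
            if per_gpu == 0:
                print(f"{mem_gb} GB, {name}: one model does not fit")
                continue
            total = gpus_needed(17, mem_gb, 26, nbytes)
            print(f"{mem_gb} GB, {name}: {per_gpu} per GPU, at least {total} GPUs")

    # One expert fewer and slightly smaller models (24B is an assumed value).
    print(gpus_needed(16, 192, 24, FP8), gpus_needed(16, 192, 24, FP16))
```

Changing the per-model size or reserving some memory per GPU for the KV cache shifts the results quickly, which is why the practical numbers end up higher than this weights-only bound.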