by schobi
9 days ago
I guess this was more about syncing GPUs. If you took 500 computers with older 1080 GPUs, you might have compute and RAM roughly equivalent to one H200 GPU for training such a model. Maybe take 10,000. But if those machines are spread over 10,000 homes on residential internet service, training a large model will not get anywhere. You go from "data in the same HBM memory" at 4.8 TB/s, or "data in an adjacent GPU" over NVLink at 1.2 TB/s, down to 25 Mbit/s upload speed. Accessing the next piece of data is going to be about a million times slower.
At the same time you will produce a thousand times more heat, for a million times longer.
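
For anyone who wants to check the "about a million times" figure, here is a rough back-of-the-envelope sketch in Python. It only uses the bandwidth numbers quoted above; the 1 GB payload is a hypothetical value picked for illustration, and latency and protocol overhead are ignored.

```python
# Bandwidth figures quoted in the comment, converted to bytes per second.
HBM_BYTES_PER_S = 4.8e12        # 4.8 TB/s, same-package HBM
NVLINK_BYTES_PER_S = 1.2e12     # 1.2 TB/s, adjacent GPU over NVLink
UPLINK_BYTES_PER_S = 25e6 / 8   # 25 Mbit/s residential upload, in bytes/s


def slowdown(fast_bytes_per_s: float, slow_bytes_per_s: float) -> float:
    """Return how many times slower the slow link is than the fast one."""
    return fast_bytes_per_s / slow_bytes_per_s


hbm_ratio = slowdown(HBM_BYTES_PER_S, UPLINK_BYTES_PER_S)
nvlink_ratio = slowdown(NVLINK_BYTES_PER_S, UPLINK_BYTES_PER_S)

# Hypothetical payload (an assumption, not from the comment): 1 GB of data.
PAYLOAD_BYTES = 1e9
seconds_over_uplink = PAYLOAD_BYTES / UPLINK_BYTES_PER_S
seconds_over_hbm = PAYLOAD_BYTES / HBM_BYTES_PER_S

print(f"HBM vs. uplink:    {hbm_ratio:,.0f}x")
print(f"NVLink vs. uplink: {nvlink_ratio:,.0f}x")
print(f"1 GB over uplink:  {seconds_over_uplink:.0f} s")
print(f"1 GB from HBM:     {seconds_over_hbm * 1e3:.2f} ms")
```

With these inputs the HBM ratio comes out around 1.5 million and the NVLink ratio around 0.4 million, so "about a million" holds as an order-of-magnitude estimate.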