by bayindirh 439 days ago
An nVIDIA H200 draws around 2.3x the power (700W) of a Xeon 6748P (300W). You generally put 8 of these cards into a single server, which adds up to 5.6kW just for the GPUs. With losses and other support equipment, that server uses ~6.1kW at full load, which is around 8.7x the draw of a CPU-only server (assuming 700W or so at full load).
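For anyone who wants to check the arithmetic, here is a rough sketch of the per-server numbers. The 500W overhead is my assumption, inferred from the gap between 5.6kW and ~6.1kW rather than measured, and the function name is made up for illustration.

```python
# Figures quoted in the comment; OVERHEAD_WATTS is an assumption inferred
# from the ~6.1 kW total (6.1 kW - 5.6 kW), not a measured value.
GPU_WATTS = 700          # one H200 at full load
GPUS_PER_SERVER = 8
OVERHEAD_WATTS = 500     # assumed: host CPUs, fans, PSU losses, etc.
CPU_SERVER_WATTS = 700   # CPU-only server at full load (rough figure)


def gpu_server_watts(gpus=GPUS_PER_SERVER, gpu_watts=GPU_WATTS,
                     overhead=OVERHEAD_WATTS):
    """Full-load draw of one GPU server, in watts."""
    return gpus * gpu_watts + overhead


total = gpu_server_watts()
print(f"GPUs only:    {GPUS_PER_SERVER * GPU_WATTS / 1000:.1f} kW")
print(f"Whole server: {total / 1000:.1f} kW")
print(f"vs CPU-only:  {total / CPU_SERVER_WATTS:.1f}x")
```

With these inputs the ratio lands a bit under 9x; change the CPU-only figure and it moves accordingly.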

Considering HPC is half CPU and half GPU (more like 66% CPU and 33% GPU, but I'm being charitable here), I expect an average power draw of roughly 3.4kW per server in a cluster (the midpoint of ~6.1kW and ~0.7kW). Moreover, most of these clusters run targeted jobs, and prototyping/trial runs use far fewer resources.
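The blended figure is just a weighted average of the two server types. A minimal sketch, assuming every node sits at full load (which overstates a real HPC cluster, as noted above); `blended_kw` is a hypothetical helper name:

```python
GPU_SERVER_KW = 6.1   # full-load GPU server, from the estimate above
CPU_SERVER_KW = 0.7   # full-load CPU-only server, rough figure


def blended_kw(gpu_share):
    """Average per-server draw for a cluster with the given share of GPU nodes."""
    return gpu_share * GPU_SERVER_KW + (1 - gpu_share) * CPU_SERVER_KW


# 50/50 is the charitable split; one third GPU is closer to the 66/33 split.
for label, share in (("50% GPU", 0.5), ("33% GPU", 1 / 3)):
    print(f"{label}: {blended_kw(share):.1f} kW per server")
```

A straight 50/50 split works out to about 3.4kW per server, and the 66/33 split to about 2.5kW, so the charitable case roughly halves the draw of an all-GPU farm.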

On the other hand, AI farms run all these GPUs at full power almost 24/7, both for training new models and for inference. Before you ask: if you have a GPU farm that you train on, buying inference-focused cards doesn't make sense, because you can partition nVIDIA cards with MIG. You can set aside some training cards, split each of them into 6-7 instances, and run inference on them, which gives ~45 virtual cards for inference per server, again at ~6.1kW load.
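To see where a number like ~45 comes from, here is a quick count of MIG instances per server. How many of the 8 cards get set aside for inference is my assumption; the function name is made up, and 7 is the MIG maximum of instances per GPU on this class of card.

```python
GPUS_PER_SERVER = 8
MAX_MIG_INSTANCES = 7   # MIG allows at most 7 instances per GPU


def inference_instances(gpus_for_inference, instances_per_gpu):
    """Number of virtual inference cards when each chosen GPU is split with MIG."""
    if not 0 <= gpus_for_inference <= GPUS_PER_SERVER:
        raise ValueError("more GPUs requested than the server has")
    if not 1 <= instances_per_gpu <= MAX_MIG_INSTANCES:
        raise ValueError("MIG supports 1 to 7 instances per GPU")
    return gpus_for_inference * instances_per_gpu


# Assumed scenarios: 7 or all 8 cards partitioned, 6 or 7 instances each.
for gpus in (7, 8):
    for per_gpu in (6, 7):
        print(f"{gpus} GPUs x {per_gpu} instances = "
              f"{inference_instances(gpus, per_gpu)} virtual cards")
```

Depending on the split this gives somewhere between 42 and 56 instances per server, which brackets the ~45 figure.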

So, yes, AI's power load profile is different.