Hacker News
by tredre3 57 days ago
An idle GPU consumes almost nothing; a loaded (server-class) GPU can consume over 2 kW.

Admittedly a single request isn't a full load, but claiming that a request makes no difference vs idle is misguided, in my opinion.

1 comment

OpenAI's GPUs won't be idle for long because they have all the other requests to serve. Over time there will be a certain percentage of idle GPUs, amortized across the hundreds of millions of requests they receive.
And idle% is causally connected to whether you make a request or not, surely? I don't understand how your mental model works.
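
To put rough numbers on the idle-vs-loaded argument, here is a minimal Python sketch. It assumes a simple two-state model in which a GPU draws either idle or full-load power. The 2 kW load figure is the one quoted upthread; the 50 W idle draw, the 1 second of GPU time per request, and the one-hour window are made-up placeholders, not measurements. It also ignores batching, where several requests share the same GPU time.

```python
IDLE_WATTS = 50.0        # assumption: stand-in for "almost nothing"
LOAD_WATTS = 2000.0      # the "over 2 kW" figure quoted upthread
REQUEST_SECONDS = 1.0    # assumption: GPU time one request occupies
WINDOW_SECONDS = 3600.0  # assumption: one-hour accounting window


def gpu_energy_joules(busy_seconds, total_seconds=WINDOW_SECONDS):
    """Energy for one GPU that is busy for busy_seconds and idle otherwise."""
    idle_seconds = total_seconds - busy_seconds
    return IDLE_WATTS * idle_seconds + LOAD_WATTS * busy_seconds


def marginal_energy_joules(requests_already_served):
    """Extra energy caused by serving one more request in the same window."""
    before = gpu_energy_joules(requests_already_served * REQUEST_SECONDS)
    after = gpu_energy_joules((requests_already_served + 1) * REQUEST_SECONDS)
    return after - before


if __name__ == "__main__":
    for served in (0, 1000, 3000):
        print(served, marginal_energy_joules(served))
```

Under these assumptions the extra energy per request is (load power - idle power) x request time, no matter how busy the GPU already is. Each request moves some time from the idle bucket to the loaded bucket, which is the sense in which idle % depends on whether you make the request.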