by elpocko 367 days ago
Okay but these tiny models are being used by people and businesses instead of GPT-4. My point was that they consume less energy per user than a rig used for gaming.

I have no insight into how many GPT-4 users are served per GPU, but I would assume OpenAI heavily optimizes for that, considering the cost to run that thing. It's probably in the same ballpark: hundreds to thousands of concurrent user requests per GPU. That's still better than one GPU per gamer, even if each of those GPUs requires 10x the energy.
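
To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it is an assumption for illustration, not a measurement: the 300 W gaming GPU is a made-up round number, the 10x multiplier and the 100 to 1000 users per GPU are just the guesses from above, and `power_per_user_watts` is a hypothetical helper.

```python
def power_per_user_watts(gpu_power_watts: float, users_per_gpu: int) -> float:
    """Share of one GPU's power draw attributed to each concurrent user."""
    if users_per_gpu <= 0:
        raise ValueError("users_per_gpu must be positive")
    return gpu_power_watts / users_per_gpu


# Assumed, illustrative numbers only.
GAMING_GPU_WATTS = 300.0   # hypothetical gaming rig GPU, one user per GPU
SERVER_MULTIPLIER = 10.0   # "even if it requires 10x the energy"

gamer = power_per_user_watts(GAMING_GPU_WATTS, 1)

for users in (100, 1000):  # "hundreds to thousands" of requests per GPU
    served = power_per_user_watts(GAMING_GPU_WATTS * SERVER_MULTIPLIER, users)
    print(f"{users} users/GPU: {served:.1f} W per user vs {gamer:.1f} W per gamer")
```

Under these assumptions the shared GPU comes out ahead per user as long as it serves more than 10 users at once, which is the whole point: the 10x penalty is small next to the sharing factor.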