by twoodfin 55 days ago
The main operational expense of a million LLM tokens is pennies of electricity.

Even if you generously depreciate the GPU and other hardware, it’s hard to believe inference at scale in April 2026 isn’t highly profitable.


> The main operational expense of a million LLM tokens is pennies of electricity.

I think you meant dollars of electricity.

I don’t think so.

https://www.theregister.com/2024/03/18/nvidia_turns_up_the_a...

A Blackwell 8X node consumes about 15 kW; let’s up that to 50 kW to generously account for cooling and everything else.

A US kWh is something like $0.20, so running that node for an hour costs ~$10.

Nvidia got 30,000 parallel TPS out of DeepSeek-R1 on that node:

https://developer.nvidia.com/blog/nvidia-blackwell-delivers-...

At 30,000 TPS, an hour of runtime is about 108M tokens (30,000 × 3,600 s), so that $10 buys you over 100M tokens or … pennies per million.
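
For anyone who wants to check the arithmetic, here is a rough sketch of the same back-of-envelope calculation. The inputs are just the figures quoted in this comment (50 kW padded node power, $0.20/kWh, 30,000 TPS); the function name is made up for illustration, and it covers electricity only, not hardware depreciation.

```python
# Back-of-envelope electricity cost per million tokens.
# All inputs are the rough figures from the comment above, not measurements.
NODE_POWER_KW = 50.0        # ~15 kW node, padded to 50 kW for cooling and overhead
PRICE_PER_KWH_USD = 0.20    # approximate US electricity price used in the comment
TOKENS_PER_SECOND = 30_000  # aggregate throughput quoted for DeepSeek-R1 on the node


def electricity_cost_per_million_tokens(power_kw, price_per_kwh, tokens_per_second):
    """Hypothetical helper: USD of electricity per one million generated tokens."""
    cost_per_hour = power_kw * price_per_kwh      # kW * $/kWh = $/hour
    tokens_per_hour = tokens_per_second * 3600    # tokens/s * s/hour
    return cost_per_hour / (tokens_per_hour / 1_000_000)


cost_per_hour = NODE_POWER_KW * PRICE_PER_KWH_USD
tokens_per_hour = TOKENS_PER_SECOND * 3600
cost_per_million = electricity_cost_per_million_tokens(
    NODE_POWER_KW, PRICE_PER_KWH_USD, TOKENS_PER_SECOND
)

print(f"Electricity per hour: ${cost_per_hour:.2f}")
print(f"Tokens per hour: {tokens_per_hour:,}")
print(f"Electricity per million tokens: ${cost_per_million:.3f}")
```

With these inputs it comes out to roughly nine cents per million tokens. Swap in your own power, price, or throughput numbers to see how far off the estimate would have to be to reach dollars.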

I’m sure these numbers are off, but not by an aggregate two orders of magnitude.