| HN Mirror


by dust42 160 days ago

If I followed the links correctly this benchmark was made on a 16xH200. At current prices I'd assume that is a system price of around $750,000.

A year has 86400*365 = 31536000 seconds. At roughly 2,000 tokens per second, 63072000000 tokens can be generated per year. As pricing is usually given per 1M tokens generated, this is 63072 such packages.

Now let's write off the investment over 3 years, i.e. $250,000 per year: 250,000/63072 = 3.96. So almost $4 per 1M tokens generated, with prompt processing included.

The model was a DeepSeek 671B MoE with 37B active parameters.

Looks to me that $20 for a month of coding is not very sustainable - let's enjoy the party while VCs are financing it! And keep an eye on your consumption...

Electricity costs seem negligible at ~$10,000 per year at 10cts per kWh, but the overall cost would be ~10% higher if electricity costs more like 30cts, as it does in Europe.

Edit: as other commenters pointed out, it is 2200 t/s per single GPU, so the result needs to be divided by 16: $4/16 = $0.25. This actually somewhat matches the DeepSeek API pricing.
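The arithmetic in this comment, including the per-GPU correction from the edit, can be reproduced with a short sketch. The function name and parameters are hypothetical. The $750,000 system price, the 3-year write-off, and the throughput figures are the commenter's assumptions, and full utilization is assumed throughout.

```python
SECONDS_PER_YEAR = 86_400 * 365  # 31,536,000

def hardware_cost_per_million(system_price_usd, writeoff_years, tokens_per_second):
    """Depreciation cost in USD per 1M generated tokens at full utilization."""
    yearly_cost = system_price_usd / writeoff_years
    millions_per_year = tokens_per_second * SECONDS_PER_YEAR / 1_000_000
    return yearly_cost / millions_per_year

# Original reading: ~2,000 tokens/s for the whole 16-GPU system.
whole_system = hardware_cost_per_million(750_000, 3, 2_000)

# Corrected reading: 2,200 tokens/s per GPU, times 16 GPUs.
per_gpu_reading = hardware_cost_per_million(750_000, 3, 2_200 * 16)

print(f"{whole_system:.2f}")     # 3.96
print(f"{per_gpu_reading:.3f}")  # 0.225
```

The second figure is the 22.5 cents per 1M tokens that other commenters arrive at.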

6 comments

menaerus 159 days ago

How did you arrive at the $10,000 electricity cost figure?

8xH200 enclosed in a DGX H200 system draws ~14kW at its peak (CTS) configuration/utilization. Over one year, assuming maximum utilization, this is 123,480 kWh per DGX H200 unit. We need two such units for the 16xH200 configuration under discussion, so it's 246,960 kWh/year. This is ~$25,000 at 10cts per kWh and ~$74,000 at 30cts per kWh. At ~1,110,000 batches of 1M tokens per year, this gives us: (1) ~$0.02 - $0.07 per 1M tokens in energy cost and (2) ~$0.25 per 1M tokens assuming the same HW depreciation rate. In total, this is ~$0.3 per 1M tokens.

Seems sustainable?
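A minimal sketch of the energy estimate above, assuming exactly 14 kW per DGX H200 unit, constant peak draw all year, and the 2,200 tokens/s per GPU figure from the thread. The helper name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760

def yearly_energy_cost(power_kw, price_per_kwh):
    """Yearly electricity cost in USD at constant power draw."""
    return power_kw * HOURS_PER_YEAR * price_per_kwh

# Two DGX H200 units at ~14 kW peak each (figure taken from the comment).
system_kw = 2 * 14.0

# 2,200 tokens/s per GPU on 16 GPUs, expressed in millions of tokens per year.
millions_per_year = 2_200 * 16 * 86_400 * 365 / 1_000_000

for price in (0.10, 0.30):
    cost = yearly_energy_cost(system_kw, price)
    per_million = cost / millions_per_year
    print(f"{price:.2f} USD/kWh: {cost:,.0f} USD/year, {per_million:.3f} USD per 1M tokens")
```

With exactly 14 kW per unit the yearly total is 245,280 kWh, slightly below the 246,960 kWh quoted; the rounded results (~$25,000 and ~$74,000 per year, roughly $0.02 to $0.07 per 1M tokens) are unaffected.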

dust42 159 days ago

I used 700W per H200 = 11.2 kW for 16 GPUs. I didn't include the CPUs and the rest of the rack, so yours is a better approximation.

One has to keep in mind that the benchmark that was done is synthetic. This makes sense because it makes it reproducible but real world usage may differ - i.e. by the amount of context and the number of concurrent users. Also there are use cases where smaller models or smaller quants will do.

The key takeaway for me from this type of back-of-the-envelope calculation is a good idea of where we stand long term, i.e. when VC money stops subsidizing prices.

So for me $0.3 per 1M tokens for a decent model looks pretty good too. Seeing that OpenAI API charges $21 per 1M tokens input and $168 output for GPT-5.2 pro I was wondering what the real sustainable pricing is.

yorwba 160 days ago

It's 2.2k tokens per second per GPU, so you have to multiply the token output by 16, and the price per million tokens works out to 22.5 cents.

aurareturn 159 days ago

I think they're also running this at 16-bit precision. If they lower it to 8-bit, they might double their output, which might come out to 11 cents per million tokens.

Now take into account that modern LLMs tend to use 4-bit inference and that Blackwell is significantly more optimized for 4-bit, and we could see much less than 11 cents. Maybe a speedup of 5x if using 4-bit and Blackwell vs H100 and 8-bit?

So we're looking at potentially 2.2 cents per million tokens.
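The scaling in this comment is a plain division of the per-token cost by an assumed throughput multiplier. The sketch below uses the thread's 22.5 cents as the baseline; the 2x and 5x factors are the commenter's guesses, not benchmark results, and the variable names are hypothetical.

```python
# Baseline from the thread: 22.5 cents per 1M tokens at 16-bit on H200.
base_cents = 22.5

# Assumption: 8-bit inference doubles throughput, halving the cost.
fp8_cents = base_cents / 2

# Assumption: 4-bit on Blackwell gives a further 5x over 8-bit on Hopper.
fp4_blackwell_cents = fp8_cents / 5

print(fp8_cents)            # 11.25
print(fp4_blackwell_cents)  # 2.25
```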

Palmik 159 days ago

APIs are usually very profitable. As for subscriptions, it would depend on how many tokens the average subscriber uses per month. Do we have a source of info on this?

Some notes:

- The number of input tokens and output tokens per request matters a lot.

- KV cache hit rate matters a lot.

- vLLM is not necessarily the most efficient engine.

- You are looking at the API cost for DeepSeek V3.2, which is much cheaper than DeepSeek R1 / V3 / V3.1. DeepSeek V3.2 is a different architecture (sparse attention) that is much more efficient. The cheapest DeepSeek V3 option (fp8) tends to be ~$1/mil output tokens, while R1 tends to be ~$2.5/mil (note that, for example, Together AI charges a whopping $7/mil output tokens for R1!)

As for the cost: You can also get H200s for ~ $1.6/hr and H100s for ~ $1.2/hr. That somewhat simplifies the calculations :)

Ignoring the caveats and assuming H200s, with their 16-GPU setup you will, in one hour:

- Process 403200000 input tokens.

- Generate 126720000 output tokens.

- Spend $25.6.

- On Together with DS R1 it would cost you $3 * 403.2 + $7 * 126.7 = ~$2096. Together does not even offer a discount for KV cache hits (what a joke :)).

- On NovitaAI with DS R1 it would cost you $0.7 * 403.2 + $2.5 * 126.7 = ~$600 (with perfect cache hit rate, which gives 50% discount on input tokens here, it would be ~$458).
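The comparison in this list can be reproduced as follows. This is a sketch under stated assumptions: the one-hour window is inferred from $25.6 = 16 GPUs * $1.6/hr, the per-1M-token prices are the ones quoted in the comment, and the function and variable names are hypothetical.

```python
# Token totals from the comment, in millions of tokens.
input_m = 403_200_000 / 1_000_000   # 403.2
output_m = 126_720_000 / 1_000_000  # 126.72

def api_cost(input_price, output_price, cache_discount=0.0):
    """API bill in USD, given per-1M-token prices and an input-token discount."""
    return input_m * input_price * (1 - cache_discount) + output_m * output_price

# Renting 16 H200s at ~$1.6 per GPU-hour for one hour.
rental = 16 * 1.6

together = api_cost(3.0, 7.0)
novita = api_cost(0.7, 2.5)
novita_cached = api_cost(0.7, 2.5, cache_discount=0.5)  # perfect cache hit rate

print(f"rental: {rental:.1f}")
print(f"Together: {together:.0f}, NovitaAI: {novita:.0f}, NovitaAI cached: {novita_cached:.0f}")
```

The results match the comment's figures to within rounding; the comment rounds the output volume to 126.7 million tokens before multiplying.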

kicks66 160 days ago

I think you missed something here - it's 2.2k tokens _per_ GPU

So if you work that through, it's $0.225 per 1M output tokens.

supermatt 160 days ago

> more like 30cts like it is in Europe

Nope - I live in one of the most expensive areas, and even the residential price has averaged 18c/kWh delivered including taxes. Businesses get a lower basic rate and also don't pay the VAT, so it works out to around 13c/kWh for them.

https://data.nordpoolgroup.com/auction/day-ahead/prices?deli...

t0mas88 160 days ago

That's excluding tax; net prices around 0.20-0.30 EUR/kWh are common.

supermatt 160 days ago

I updated my comment to include my personal delivered rate including VAT - also note that businesses (like a data center) don't pay the VAT and have substantially reduced delivery fees at high voltage

ffsm8 160 days ago

Then you're living in one of the cheapest areas for electricity prices in Europe, the opposite of what you said.

https://ec.europa.eu/eurostat/statistics-explained/index.php...

Scroll a little down and you see a breakdown by country

E.g.

https://ec.europa.eu/eurostat/statistics-explained/index.php...

supermatt 159 days ago

I am in Lithuania, which has one of the highest wholesale energy prices in Europe (as per nord pool): https://data.nordpoolgroup.com/auction/day-ahead/prices?deli...

That it is not translating into a higher cost to the consumer (as evidenced on your link) is likely indicative of other costs being incurred by the “average” consumer in those countries with a higher domestic rate - like massive markup from users being tied into inflated contracts due to the 2022 shock where rates across Europe were more than double what they are now.

Also, these are residential prices - business prices are usually much lower (wholesale discounts, subsidies, no VAT, lower delivery charges).

As per my response to the initial comment - there is no way a datacentre in Europe is paying 30c/kWh

ffsm8 159 days ago

Business prices should be figure 6 in my link. While the difference is a lot smaller there, Lithuania is definitely one of the cheaper countries, beating the EU average slightly.

> As per my response to the initial comment - there is no way a datacentre in Europe is paying 30c/kWh

Hetzner prices it at 33c/kWh as of last year, I believe; previously it was 40c (after the pipeline was destroyed).

But Germany is pretty much among the 3 most expensive countries in the EU for electricity, both for consumer and commercial pricing.

menaerus 159 days ago

Some countries also employ progressive electricity pricing such that higher energy consumption leads to elevated kWh rates incentivizing conservation. This is also not visible in the stats above. I also think that business kWh rates are actually higher than for the households in some instances.

bonoboTP 159 days ago

Net is excluding tax, you mean gross.

glemion43 159 days ago

Private / end-consumer pricing in Germany is 34c/kWh.

edf13 160 days ago

> let's enjoy the party while VCs are financing it!

The VC money is there until they can solve the optimization problems