by margalabargala 53 days ago
Extensively discussed elsewhere in this thread. Just start at the top and start reading comments.

Can you summarise? I only reached your comment after scrolling past all the others and I still don't have the answer.

Is the new data that models are more useful for coding than they once were?

Cost of tokens goes down over time. Like by a lot. And it will continue to do so.

Imagine being in 2003 and saying compute costs won’t go down. That’s Ed lol.

EDIT: Some quick research on this so you guys have actual numbers: https://gist.github.com/dwaltrip/a037be938d2b5ecc8b8b238736e....

There are multiple separate angles that all contribute to token costs going down: chip improvements, engineering improvements for running inference in general, AI architecture and training advances that give similar intelligence in a smaller model, improvements in the quality of the training data, data center design / economies of scale, networking and rack-level improvements that are multiplicative with chip advancements, and so on...

If you analyze the situation for 5 minutes, it's blindingly obvious that price-per-token will continue to improve. And there's a very similar case for intelligence-per-token as well.

And don't get me wrong -- I have many concerns about how this is all unfolding and how it will impact society. But let's get our basic facts straight.

I read through some of the sources in your link, but they don't paint as clear a picture as you claim. Yes, the cost of inference appears to be coming down, but so far we don't really know why that is or what the largest contributing factors are. With other costs rising (e.g. the cost of training, the cost of inference scaling with number of parameters, and reasoning models using more and more tokens), we can't yet make any certain claims about long-term economic viability. There just isn't enough data yet.

Taking a look at the sources in your link, MIRI's "Observations About LLM Inference Pricing" report [0] seems like one of the least biased ones (forgive me if I don't believe everything a16z has to say about the economics of AI).

Some choice quotes from the report:

"Imagine you went to the gas station and the price was $4.00, and you look across the street at another gas station and the price is $40.00 — that’s basically the situation we currently see with LLM inference."

"Overall, LLMs do not appear to be priced like other commodities."

"It's possible that some providers are slightly modifying a model that they are serving for inference, for example by quantizing some of the computation"

"Unfortunately, it is difficult to make strong conclusions about the underlying costs of LLM inference because prices range substantially across providers. The data used in this analysis is narrow, so I recommend against coming to strong conclusions solely on its basis."

Another source you linked, Don't Panic Labs [1], seems to agree with Zitron:

"It is a little unclear as to why the price per token is dropping, and I am still a little worried that the price per token will, at some point, go up."

"According to another graph at Epoch AI, the cost to train a model doubles every eight months. This tends to align with the common wisdom that we are getting a really good deal right now, while everyone is fighting to build market share."

If your inference costs come down due to quantization, that doesn't count, since you're cutting costs by offering a worse service, and there's only so much of that you can do before your customers walk away. If your inference costs come down due to subsidization, that doesn't count either, since that obviously won't last forever. If your inference costs come down but your training costs double every eight months, that poses a significant problem for your business. If your argument to that is "training costs won't continue to increase at this rate forever", well, inference costs won't continue to come down at this rate forever, either.
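
To put the eight-month doubling figure in perspective, here is a rough sketch of the compounding arithmetic. The function name and the time horizons are just my own illustration; the only input taken from the sources is the eight-month doubling period quoted above, and it assumes that trend simply continues.

```python
def growth_factor(months: float, doubling_period_months: float = 8.0) -> float:
    """Cost multiplier after `months`, if cost doubles every `doubling_period_months`.

    Hypothetical helper for illustration; the 8-month default is the
    doubling period quoted from the Epoch AI graph, nothing more.
    """
    return 2.0 ** (months / doubling_period_months)


# Print the implied multiplier on training cost over a few horizons.
for years in (1, 2, 4):
    print(f"{years} year(s): x{growth_factor(12 * years):.1f}")
```

If the trend held, that works out to roughly 2.8x per year and 64x over four years. Whether it holds is exactly the open question.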

From what I can tell, there still isn't enough data to draw a strong conclusion either way.

[0]: https://techgov.intelligence.org/blog/observations-about-llm...

[1]: https://dontpaniclabs.com/blog/post/2025/12/02/the-price-per...

That sounds like a reading comprehension skill issue? In which case I don't see why me summarizing would move the needle.

But if it helps, no, the data being discussed concerns the economics of running inference and R&D; it has nothing to do with the utility of models for coding.

Yours is the first from the top to mention this. You might want to consider the physical location of your comment before telling people to read the thread. We could do without the rudeness, too.