Hacker News
by onlyrealcuzzo 27 days ago
Well, AI costs are definitely going to go down at least 90% in the next ~18 months for the same quality of output (and probably 90% again in the 24 months after).

Are you sure it's going to make sense to pay someone to do that moving forward?

I don't think it's worth it now, by the way.

It's definitely not going to be worth it in the near future.

Can we even blink for $0.002? What happens when the next 90% increase in efficiency happens??
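
To put rough numbers on that, here is a minimal sketch of the compounding. It assumes the $0.002 figure is a per-unit price and that a "90% increase in efficiency" means the price falls by 90%; both are assumptions made for illustration, and `price_after_drops` is a hypothetical helper name.

```python
def price_after_drops(price, drops):
    """Apply successive fractional price drops (0.90 means a 90% reduction)."""
    for drop in drops:
        price *= (1.0 - drop)
    return price

start = 0.002  # dollars; the figure quoted above, assumed to be a per-unit price

after_one = price_after_drops(start, [0.90])        # roughly $0.0002
after_two = price_after_drops(start, [0.90, 0.90])  # roughly $0.00002
total_reduction = 1.0 - after_two / start           # two 90% drops compound to about 99%

print(f"after one 90% drop:  ${after_one:.5f}")
print(f"after two 90% drops: ${after_two:.6f}")
print(f"cumulative reduction: {total_reduction:.0%}")
```

Two 90% drops in a row leave 1% of the original price, so the 18-month and 24-month steps together would be a roughly 99% reduction over about 42 months, if the prediction holds.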

4 comments

> Well, AI costs are definitely going to go down at least 90% in the next ~18 months for the same quality of output (and probably 90% again in the 24 months after)

As far as I can see, token costs have been steadily increasing over the past few months, so I’m not sure it’s warranted to buy the hype that another 90% cost reduction is just around the corner.

Doesn’t seem like token costs, specifically, are increasing.

Opus cut its token pricing by 66% six months ago. Before that, it had held the higher price consistently for a year and a half (since that model’s launch).

GPT’s latest model is harder to track since it’s not named, but its pricing is in line with its history.

Not to mention what’s happening with other models like DeepSeek, GLM, and Kimi.

It seems to me the bigger change in costs is based on token appetite. People are discovering agentic capabilities are stronger than they used to be and use cases have broadened because of that. They’ll eventually discover too that these alternative models offer 95% of the intelligence at 20% of the price.

Is the price reduction in the room with us right now?
On local models where the only ongoing cost is power (after the initial hardware cost), it makes sense. My work is building this out and I think it's solid. But until we can use our own hardware and local models, the long-term cost is a big question mark.
Would like to know where that 90% number comes from, and if it matches the historical trend.
https://www.reddit.com/r/LocalLLaMA/comments/1gpr2p4/llms_co...

See Chart 13 here: https://www.rdworldonline.com/ais-great-compression-20-chart...

See here: https://epoch.ai/data-insights/llm-inference-price-trends

LLMs are so comically inefficient compared to the human brain that it is pretty easy to imagine this trend continuing for several more 90% drops.

If LeCun's JEPA or GRAM turn out to be a thing, we could see a drop of 3-4 orders of magnitude in a single release cycle / generation.

Keep in mind that performance per watt on the hardware side is, at the same time, still doubling every ~24 months, and this doesn't factor that in.
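
For a rough sense of how the two trends would stack, here is a back-of-the-envelope sketch. It assumes the 90%-per-18-months price trend and the 2x-per-24-months performance-per-watt trend are independent and multiply, and that performance per watt passes through to price one-for-one. Those are simplifying assumptions, not established facts (energy is only one part of inference cost), and `annual_factor` is a hypothetical helper name.

```python
import math

def annual_factor(total_factor, months):
    """Convert an improvement factor achieved over `months` into a per-year factor."""
    return total_factor ** (12.0 / months)

# 90% cheaper in 18 months is a 10x improvement over that period.
software_per_year = annual_factor(10.0, 18)   # about 4.6x per year
# Performance per watt doubling every 24 months.
hardware_per_year = annual_factor(2.0, 24)    # about 1.4x per year

# Assumption: the two effects are independent, so the factors multiply.
combined_per_year = software_per_year * hardware_per_year  # about 6.6x per year

# Months needed for the combined trend to deliver a 10x (90%) reduction.
months_to_10x = 12.0 * math.log(10.0) / math.log(combined_per_year)

print(f"price trend alone:    {software_per_year:.2f}x per year")
print(f"hardware trend alone: {hardware_per_year:.2f}x per year")
print(f"combined:             {combined_per_year:.2f}x per year")
print(f"months per 90% drop:  {months_to_10x:.1f}")
```

Under those assumptions the hardware trend only shortens each 90% drop from 18 months to a bit under 15, so the price trend would still be doing most of the work.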