by _flux 3 hours ago
I think this comes from the idea that serving these tokens without paying for training is already expensive; e.g. (see https://news.ycombinator.com/item?id=46613887) a self-hosted solution might only be 10-100x more affordable at cost.

So, given that the SOTA providers, with even larger models, also need to continuously spend considerable resources on training their next models, fund future data centers, and make a profit, the token costs are more likely to reflect the real costs than the subscription costs are.


Except there are plenty of inference providers worldwide (including the US) that serve open-weight models that are not subsidized, and are reasonable in cost. Or is your claim that those are all running at a loss?

So they do not train models, and in addition their models are expected to be smaller than SOTA models, although we cannot know for sure by how much.

So what's the price difference, 3000x?

My comment is about your statement "serving these tokens without paying for training is already expensive"...

One thing we do know from OpenAI's leaked financial document is that they are already profitable on inference, though that data is not broken down by cost and revenue of API vs. subscription. One important factor is that subscription inference can be optimized in ways that reduce cost (e.g., usage limits, batch optimization around API-prioritized inference, etc.). I think we simply do not know the actual cost of subscription inference for SOTA models.