by surajkumar5050 140 days ago

I think two things are getting conflated in this discussion.

First: marginal inference cost vs total business profitability. It’s very plausible (and increasingly likely) that OpenAI/Anthropic are profitable on a per-token marginal basis, especially given how cheap equivalent open-weight inference has become. Third-party providers are effectively price-discovering the floor for inference.

Second: model lifecycle economics. Training costs are lumpy, front-loaded, and hard to amortize cleanly. Even if inference margins are positive today, the question is whether those margins are sufficient to pay off the training run before the model is obsoleted by the next release. That’s a very different problem than “are they losing money per request”.

Both sides here can be right at the same time: inference can be profitable, while the overall model program is still underwater. Benchmarks and pricing debates don’t really settle that, because they ignore cadence and depreciation.

IMO the interesting question isn’t “are they subsidizing inference?” but “how long does a frontier model need to stay competitive for the economics to close?”
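To make the "how long does it need to stay competitive" question concrete, here is a minimal sketch of the payback arithmetic. All figures are hypothetical placeholders, not reported numbers for any lab, and the function names are made up for illustration. It assumes a flat monthly revenue and a constant gross margin on inference, which real models will not have.

```python
import math


def months_to_break_even(training_cost, monthly_revenue, gross_margin):
    """Months of inference needed to pay back one training run.

    gross_margin is the fraction of revenue left after serving costs.
    Returns math.inf when inference itself does not make money.
    """
    monthly_contribution = monthly_revenue * gross_margin
    if monthly_contribution <= 0:
        return math.inf
    return math.ceil(training_cost / monthly_contribution)


def program_closes(training_cost, monthly_revenue, gross_margin, competitive_months):
    """True if the model pays for its training before it is obsoleted."""
    payback = months_to_break_even(training_cost, monthly_revenue, gross_margin)
    return payback <= competitive_months


# Hypothetical inputs: $1B training run, $100M/month revenue, 50% margin.
payback = months_to_break_even(1_000_000_000, 100_000_000, 0.5)
print(payback)                                                # 20
print(program_closes(1_000_000_000, 100_000_000, 0.5, 12))    # False
print(program_closes(1_000_000_000, 100_000_000, 0.5, 24))    # True
```

With these made-up inputs, inference is profitable on every request, yet the program only closes if the model stays competitive for 20 months or more.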

6 comments

jmalicki 140 days ago

I suspect they're marginally profitable on API cost plans.

But the max 20x usage plans I am more skeptical of. When we're getting used to paying $200 or $400 per developer for aggressive AI-assisted coding, what happens when those costs go up 20x? What is now $5k/yr to keep a Codex and a Claude super busy and do efficient engineering suddenly becomes $100k/yr... Will the costs come down before then? Is the current "vibe-coding renaissance" sustainable in that regime?
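The arithmetic behind the $5k/yr and $100k/yr figures can be sketched as follows. This assumes the $200 and $400 figures are monthly prices per developer, which the comment does not state explicitly; `annual_cost` is a hypothetical helper.

```python
def annual_cost(monthly_price, multiplier=1):
    """Yearly spend per developer for a monthly plan, optionally scaled."""
    return monthly_price * 12 * multiplier


# Assumed: $400/month per developer (e.g. two $200 plans).
print(annual_cost(400))                  # 4800, roughly the "$5k/yr"
print(annual_cost(400, multiplier=20))   # 96000, roughly the "$100k/yr"
```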

slopusila 140 days ago

After the models get good enough to replace coders, they will be able to start raising subscription prices again.

jmalicki 140 days ago

At $100k/yr the joke that AI means "actual Indians" starts to make a lot more sense... it is cheaper than the typical US SWE, but more than a lot of global SWEs.

HPMOR 140 days ago

No - because the AI will be superhuman. No human even at $1mm a year would be competitive with a $100k/yr corresponding AI subscription.

See, people get confused. They think you can charge __less__ for software because it's automation. The truth is you can charge MORE, because it's high quality and consistent, once the output is good. Software is worth MORE than a corresponding human, not less.

jmalicki 139 days ago

I am unsure if you're joking or not, but you do have a point. But it's not about quality; it's about supply and demand. There are a ton of variables moving at once here and who knows where the equilibrium is.

skeptic_ai 139 days ago

If we have 2-3 competitors and open-source ones that are 90% there, I think it’s hard to get such big margins.

raincole 140 days ago

> the interesting question isn’t “are they subsidizing inference?”

The interesting question is if they are subsidizing the $200/mo plan. That's what is supporting the whole vibecoding/agentic coding thing atm. I don't believe Claude Code would have taken off if it were token-by-token from day 1.

(My baseless bet is that they are, but not by much, and the price will eventually rise by perhaps 2x but not 10x.)

BosunoB 140 days ago

Dario said this in a podcast somewhere. The models themselves have so far been profitable if you look at their lifetime costs and revenue. Annual profitability just isn't a very good lens for AI companies because costs all land in one year and the revenue all comes in the next. Prolific AI haters like Ed Zitron make this mistake all the time.
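A toy illustration of the annual-versus-lifetime lens, under loudly stated assumptions: each model's training cost lands in one year, all of its revenue lands in the following year, every model earns 2x its training cost, and each generation costs 3x the previous one. None of these numbers come from the podcast; they are placeholders in arbitrary units.

```python
def lifetime_profit(train_costs, revenue_multiple):
    """Profit per model over its whole life (revenue minus training cost)."""
    return [cost * revenue_multiple - cost for cost in train_costs]


def annual_profit(train_costs, revenue_multiple):
    """Company-wide profit per calendar year.

    Model i is trained in year i and earns all its revenue in year i + 1.
    """
    pnl = [0] * (len(train_costs) + 1)
    for year, cost in enumerate(train_costs):
        pnl[year] -= cost                          # training bill this year
        pnl[year + 1] += cost * revenue_multiple   # revenue arrives next year
    return pnl


costs = [1, 3, 9]  # hypothetical training costs, growing 3x per generation
print(lifetime_profit(costs, 2))  # [1, 3, 9]
print(annual_profit(costs, 2))    # [-1, -1, -3, 18]
```

Every model is profitable over its lifetime, yet every year in which a bigger successor is being trained shows a loss. The annual view only turns positive in the final year because the toy example stops training.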

jmalicki 140 days ago

Do you have a specific reference? I'm curious to see hard data and models.... I think this makes sense, but I haven't figured out how to see the numbers or think about it.

BosunoB 140 days ago

I was able to find the podcast. Question is at 33:30. He doesn't give hard data but he explains his reasoning.

https://youtu.be/mYDSSRS-B5U

majewsky 139 days ago

> He doesn't give hard data

And why is that? Should they not be interested in sharing the numbers to shut up their critics, esp. now that AI detractors seem to be growing mindshare among investors?

lilytweed 140 days ago

In his recent appearance on NYT Dealbook, he definitely made it seem like inference was sustainable, if not flat-out profitable.

https://www.youtube.com/live/FEj7wAjwQIk

rstuart4133 140 days ago

> It’s very plausible (and increasingly likely) that OpenAI/Anthropic are profitable on a per-token marginal basis

There are many places that will not use models running on hardware provided by OpenAI / Anthropic. That is true of my (the Australian) government at all levels. They will only use models running in Australia.

Consequently AWS (and I presume others) will run models supplied by the AI companies for you in their data centres. They won't be doing that at a loss, so the price will cover marginal cost of the compute plus renting the model. I know from devs using and deploying the service that demand outstrips supply. Ergo, I don't think there is much doubt that they are making money from inference.

deaux 139 days ago

> Consequently AWS (and I presume others) will run models supplied by the AI companies for you in their data centres. They won't be doing that at a loss, so the price will cover marginal cost of the compute plus renting the model.

This says absolutely nothing.

Extremely simplified example: let's say Sonnet 4.5 really costs $17/1M output for AWS to run yet it's priced at $15. Anthropic will simply have a contract with AWS that compensates them. That, or AWS is happy to take the loss. You said "they won't be doing that at a loss" but in this case it's not at all out of the question.

Whatever the case, that it costs the same on AWS as directly from Anthropic is not an indicator of unit economics.
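The simplified example above can be written out as a small sketch. The $17 cost is the commenter's hypothetical, not a known figure, and the token volume and function names are invented for illustration.

```python
def margin_per_million(price, cost):
    """Provider's margin per 1M output tokens; negative means a loss."""
    return price - cost


def required_compensation(price, cost, million_tokens):
    """Side payment needed to make the hosting provider whole, if any."""
    shortfall = max(0, cost - price)
    return shortfall * million_tokens


# Hypothetical: list price $15/1M, true serving cost $17/1M, 1,000M tokens served.
print(margin_per_million(15, 17))            # -2
print(required_compensation(15, 17, 1000))   # 2000

# Same list price, but serving is actually cheap: no compensation needed.
print(required_compensation(15, 5, 1000))    # 0
```

Both scenarios show the customer the same $15 list price, which is why the public price alone cannot distinguish them.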

waffletower 140 days ago

In the case of Anthropic -- they host on AWS while their models are also accessible via AWS APIs, so the infrastructure between the two is likely to be considerably shared, particularly as caching configuration and API limitations are near identical between the Anthropic API and Bedrock APIs invoking Anthropic models. It is likely a mutually beneficial arrangement which does not necessarily hinder Anthropic revenue.

freakynit 139 days ago

Genuine question: Given Anthropic's current scale and valuation, why not invest in owning data centers in major markets rather than relying on cloud providers?

Is the bottleneck primarily capex, long lead times on power and GPUs, or the strategic risk of locking into fixed infrastructure in such a fast-moving space?

barrell 139 days ago

> It’s very plausible (and increasingly likely) that OpenAI/Anthropic are profitable on a per-token marginal basis

Can you provide some numbers/sources please? Any reporting I’ve seen shows that frontier labs are spending ~2x as much on inference as they are making.

Also, making the same query on a smaller provider (e.g. Mistral) will cost the same amount as on a larger provider (e.g. gpt-5-mini from OpenAI), despite the query taking 10-100x longer on OpenAI.

I can only imagine that is OpenAI subsidizing the spend. GPU time for inference is paid for by the second. Either that or OpenAI hasn’t figured out how to scale, but I find that much less likely.

w10-1 140 days ago

"how long does a frontier model need to stay competitive"

Remember "worse is better". The model doesn't have to be the best; it just has to be mostly good enough, and used by everyone -- i.e., where switching costs would be higher than any increase in quality. Enterprises would still be on Java if the operating costs of native containers weren't so much cheaper.

So it can make sense to be ok with losing money with each training generation initially, particularly when they are being driven by specific use-cases (like coding). To the extent they are specific, there will be more switching costs.