| HN Mirror


	by twoodfin 65 days ago
	From the limited perspective of software development, today’s models are well-worth their per-token cost. This reads to me like Anthropic anticipating demand and making a commitment to acquire supply. Not unlike airlines committing to future jet fuel purchases, or Apple committing to future DRAM volume.

3 comments

an0malous 65 days ago

> From the limited perspective of software development, today’s models are well-worth their per-token cost.

At the current price or real price? Anthropic said a $200 subscription can cost them $5000 so the real price could be anywhere from 10-30x the current price.
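For reference, the multiple implied by the two figures in this comment can be checked with a quick sketch. The function name is hypothetical, and the $5000 figure is treated as a worst case for a single subscriber, not an average.

```python
def subsidy_multiple(provider_cost: float, subscription_price: float) -> float:
    """Return how many times the provider's cost exceeds the subscription price."""
    if subscription_price <= 0:
        raise ValueError("subscription_price must be positive")
    return provider_cost / subscription_price


# Figures quoted in the comment: a $200 plan that can cost $5000 to serve.
worst_case_multiple = subsidy_multiple(5000, 200)
print(f"{worst_case_multiple:.0f}x")  # prints "25x"
```

So the quoted worst case sits inside the 10-30x range, toward the upper end.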

RealityVoid 65 days ago

No, that is probably one of the worst cases they saw. Most likely the subscription inference cost is much lower than you expect. If you look at costs for similar open models, they are much lower than what you get by buying from Anthropic, so that is the real cost basis I expect.

It's likely Amazon is making a fucking killing though.

SlinkyOnStairs 65 days ago

While $5000 is a lot, the people who rack up close to or just over a thousand "API equivalent cost" are pretty common.

> Most likely the subscription inference cost is much lower than you expect.

This is probably not true because they'd be screaming it off every rooftop were that the case.

Same deal with the API inference. Even the "profitable on inference" claim is sourced back to hearsay of informal statements made by OpenAI/Anthropic staff. No formal announcements, nothing remotely of the "You can trust what I'm saying, because if I'm lying the SEC will have my head" sort.

Yet making such statements would be invaluable. If Anthropic can demonstrate profitability before OpenAI, they could poach most of the funding. There's no reason to keep it a company secret.

And API inference is only part of the total costs, not even bringing in training and ongoing fine-tuning. If they're not even profitable on inference, how could they hope to be profitable overall.

nielsole 65 days ago

I don't know about SEC rules but the Anthropic CEO said they have a 50%+ margin on API pricing.

SlinkyOnStairs 65 days ago

I'm going to be a dickhead for a moment here, apologies, there's no way to say this that isn't rude to you. This is still the same hearsay "In an interview, somewhere."

A bit of Google searching gets us a specific interview: https://www.dwarkesh.com/p/dario-amodei-2

> Let’s say half of your compute is for training and half of your compute is for inference. The inference has some gross margin that’s more than 50%.

But the context, the very previous sentence is:

> Think about it this way. Again, these are stylized facts. These numbers are not exact. I’m just trying to make a toy model here.

Here, Amodei is in effect using weasel words. He is not giving any actionable claims about Anthropic's margins, merely plucking an arbitrary number. Why 50%? Is 50% reasonable? Is 50% accurate to the company? Those are all conclusions the listener draws, not Amodei.

> I don't know about SEC rules

The main premise is that, as a CEO, there are some regulations you are beholden to. You're not allowed to announce you've made a trillion-dollar profit, sell all your stock, and then go "teehee just kidding". The SEC will prosecute you for securities fraud if you do that stuff.

This makes such weasel words as earlier suspicious. Because the exact statement Amodei gives is not prosecutable. He's not saying anything about the company, just doing a little "toy model".

The degree to which it is intentional that this hearsay travels and is extrapolated from "Well he picked 50% because it's a reasonable figure, and because he's CEO, a reasonable figure would have to be a figure akin to what his company can achieve" into "Anthropic has 50% margin", that's up for debate. Maybe it is intentional, maybe Amodei is exactly the same kind of shitweasel as Altman is. Probably he's just a dumbass who runs his mouth in interviews and for whatever reason cannot issue the true number in an authoritative statement to dismiss this misconception.

Hence my original comment: if the real number were better than the hearsay rumours of the number, Amodei would immediately issue a correction. It'd be great for the company. Hell, even if 50% were about the margin, that'd be great! To promote that from mere hearsay to "we're profitable, go invest all your money" would also be huge. Really, any kind of margin at all would put him ahead of OpenAI.

But he doesn't issue a correction. He doesn't affirm the statement. Perhaps he has other reasons for that, but a rather big reason could be that the margin number is in fact pretty bad.

Now, the observant reader will note I am also using a weasel word there. I do not know whether the number is good or bad; your takeaway should be "it could be bad", not "it is bad". Go pressure Amodei into giving us the real number.
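To see what the quoted toy model would imply if taken at face value, here is a minimal sketch. It assumes all revenue comes from inference and that compute is the only cost, which goes beyond what the interview states; the function name and parameters are hypothetical, and nothing here says anything about Anthropic's actual numbers.

```python
def toy_model_overall_margin(inference_share: float, inference_gross_margin: float) -> float:
    """Overall margin on total compute spend in the stylized toy model.

    inference_share: fraction of compute cost spent on inference (rest is training).
    inference_gross_margin: (revenue - inference cost) / revenue.
    Assumption: all revenue comes from inference and compute is the only cost.
    """
    if not 0 < inference_share <= 1:
        raise ValueError("inference_share must be in (0, 1]")
    if not 0 <= inference_gross_margin < 1:
        raise ValueError("inference_gross_margin must be in [0, 1)")
    total_compute_cost = 1.0  # normalize total compute spend to 1
    inference_cost = inference_share * total_compute_cost
    # Revenue needed for the stated gross margin on inference alone.
    revenue = inference_cost / (1 - inference_gross_margin)
    return (revenue - total_compute_cost) / revenue


# Half of compute on inference, exactly 50% gross margin on inference.
print(f"{toy_model_overall_margin(0.5, 0.5):.0%}")  # prints "0%"
```

Under these assumptions, a gross margin of exactly 50% only breaks even on compute, so the "more than 50%" in the quote is what would have to pay for training.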

dminik 65 days ago

Interesting. So the 50%+ number that's been floating about isn't even real. It's just an example.

SlinkyOnStairs 65 days ago

Self reply as I could've explained the SEC thing better:

Anti-fraud regulators like the SEC give an inherent trustworthiness and credibility to CEOs and other market participants. You can trust that they're not lying to you, because they would be sent to jail if they were.

Another example is general anti-fraud regulation: consider how one would trust North American or European steel suppliers more than Chinese steel suppliers.

It's not that the Chinese are "evil lying people" and Americans are "saints who never lie", it's that you can trust American, Canadian, and European courts to hold the liars accountable by regulations even if you're not in any of those regions. But the Chinese liars won't be held accountable by regulations.

Thus also the opposite: if someone opts out of this credibility granted to them by anti-fraud regulations, their words may not be quite so truthful.

stackskipton 65 days ago

SEC rules mean a CEO cannot lie or deliberately hide the cost of something.

The 50%+ margin statements have basically been "We are making 50% on delivering it." This does not include ANY of the costs of getting to this point: training, scraping, datacenters, people and so forth.

They are basically saying "Oh yea, the cost of GAS in the car is only X so charging Y per mile is great margin" while ignoring maintenance, cost of acquiring the car and so forth.

postflopclarity 65 days ago

but comparing your margin of charging to drive a mile to the price of gas makes a lot of sense? that is the only variable cost in the equation. training / scraping / people are all pretty much fixed costs.
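The disagreement here is between margin on variable cost and margin after fixed costs, which a small sketch can make concrete. All figures below are made up for the car analogy ($1.00 charged per mile, $0.50 of gas per mile, $30,000 of fixed costs), and the function names are hypothetical.

```python
def contribution_margin(price: float, variable_cost: float) -> float:
    """Margin per unit counting only variable cost (the 'gas' view)."""
    return (price - variable_cost) / price


def fully_loaded_margin(price: float, variable_cost: float,
                        fixed_costs: float, units: float) -> float:
    """Margin after also paying fixed costs (car, maintenance, and so on)."""
    revenue = price * units
    total_cost = variable_cost * units + fixed_costs
    return (revenue - total_cost) / revenue


def break_even_units(price: float, variable_cost: float, fixed_costs: float) -> float:
    """Units needed before per-unit contribution has repaid the fixed costs."""
    return fixed_costs / (price - variable_cost)


# Hypothetical numbers for illustration only.
print(contribution_margin(1.0, 0.5))                  # prints 0.5
print(fully_loaded_margin(1.0, 0.5, 30_000, 40_000))  # prints -0.25
print(break_even_units(1.0, 0.5, 30_000))             # prints 60000.0
```

Both commenters can be right at once: the per-mile margin is healthy, and the business still loses money until volume passes the break-even point.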

paradoxyl 64 days ago

That's a tad naive. CEOs can lie, have lied, and often do lie about everything:

Sam Bankman-Fried, Elizabeth Holmes, Kenneth Lay - and hundreds if not thousands more.

The SEC is a regulatory agency, not able to bring criminal charges. The above-named for the most part had to be prosecuted by the Department of Justice or sometimes state attorneys.

RealityVoid 65 days ago

> While $5000 is a lot, the people who rack up close to or just over a thousand "API equivalent cost" are pretty common.

I think if you're not Anthropic and you don't have access to the actual data, then you can't say for sure. A bunch of anecdotes from terminally-AI people on Twitter is not making a convincing case for me, IMO.

On the other hand, if similarly sized models cost much, much less than this, why in the world would Anthropic have much higher costs than that?

Also, counterpoint, maybe they want you to think that they have higher costs so you're more willing to actually pay for it?

PunchyHamster 65 days ago

The "worst case" is probably someone just using their $200 account limits. So yeah, real cost is probably close to that

kiratp 65 days ago

At the full current retail API price.

Business buyers are paying API prices, not subscription prices.

Disclosure: Work at Microsoft on AI

an0malous 65 days ago

Are your API prices profitable?

svnt 65 days ago

And receiving investment from their vendor in exchange? When this is done in established companies it is typically called a kickback and directed toward one person, but in this case the whole thing is so incestuous the kickback goes straight to the top.

twoodfin 65 days ago

Is it crazy to imagine Anthropic can leverage short term cash flow now to build the models and products that will let them resell $100B in AWS infra with nice margins tomorrow?

If Amazon believes that story they’d be crazy not to invest.

svnt 65 days ago

Yes I understand why the agreement exists, but that does not remove the circularity.

sandworm101 65 days ago

But that per-token cost is a total joke. All these companies are fighting to build market share in some future dominated by one or two AI ecosystems. It is musical chairs until someone creates the one ring to rule them all. So they are charging token amounts just to claim revenue as they burn through investor dollars.

In short: per-token charges currently cover maybe 1% of the total costs in this field. To pay ongoing costs, and pay back investors, everyone will need to pay 100x or 1000x the current rates, likely for decades.

deaux 65 days ago

> In short: per-token charges currently cover maybe 1% of the total costs in this field

There are plenty of seemingly informed people saying the exact opposite, so that's a lot of confidence you're talking with. I have a hard time believing it when we know what open-weights models cost to run. And sure, there are training costs, but again many say inference costs are already above training costs.

red_hare 65 days ago

If that's true, it's very unsustainable.

Gemma-4 26B-A4B + M5 MacBook Pro + OpenCode isn't Claude Code _yet_, but it's good enough that if I were forced to use it I would be fine.

jcgrillo 65 days ago

Yes, it's amazing how quickly so many tech companies have hitched their tooling to these big AI vendors seemingly without any thought towards whether they'll still exist a year or three or five from now. Insane behavior. To the (debatable!) extent that AI coding tools are useful at all, wouldn't it be a hell of a lot smarter to self-host? At least that way you have some control over QoS, and a stable, predictable result... Or maybe nobody cares about that kind of thing anymore? What happened to basic business math in this industry?

twoodfin 65 days ago

The basic business math is (to start) software companies realizing that spending $10k, $20k, $50k (more?) per year, per developer for current models at current token rates might not be particularly insane, given the value return.

Models are likely going to keep getting better, and as costs go down, demand is likely to rise faster.

jcgrillo 65 days ago

> as costs go down

Huh? Why would that happen? Indications are that costs will likely go up, especially if currently vendors are selling tokens at a loss.

twoodfin 65 days ago

The main operational expense of a million LLM tokens is pennies of electricity.

Even if you generously depreciate the GPU and other hardware, it’s hard to believe inference at scale in April 2026 isn’t highly profitable.

Danox 64 days ago

It’s getting better on both the hardware and the software fronts. The barbarians are banging at the gates.

matrik 65 days ago

I'm not sure this information is grounded, but I remember reading somewhere that inference is indeed profitable. My personal experience is similar. Running 2x3090s draws 500-600W, and you can run amazing models locally with a similar setup.
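A rough way to turn a wattage figure like this into an electricity cost per million tokens is sketched below. Only the 500-600W draw comes from the comment; the throughput (tokens per second) and the electricity price are placeholder assumptions, and hardware cost is deliberately left out, so treat the result as a lower bound on one cost component.

```python
def electricity_cost_per_million_tokens(power_watts: float,
                                        tokens_per_second: float,
                                        usd_per_kwh: float) -> float:
    """Electricity cost in USD to generate one million tokens.

    Ignores hardware depreciation, cooling, and idle time.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds = 1_000_000 / tokens_per_second
    kwh = (power_watts / 1000) * (seconds / 3600)
    return kwh * usd_per_kwh


# 550 W is the midpoint of the quoted 500-600 W range.
# 50 tokens/s and $0.15/kWh are assumptions, not measurements.
cost = electricity_cost_per_million_tokens(550, 50, 0.15)
print(f"${cost:.2f} per million tokens")
```

The result scales inversely with throughput, so batched serving on datacenter hardware would come out very differently from a single-user local setup.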

sandworm101 65 days ago

Running the model isn't the cost. Watts per token is the math they show investors. You also have to be constantly training new models, which currently needs more compute than servicing the customer base. You have to build datacenters, and possibly power plants to feed them. You have to carry debts. And you will need to buy new GPUs/RAM every few years to remain competitive. The total business is vastly different from simple GPU math.

paulddraper 65 days ago

You are in violent agreement.

> inference is indeed profitable

twoodfin 65 days ago

From the perspective of a deal like this, “total costs in the field” matter less than incremental cost per token served.

The unit economics for today’s frontier models should be great, and this suggests Anthropic believes they’ll get better.

postalrat 65 days ago

In a decade the cost of compute will be a tiny fraction of what it costs now. Specialized hardware will exist that will be cheap and efficient.

bitmasher9 65 days ago

The difference in the cost of compute between 2026 and 2036 won’t be nearly as large as the difference in the cost of compute between 2016 and 2026. Even in 2016 the slowdown in improvements was noticeable.

We might see a one-time bump in inference when we move off GPUs onto more limited and efficient dedicated hardware, but the sustained fast pace of improvements is far behind us.

postalrat 65 days ago

I'm predicting that, now that there is a clear use case for this tech, work will accelerate (and already has) on specialized hardware, software, models, etc. that will run much more efficiently in 10 years. So the real token costs will be a fraction of what they are now.

mchusma 64 days ago

You can run models on FPGAs and get massive cost, speed, and throughput gains (like 10x). The reason people don’t do it is that other (algorithmic) improvements mean nobody really thinks locking into a model makes sense…yet. Would I want to use GPT-4o for anything today at 1/10th the price? That would be $0.40 per input, $1.50 per output. Gemma-4 31b is much more capable and cheaper. So an FPGA version of the model is just not worth it today.

But if progress begins to slow down, then the economics work. Maybe Gemma 4 is a good example. It feels really generally useful. Getting it at 1/10th the cost feels like it could be competitive in 2 years.

sandworm101 64 days ago

The FPGA would be for prototyping. The real progress comes from ASICs ... exactly as we saw with bitcoin mining. This GPU-based approach will eventually give way to bespoke circuits once everyone picks a favorite model.

oceansky 65 days ago

Compute power improvement between 2016 and 2026 wasn't that impressive either. Moore's law is essentially dying.

jamesfinlayson 65 days ago

Yeah I went shopping for a new computer a couple of years ago (to replace a 7 year old computer) and... the specs for what was for sale were the same as what I bought 7 years prior, and the price wasn't much lower.

bitmasher9 65 days ago

I would much rather buy a 2026 computer than a 2019 computer. Two generations of Nvidia GPUs, Apple M-series chips, the X3D AMD chips, and PCIe 5 SSDs are all major upgrades.

It’s just that the pace of new stuff is slowing down, and many people are operating under the assumption that this wave will ride on forever.
