by 0xbadcafebee 34 days ago
AMD, Alibaba should be on there too. AMD is making good money on AI, with R&D at less than half the AI revenue. Whereas Alibaba's weird financials show it's kinda-sorta-profitable?

I just wanna know how the OpenAI/Anthropic shell game works long-term. So both companies made equity deals with infrastructure providers; OpenAI on Azure, Anthropic on AWS, GCloud, and Colossus. They get a loan of compute credits and then pay for the compute with the credits. So the PaaS are effectively giving them free compute, then book it as revenue; and the AI provider lets them do inference and books that as revenue. So, it's like both types of company have a buffet, and let each other eat there for free. But somebody has to actually buy the pasta salad, with real dollars. Afaict, those real dollars are.... the cash reserves of the PaaS.
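A rough sketch of that buffet in code, in case it helps. The `Company` class, `swap_round`, and all the numbers are made up for illustration, and the accounting is simplified to what the comment describes (credits get booked as revenue, only hardware costs move real cash). It is not how any of these companies actually report.

```python
from dataclasses import dataclass


@dataclass
class Company:
    name: str
    cash: float                  # real dollars on hand (hypothetical units)
    booked_revenue: float = 0.0  # what shows up as revenue


def swap_round(cloud, lab, compute_credits, inference_credits, hardware_cost):
    """One round of the credit swap; only hardware_cost moves real cash."""
    # The lab "pays" for compute with credits the cloud granted it.
    cloud.booked_revenue += compute_credits
    # The cloud uses the lab's models in return, and the lab books that.
    lab.booked_revenue += inference_credits
    # Somebody still buys the pasta salad: the cloud pays to run the hardware.
    cloud.cash -= hardware_cost


cloud = Company("cloud", cash=100.0)
lab = Company("lab", cash=0.0)
swap_round(cloud, lab, compute_credits=10.0, inference_credits=5.0, hardware_cost=4.0)
print(cloud)
print(lab)
```

Both sides show revenue after the round, but the only balance that changed is the cloud's cash.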

How long are they going to eat into that cash? Microsoft and AWS don't really have their own models, whereas Google and SpaceX do. And while Google has tons of cash, SpaceX is perpetually looking for cash. So the only player here that can actually afford to keep doing this, or leave the game entirely, is Google.

3 comments

With this line of thinking, nobody would have ever built refineries, or fabs, or clouds.

The frontier labs have fantastic margin on inference. You do not understand how fantastic. And they have license to change inputs at will based on profitability.

They are not only innovating on models and tooling, they are innovating on COGS (I wrote this btw, and I’m not going to stop writing this way because Claude discovered it’s brilliant).

Speaking of models, the cost of training is not scaling nearly as fast as demand for inference. Training used to be the biggest cost by far, now it’s not.

So margin is increasing, and guess what else is happening? Customers are finding value. And the customers that are finding value are also the ones who happen to have huge enterprise budgets.

And while this is happening, so is implicit collusion (and lock in, and hype, and all that). And so prices are going up.

They’re going to be just fine man, there is no inference bubble.

They can modulate supply. It’s all going to be fine. You should invest.

> The frontier labs have fantastic margin on inference. You do not understand how fantastic. And they have license to change inputs at will based on profitability.

This. The gross margin on inference is at least 95% if not higher - several open weight models on my tiny consumer DGX Spark easily replace the 15 dollars a day I was paying in tokens for Claw usage with a dollar a day electricity. You add data centre overhead and depreciation, the theoretical net margin will trend lower but depreciation is always far more aggressive than actual product degradation. The old NVIDIA GPU on a 9 year old second hand gaming PC I bought still serves up a small Gemma 4 variant quite reasonably.
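Back-of-envelope version of that arithmetic, as a sketch only. It assumes the 15 dollars a day is the provider's revenue and the 1 dollar a day of electricity is the only cost, which is a big simplification. The hardware price and lifetime in the second call are hypothetical placeholders, not real figures for any machine.

```python
def gross_margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost) / revenue


def margin_with_hardware(revenue_per_day: float,
                         electricity_per_day: float,
                         hardware_cost: float,
                         lifetime_days: float) -> float:
    """Same margin, with hardware spread evenly over its assumed lifetime."""
    amortized = hardware_cost / lifetime_days
    return gross_margin(revenue_per_day, electricity_per_day + amortized)


# Figures from the comment: 15 dollars/day of tokens vs 1 dollar/day of power.
print(f"{gross_margin(15.0, 1.0):.1%}")  # 93.3%

# Hypothetical: a 4000 dollar box written off over three years.
print(f"{margin_with_hardware(15.0, 1.0, 4000.0, 3 * 365):.1%}")
```

On those two numbers alone the margin comes out near 93%, a bit under 95%, and it drops further once any hardware cost is amortized in.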

To say nothing of the fact that they can just add "figure out how to change the answer to this question to benefit X" at the top of their system query.

It is baffling that any government lets either themselves or their local companies use these tools. Utterly baffling. The potential for total security compromise through these models is ... essentially 100%.

But ... it's slightly cheaper.

> The frontier labs have fantastic margin on inference.

Source?

The OpenAI filing will be very interesting indeed.

("trust me bro" statements from sama et al do not count, since I don't trust them)

Edit:

The best argument I have seen looks at the price of inference from smaller companies running open models, and assumes they are profitable-ish. Their prices are lower than those of OpenAI's and Anthropic's best models, so maybe they do make money on inference (ignoring all other costs)

This doesn't sound like Claude at all to me. Which is a good thing, but I wonder where that came from.

> Customers are finding value.

Where can I find confirmation of that in public sources?

Household names like Uber, Amazon, Blue Bottle Coffee and FedEx used the same playbook of burning investors’ cash for years and look where they are now.

Everyone’s long term plan is hoping that they build out and survive long enough that, in the end, the market accepts them.

Even your local still-unprofitable restaurant is burning their grandparents’ inheritance money hoping that it works out.

But on the other hand, that’s what Theranos, WeWork, and Pets.com tried too.

The old model works at tens of millions, not hundreds of billions. All the private capital is pretty tapped out, and banks aren't loaning (thank god). So they don't have investor money to burn (and when they do, they immediately burn it on new datacenters, which usually take years to build and aren't a certainty). That's why they made equity deals with hardware companies... it was the only way they could "afford" hardware. But someone has to pay for that hardware. And the person paying is... the hardware companies. They have a lot of cash, but not hundreds of billions of cash. Hence why Oracle pulled out and Nvidia scaled back its investment. Claude only doesn't suck right now because SpaceX literally loaned them a datacenter. So I'm saying... these companies will run out of cash, if they can't get paid back, sooner rather than later.

When OpenAI goes public it will initially get a tsunami of cash, but it'll also be open to new risk due to the different operating model and transparency. Anthropic might not make it to an S-1 (this year). Even if they got a $30B infusion of cash each, based on their current spending projections, it doesn't cover half of what they need just to break even. In the meantime the PaaS's are holding the bag (and shedding cash).

So where's it going to end? To me, all of this (combined with inflation, degrading of reserve currency, war in the Middle East) is spookily similar to the railroad panic of 1873. Over-investment in new technologies, leveraging too much from the largest financial institutions, resulting in prolonged economic crisis. Our only saving grace now is the laws ensuring banks have to cover their end; if your money's FDIC/SIPC insured you're safe. But all the businesses and individuals who aren't safe are gonna take a bath, which'll have systemic ripples. Afaict, Google is the only player who can survive all that and come out with profitable AI. (But I'm sure I've missed something because it seems too obvious)

They just have to incrementally raise the price of inference tokens and limit subscriptions to curtail existing demand (with much of it likely moving to slower and cheaper local models). Which, come to think of it, is exactly what seems to be happening right now.

> So they don't have investor money to burn (and when they do, they immediately burn it on new datacenters, which usually take years to build and aren't a certainty).

If AI models can get smarter and more practically useful via some combination of increased scale and more fine-tuned post-training on specific workloads (which is compute-heavy, even more than the usual kind of pre-training) these new datacenters are a fantastic investment.

They would have to raise the price of inference (and not just inference but actual contracts) 4x, within a year or so, for PaaS not to run out of cash. But nobody wants to pay that much money, and there is a swath of companies all over the world who will do it for much, much less. People will stick with them out of irrational fears (fear of the unknown, fear of missing out) until it gets too painful. And when people start leaving, what then? Anthropic's hope is that they can make a moat so high everyone is trapped. But it's actually not hard to replace Claude with a competitor.
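To put a number on what 4x in a year means if it were done as incremental monthly hikes, here is a quick sketch. It assumes equal compounding steps, which is my assumption and not anything a provider has announced; `monthly_increase` is just an illustrative helper.

```python
def monthly_increase(total_multiple: float, months: int) -> float:
    """Constant per-month price increase that compounds to total_multiple."""
    if total_multiple <= 0 or months <= 0:
        raise ValueError("total_multiple and months must be positive")
    return total_multiple ** (1.0 / months) - 1.0


rate = monthly_increase(4.0, 12)
print(f"{rate:.1%} per month")  # 12.2% per month
```

So a 4x raise spread over twelve months is roughly a 12% price hike every single month.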

They can get more efficient, but inference efficiency doesn't map linearly to cost efficiency. First, because software is a gas: if you give people more compute (for the same price), they immediately use it all up. Second, if you spend $50BN, you still have to make $50BN to break even. They could make inference cost $0.00000001, but that isn't going to cover their costs. That's what's driving their cost right now - they're trying to collect enough cash from people at the table to pay the bill, without the price scaring everyone out of the restaurant.

So they can't raise the price without scaring people off, and they can't lower the price and pay the bill.
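The break-even point above can be sketched like this. The $50BN is the figure from this comment; the per-unit price and marginal cost are hypothetical placeholders (think dollars per million tokens), and `break_even_units` is just the textbook fixed cost over contribution margin, not anyone's real model.

```python
def break_even_units(fixed_cost: float, price: float, marginal_cost: float) -> float:
    """Units that must be sold so per-unit margin covers the fixed spend."""
    contribution = price - marginal_cost
    if contribution <= 0:
        # Selling at or below marginal cost never pays the bill.
        return float("inf")
    return fixed_cost / contribution


FIXED_SPEND = 50e9  # the $50BN from the comment

# Hypothetical prices against a hypothetical marginal cost of 5 per unit.
for price in (20.0, 10.0, 6.0, 5.0):
    units = break_even_units(FIXED_SPEND, price, marginal_cost=5.0)
    print(f"price {price:5.2f}: {units:.3g} units to break even")
```

Making inference cheaper to serve only helps if the gap between price and marginal cost, times volume, still adds up to the fixed spend.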

> But nobody wants to pay that much money

People will want to pay that much once they're enabled to make the best and most efficient use of SOTA proprietary models for tasks that actually benefit from them, while using cheap third-party inference everywhere else. That's very different from what the leading AI firms are proposing right now and it does require some careful balance to get there from here, but it's absolutely doable.

The subscriptions will drop if they increase the price, and if they don't increase the price, they will run out of cash to operate.

It's simply not feasible.

> The old model works at tens of millions, not hundreds of billions.

The first computers were the size of buildings, now look where we are. I think the same thing will happen to AI models. We will have a reasoning core installed on our phone connected to Google's Knowledge Graph or Ontology project via API. These companies just need to survive long enough to make themselves irreplaceable in the new ecosystem.

There's a lot of revenue and outside investment coming in but the haters pretend it's all circular financing.

Outside investment is not revenue, it's looking more like a pyramid scheme. It's yet more people putting their money in the scheme expecting a return.

Either the actual revenue from paying customers ramps up or the bubble will pop at some point.

I expect the paying customers will actually be companies buying ads, not people buying AI subscriptions