Hacker News
by surgical_fire 3 days ago
> Revenue is higher than cost of revenue and revenue is growing faster than cost of revenue.

That's not necessarily true.

It depends on two things that are impossible to derive from the leaked numbers:

1 - How much of their compute cost is subsidized.

2 - How much of that R&D chunk can actually be reduced for the company to continue working. What goes into "R&D"? Is that only things such as training for new models? This is impossible to determine from those numbers alone.

If those numbers are true, they would not be "close to profitability". They would be profitable, period.

This does not explain why they need to raise money like crazy. It would be possible to train new models at a more sustainable pace, without jacking up prices at all, funded only by the difference between revenue and cost of revenue.

They also wouldn't need to be on a mad race to IPO and dump this into the public market.

To be frank, the more I look at those numbers, the more I think this is completely unsustainable.

2 comments

All model training is in R&D

> How much of that R&D chunk can actually be reduced for the company to continue working.

Worth noting that almost all engineering (including software engineering) is always included in R&D.

Development is important! Most software (and indeed engineering!) companies couldn't continue if they stopped R&D.

Ford couldn't continue as a company without R&D.

> All model training is in R&D

Yes. But is all R&D model training? That was the actual question.

Everybody here who presumes "inference is profitable" is just naively subtracting "cost of revenue" from "revenue" and running with that.
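To make that concrete, here is a rough sketch with made-up figures. None of these numbers come from the leak. `subsidy_rate` (the fraction of the true compute bill a provider absorbs) and `rnd_required_share` (the fraction of R&D the company cannot cut and keep operating) are assumptions I am introducing purely for illustration. It shows how a healthy "revenue minus cost of revenue" can sit alongside an operating loss.

```python
def gross_profit(revenue: float, cost_of_revenue: float) -> float:
    """The naive figure: revenue minus reported cost of revenue."""
    return revenue - cost_of_revenue


def adjusted_operating_profit(
    revenue: float,
    cost_of_revenue: float,
    rnd: float,
    subsidy_rate: float,        # hypothetical: share of true compute cost paid by providers
    rnd_required_share: float,  # hypothetical: share of R&D that cannot be cut
) -> float:
    """Profit after undoing an assumed compute subsidy and keeping required R&D."""
    if not 0.0 <= subsidy_rate < 1.0:
        raise ValueError("subsidy_rate must be in [0, 1)")
    if not 0.0 <= rnd_required_share <= 1.0:
        raise ValueError("rnd_required_share must be in [0, 1]")
    # Reported cost is the true cost after the subsidy: reported = true * (1 - rate).
    true_cost = cost_of_revenue / (1.0 - subsidy_rate)
    required_rnd = rnd * rnd_required_share
    return revenue - true_cost - required_rnd


# Invented figures, arbitrary units.
naive = gross_profit(100.0, 60.0)
adjusted = adjusted_operating_profit(100.0, 60.0, 80.0, 0.25, 0.5)
print(naive)     # 40.0
print(adjusted)  # -20.0
```

With these invented inputs the naive margin looks great and the adjusted figure is negative. Both unknowns are exactly the two things the leaked numbers do not tell us.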

> Is that only things such as training for new models?

It feels like cost of revenue should account for training, but I'm not an accountant so who knows?

I'm not even entirely sure of this.

Obviously you need new training rounds over time. Knowledge is not static. New things that are created would need to be part of later models.

How do you account for that? Do you account for some sort of model depreciation?
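One hypothetical way to do it, as a sketch: treat a training run like a capital asset and amortize it straight-line over the months the model actually serves traffic. I am not claiming this is how OpenAI or any accounting standard treats training. The function names and all figures below are invented for illustration.

```python
def monthly_amortization(training_cost: float, useful_life_months: int) -> float:
    """Straight-line amortization: spread the training cost evenly over the model's life."""
    if useful_life_months <= 0:
        raise ValueError("useful_life_months must be positive")
    return training_cost / useful_life_months


def margin_with_amortization(
    monthly_revenue: float,
    monthly_inference_cost: float,
    training_cost: float,
    useful_life_months: int,
) -> float:
    """Margin once a share of the training cost is charged against each month's revenue."""
    if monthly_revenue <= 0:
        raise ValueError("monthly_revenue must be positive")
    amortization = monthly_amortization(training_cost, useful_life_months)
    return (monthly_revenue - monthly_inference_cost - amortization) / monthly_revenue


# Invented figures: revenue 10/month, inference cost 6/month,
# a training run costing 36 that stays useful for 12 months.
print(monthly_amortization(36.0, 12))                 # 3.0
print(margin_with_amortization(10.0, 6.0, 36.0, 12))  # about 0.1, versus 0.4 without amortization
```

The shorter the useful life of a model, the bigger the monthly charge, which is why the "knowledge is not static" point matters so much for the margin.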

A lot of things are very nebulous in those leaked numbers.

How is "cost of revenue" calculated? If Microsoft, Oracle, and the like provide compute to OpenAI at a loss, that is clearly not sustainable, and it is a way to pretend the numbers are better than they actually are ahead of an IPO. This might even be a source of pressure for an IPO: as losses accumulate in the ecosystem, they need to dump them onto the public market so the numbers can be brought back to reality.

Also, I am not sure whether they own any compute (for example, the Stargate data centers). If they do, do they lump the cost of building those data centers into "R&D"? That would be one hell of a way to pretend that inference is cheaper. Good luck running inference without those data centers.

The more I look at this, the more it looks like "OpenAI is profitable when you pretend they have no costs".