Hacker News
by simianwords 14 days ago
> until inference becomes much cheaper these companies cannot be profitable. Some mega-players will pay the API token price, but most will not.

This is often repeated but mostly comes from ignorance. You have *zero* reason to believe inference is costly other than just vibes. If you go by data and intuitions, the margins are high.

This kind of thinking really reinforces my belief that people have no idea and are using this whole [AI is not profitable and too costly] thing as a cathartic way to deal with immense progress.

We know that inference cost is very significant, as he shows for example in this piece.

https://www.wheresyoured.at/oai_docs/

However, it needs to be said that he received those numbers. I personally have quite a few issues with him, but there's no reason to doubt his journalistic integrity. Because of that, I believe he reports truthfully on data he receives from informants.

Additionally, none of the frontier labs actually talks publicly about inference costs in anything but broad, "let's just forget that"-like takes. Which does not exactly inspire confidence.

I'm eagerly awaiting Anthropic's public disclosure of their financial details. That should be rather interesting in any case and finally put the inference discussion to rest.

No reason to doubt his journalistic integrity? He's not a journalist for starters. He's a PR flack who does PR for AI startups on the side while blogging on Substack. There is every reason to doubt his journalistic integrity.

The PR thing was always openly communicated by him and is not some secret or gotcha. It's essentially "fleecing the boosters", which I fully approve of and do similarly myself.

I'll gladly tell my customers all the most glorious stuff about AI and big tech while spending a significant chunk of the money they pay me on supporting AI/tech counterculture, such as Doctorow, Zitron and quite a few other writers, journalists and activists.

It's the old "you live in a society" counter-point against anti-capitalist activism. Needing to make ends meet does not imply that your points or principles are meaningless, it just implies that you have no interest in being homeless and that way losing your chance to actually change things.

So that's fine to me. But: I stated it for a reason, because I know others don't agree. I, personally, consider him trustworthy. You do not, and that's fine. I suspect we both await Anthropic's S-1, which will be able to settle a big chunk of the debate.

If he is right, the numbers will show it.

Why do you consider him trustworthy when sooo many of his predictions are false?

https://news.ycombinator.com/item?id=48447549

He was right about the cost changes, which he predicted quite some time ago. People shouted at him that he was making it all up - yet it was correct.

He was also right about AI-video and Sora in particular being a fundamentally flawed idea.

He was also right about the dangers and problems with the general inaccuracy of LLMs and people relying on it.

Also about the expected triggering of ROI-checking in companies, such as Uber is doing now. His prediction is, ROI is negative. And I'm awaiting the society's consensus on that.

The general direction seems correct to me. He's not a technical guy and does not have the knowledge to critique models on a factual basis. I do wish he'd just focus on the stuff he _does_ know about, which is the financial side of things.

He is a much needed counterweight to the unhealthy hype going around, imho.

> He was also right about AI-video and Sora in particular being a fundamentally flawed idea.

He specifically predicted that AI video had plateaued in 2024, which is egregiously wrong.

> He was also right about the dangers and problems with the general inaccuracy of LLMs and people relying on it.

He specifically predicted that accuracy wouldn't increase, but accuracy has increased significantly over time, to the point where you can't get the reasoning models to say anything inaccurate.

> Also about the expected triggering of ROI-checking in companies, such as Uber is doing now. His prediction is, ROI is negative. And I'm awaiting the society's consensus on that.

The whole Uber skepticism is a good point because all of those people were wrong and Uber is profitable now.

You didn't address my other criticisms - he claimed that revenue would drop in 2024 and it skyrocketed. He claimed that users weren't interested in ChatGPT but now it has a billion users (6x jump).

> You have *zero* reason to believe inference is costly other than just vibes. If you go by data and intuitions - the margins are high.

1. What data?

2. Intuitions = vibes.

Vibes are bad when used against you, but good when used in your favor.

Come on :-)))

I have the data and the intuition here: https://simianwords.bearblog.dev/conclusive-proofs-that-llm-...

But if you don't believe me, let's have a bet based on what the IPO filings show?

Remember that OpenAI is subsidized from here to the highway.

A better way to model this, since you seem interested, is the following:

How much would it cost you to start such a service for, say, 10k users?

Any other internet service started out at virtually zero cost, $0. Google, Facebook, YouTube, Wikipedia, you name it. They all went into the dumpster to pick up a thrown-away desktop computer, and with that they could serve upwards of 100k, if not a million, users.

How much would it cost you to serve, say, 10k simultaneous users with a SOTA model? And if you wanted to go cash positive after a year, how much would each user have to pay?
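For what it's worth, the question above is plain arithmetic once you pick the inputs. Here is a rough sketch of that calculation. Every number in it is a made-up placeholder, not a measurement: the per-user token rate, the aggregate throughput of one accelerator, the hourly hardware price and the number of paying subscribers are all assumptions you would have to replace with real figures. It also ignores training, staff, networking and idle capacity, so treat it as a way to structure the argument, not as an answer.

```python
import math


def breakeven(
    concurrent_users: int,
    tokens_per_user_per_sec: float,
    gpu_tokens_per_sec: float,
    gpu_cost_per_hour: float,
    paying_users: int,
    hours_per_month: float = 730.0,
) -> dict:
    """Back-of-the-envelope serving cost. All inputs are assumptions."""
    # Aggregate throughput needed when everyone generates at once.
    required_tps = concurrent_users * tokens_per_user_per_sec
    # Whole accelerators needed to sustain that throughput.
    gpus = math.ceil(required_tps / gpu_tokens_per_sec)
    # Hardware bill for running them around the clock.
    monthly_cost = gpus * gpu_cost_per_hour * hours_per_month
    # Cost per million tokens if the hardware were fully utilized.
    mtok_per_gpu_hour = gpu_tokens_per_sec * 3600 / 1e6
    cost_per_mtok = gpu_cost_per_hour / mtok_per_gpu_hour
    return {
        "gpus": gpus,
        "monthly_cost": monthly_cost,
        "price_per_paying_user": monthly_cost / paying_users,
        "cost_per_mtok_full_util": cost_per_mtok,
    }


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
result = breakeven(
    concurrent_users=10_000,
    tokens_per_user_per_sec=20.0,   # assumed generation speed per user
    gpu_tokens_per_sec=2_000.0,     # assumed batched throughput per accelerator
    gpu_cost_per_hour=2.0,          # assumed all-in hourly price
    paying_users=100_000,           # assumed subscribers sharing the peak capacity
)
for key, value in result.items():
    print(f"{key}: {value}")
```

The interesting part is how sensitive the result is to the throughput-per-accelerator and utilization assumptions, which is exactly what the two sides of this thread disagree about.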

> How much would it cost you to serve, say, 10k simultaneous users with a SOTA model? And if you wanted to go cash positive after a year, how much would each user have to pay?

My post has this same argument - we have multiple third party companies running open weight models. They are obviously not subsidised. And people are willing to pay for it. And these models are as good as the SOTA models from last year. So this kinda proves my point that SOTA is sustainable.

I didn't find the answer there, that's why I asked.

What hardware is needed, how much of it, cooling, and what does it all cost you?

Or are you saying I can take my old desktop and serve DeepSeek V3.2 to 10k users simultaneously and it would cost me about $1 per megatoken?

I'm simply saying this: there are third-party hosters of open-weight models like DeepSeek and they have been doing this for a while.

Obviously they are not subsidised, do you disagree? If you agree, they have a way to price it at a point that people wanna pay for it and also they aren't losing money.

So there's nothing inherent about inference that makes it too costly or whatever.