by vfalbor 13 days ago
This is a very interesting comment. Companies like OpenAI (ChatGPT) sell hardware hidden in tokens, and the token is different for each company depending on the tokenizer. The concern is this: when you run Opus 4.7, Sonnet, or GPT 5X on an Nvidia H100 or H200 GPU, what will happen to the cost when a company other than Nvidia, perhaps a Chinese one, enters the market and its hardware starts running these models? The point here is that as long as Nvidia is the provider and limits access to the machines, and the number of data centers is also limited, these companies can be worth whatever they want. But the moment this starts to expand, the value will surely decline, because what you're selling isn't the model itself, which is ultimately just a 1 TB file that you have replicated across machines. What you're selling is access to a software program on a specialized machine. As long as you control the resource, which in this case is that machine, you'll have value. The moment other machine manufacturers enter the market, your value will decrease.
2 comments

If you check OpenRouter, there are tons of providers selling API access to open source LLMs at a fraction of the cost of SOTA models (codex/claude). Which model you're serving and what kind of platform you serve it on are big factors.

I'm no expert, but I think eventually we'll have even more specialized ASIC-like machines with models burned into them, and that will absorb a chunk of the market, similar to what happened with crypto mining but to a lesser degree since the work isn't as static.

NN-specific ASICs won't buy you much more FLOPs per watt than GPUs/TPUs will. These chips are already extremely good at NN computation. Sure, you could remove GP shader support and free up 5% of your die for a few more cores (which btw is what TPUs pretty much are), but that's about it.

Either way, you'll still be starving for data.

The best work in this area is memory-integrated Big-Ass-Die or Big-Ass-Chiplet solutions like Cerebras which park SRAM right next to your cores, not ASICs.
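
A rough roofline-style sketch of the "starving for data" point, in Python. The numbers are made-up round figures, not specs for any real chip, and `attainable_flops` is just a hypothetical helper name. The idea: usable throughput is capped by whichever is smaller, peak compute or memory bandwidth times the FLOPs you do per byte fetched.

```python
def attainable_flops(peak_flops: float, mem_bandwidth: float, intensity: float) -> float:
    """Roofline bound: min(peak compute, bandwidth * FLOPs per byte moved)."""
    return min(peak_flops, mem_bandwidth * intensity)


# Illustrative assumptions only (not measurements of any real hardware):
PEAK_FLOPS = 1e15      # FLOP/s the cores could do if always fed
BANDWIDTH = 3e12       # bytes/s from off-chip memory
INTENSITY = 2.0        # FLOPs per byte, roughly low-batch decoding

baseline = attainable_flops(PEAK_FLOPS, BANDWIDTH, INTENSITY)

# "A few more cores": double the peak compute, keep the same memory.
more_cores = attainable_flops(2 * PEAK_FLOPS, BANDWIDTH, INTENSITY)

# "SRAM next to the cores": keep the cores, raise bandwidth tenfold.
closer_memory = attainable_flops(PEAK_FLOPS, 10 * BANDWIDTH, INTENSITY)

print(f"baseline:      {baseline:.2e} FLOP/s")
print(f"more cores:    {more_cores:.2e} FLOP/s")
print(f"closer memory: {closer_memory:.2e} FLOP/s")
```

Under these assumptions, doubling the cores changes nothing because the workload sits on the bandwidth side of the roofline, while moving memory closer scales throughput directly. At high batch sizes the intensity goes up and the picture shifts toward compute.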

>but I think eventually we'll have even more specialized ASIC like machines with models burned into them

This has already happened and is very interesting.

https://www.anuragk.com/blog/posts/Taalas.html

If that were the case, it would be reasonable to expect that companies like OpenAI or Anthropic, which are heavily indebted, would lose part of their business model, not because their models are bad, but because others will be cheaper and not as bad.
I think he means they will be commercially relevant and most AI compute won't be on GPUs.
Or the AI labs just take an increased margin? The model is ultimately what people want access to and are paying for