Hacker News new | ask | show | jobs
by jsnell 381 days ago
> What? If someone builds something on top of your API, they're tying themselves to it, and you can slowly raise prices while keeping each increase well below the switching cost.

That's not really how the LLM API market works. The interfaces themselves are pretty trivial and have no real lock-in value, and there's plenty of adapters around anyway. (Often first-party, e.g. both Anthropic and Google provide OpenAI-compatible APIs). There might initially have been theories that you could not easily move to a different model, creating lock-in, but in practice LLMs are so flexible and forgiving about the inputs that a different model can be just dropped in an work without any model-specific changes.

> 80% margin on GPU cost? What about after paying for power, facilities

The market price of renting that compute on the market. That's fully loaded, so would include a) pro-rated recouping the capital cost of the GPUs, b) the power, cooling, datacenter buildings, etc, c) the hosting provider's margin.

> admin, support, marketing, etc.? Are GPUs really more than half the cost of this business?

Pretty likely! In OpenAI's leaked 2024 financial plan the compute costs were like 75% of their projected costs.

2 comments

Yep, agreed, it's quite different with LLMs since the endpoints are very straightforward.

It's kind of unfair how little lock in factor there is at the base layer. Those doing the hardest, most innovative work have no way to differentiate themselves in the medium or long run. It's just unlikely that one person or company will keep making all the innovations. There is an endless stream of newcomers who will monetize on top of someone else's work. If anyone obtains a lock-in, it will not be through innovation. But TBH, it kind of mirrors the reality of the tech industry as a whole. Those who have been doing the innovation tend to have very little lock in. They are often left on the streets. In the end, what counts financially is the ability to capture eyeballs and credit cards. Innovation only provides a temporary spike.

With AI, even for a highly complex system, you'll end up using maybe 3 API endpoints; one for embeddings, one for inference and one for chat... You barely need to configure any params. The interface to LLMs is actually just human language; you can easily switch providers and take all your existing prompts, all your existing infra with you... Just change the three endpoint names, API key and a couple of params and you're done. Will take a couple of hours at most to switch providers.

> The market price of renting that compute on the market. That's fully loaded,

Sorry, I totally misread your post. Charging 80% on top of server rental isn't so bad, especially since I'm guessing there are significant markups on GPU rental given all the AI demand.