| It's addressed poorly. > First, there's not that much motive to gain API market share with unsustainably cheap prices. Any gains would be temporary, since there's no long-term lock-in, What? If someone builds something on top of your API, they're tying themselves to it, and you can slowly raise prices while keeping each increase well below the switching cost. > Second, some of those models have been released with open weights and API access is also available from third-party providers who would have no motive to subsidize inference. See above. Just like any other Cloud service, you tie clients to your API. > Third, Deepseek released actual numbers on their inference efficiency in February. Those numbers suggest that their normal R1 API pricing has about 80% margins when considering the GPU costs, though not any other serving costs. 80% margin on GPU cost? What about after paying for power, facilities, admin, support, marketing, etc.? Are GPUs really more than half the cost of this business? (EDIT: This is 80% margin on top of GPU rental, i.e. total compute cost. My bad.) Guessing about costs based on prices makes no sense at this point. OpenAI's $20/mo and $200/mo tiers have nothing to do with the cost of those services -- they're just testing price points. |
That's not really how the LLM API market works. The interfaces themselves are pretty trivial and have no real lock-in value, and there's plenty of adapters around anyway. (Often first-party, e.g. both Anthropic and Google provide OpenAI-compatible APIs). There might initially have been theories that you could not easily move to a different model, creating lock-in, but in practice LLMs are so flexible and forgiving about the inputs that a different model can be just dropped in an work without any model-specific changes.
> 80% margin on GPU cost? What about after paying for power, facilities
The market price of renting that compute on the market. That's fully loaded, so would include a) pro-rated recouping the capital cost of the GPUs, b) the power, cooling, datacenter buildings, etc, c) the hosting provider's margin.
> admin, support, marketing, etc.? Are GPUs really more than half the cost of this business?
Pretty likely! In OpenAI's leaked 2024 financial plan the compute costs were like 75% of their projected costs.