Hacker News
by himata4113 9 days ago
Anthropic and OpenAI have the most efficient tokens per unit of compute on the planet, and honestly that's their current moat. They're able to serve tokens at half the cost of any open-source provider. Here are the costs to serve Opus 4.7 in China on AWS, according to one of my connections who operates an enterprise account in the region:

  Input: $0.257
  Output: $1.286
  Cache read: $0.0257
  Cache write: $0.322
And I have zero doubt that, with batching and other optimizations, subscription users are being served at an even lower cost. Most of their expenses likely come from training, as we're far into diminishing-returns territory. We will know once Anthropic is required by law to report these numbers, so there's no point in continued speculation that "Anthropic is losing $9 for every $1", because 1: unless there are subsidies going on, it's not true, and 2: Anthropic will tell us directly what the numbers are in the near future.
1 comments

My understanding is that there are local API resellers who provide gateway access and bundle Claude/OpenAI with other, cheaper PRC models via routing to water down the price. The resellers bulk-generate pro accounts/trials, i.e. the tokens are basically 100% subsidized by Anthropic/OpenAI. Resellers also sell at or below cost because they're intercepting training data to resell. The economics of PRC tokens are divorced from Anthropic and OpenAI, i.e. PRC gray-market tokens are most "efficient" for shadow trial resellers (who basically pay for a disposable sign-up SIM) and least efficient (as in negative-sum) for providers, who convert none of the subsidized trial accounts into paying ones.
Not the same thing; these are actual gateways to Opus models hosted on AWS hardware in China. https://www.amazonaws.cn/en/about-aws/china/

Also, abuse of free accounts/trials wouldn't work, since rotating accounts would destroy the cache, and this gateway maintains a 97% cache hit rate.
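
For what it's worth, here's a rough sketch of what a 97% cache rate does to the effective input price, using the four figures quoted in the post. Big assumptions on my part: the 97% is the share of input tokens served as cache reads, and the remaining 3% are billed as cache writes (pass "input" instead if misses are billed as plain input). The post doesn't give a unit for the prices, so the result is in whatever unit those are. `blended_input_price` is just a name I made up for illustration.

```python
# Prices as quoted in the post; the unit is not stated there, so it is kept as-is.
PRICES = {
    "input": 0.257,
    "output": 1.286,
    "cache_read": 0.0257,
    "cache_write": 0.322,
}


def blended_input_price(hit_rate: float, miss_price_key: str = "cache_write") -> float:
    """Average price per unit of input when `hit_rate` of input tokens are cache reads.

    Assumption (not from the post): every non-cached input token is billed
    at PRICES[miss_price_key], which defaults to the cache-write price.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    hit_cost = hit_rate * PRICES["cache_read"]
    miss_cost = (1.0 - hit_rate) * PRICES[miss_price_key]
    return hit_cost + miss_cost


if __name__ == "__main__":
    for rate in (0.0, 0.5, 0.97):
        print(f"hit rate {rate:.0%}: {blended_input_price(rate):.6f}")
```

Under those assumptions the blended input price at 97% comes out to about 0.0346, versus 0.257 for uncached input, so roughly 7x cheaper. That gap is why a workload that keeps resetting the cache, like rotating trial accounts, would look very different on cost.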