by kleton 93 days ago
I proxy all of my LLM completion subscriptions. In a typical 7-day span (token counts):

  model            completions  read      write    cached_read  cache_write
  claude-opus-4-6  11000        16900000  5840000  1312000000   66120000


At per-MTok prices, roughly 17M tokens of uncached reads (input) and 6M of uncached writes (output) cost:

  $5 x 17 + $25 x 6 = $235 for Opus 4.6

  $2 x 17 + $12 x 6 = $106 for Gemini 3 Pro

  $0.60 x 17 + $3.60 x 6 = $31.80 for Qwen3.5 397B-A17B via the Hugging Face API
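
For anyone who wants to rerun the arithmetic above, here is a rough Python sketch. The prices are the per-MTok figures quoted in the three lines above; the `PRICES` dict and `uncached_cost` function are made-up names for illustration, not part of any proxy or vendor API.

```python
# Per-MTok (input, output) prices in dollars, as quoted in the post.
PRICES = {
    "Opus 4.6": (5.00, 25.00),
    "Gemini 3 Pro": (2.00, 12.00),
    "Qwen3.5 397B-A17B": (0.60, 3.60),
}


def uncached_cost(read_mtok, write_mtok, input_price, output_price):
    """Dollar cost of uncached input and output; token counts are in millions."""
    return read_mtok * input_price + write_mtok * output_price


if __name__ == "__main__":
    # 17 and 6 are the rounded figures used in the post;
    # the table gives 16.9 and 5.84 if you want the unrounded version.
    for name, (input_price, output_price) in PRICES.items():
        cost = uncached_cost(17, 6, input_price, output_price)
        print(f"{name}: ${cost:.2f}")
```

Swapping in the unrounded 16.9 and 5.84 from the table lowers each total slightly, since both figures were rounded up.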
You did not include cache writes, which are $6.25 / MTok. At 66.12M cache_write tokens, that is another ~$400.
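
A quick check of that cache-write figure, as a sketch: it uses the 66,120,000 cache_write tokens from the table and the $6.25 / MTok rate quoted above. Cached reads are left out because no price for them is given in this thread, and the names below are illustrative only.

```python
CACHE_WRITE_TOKENS = 66_120_000     # cache_write column from the table
CACHE_WRITE_PRICE_PER_MTOK = 6.25   # dollars, rate quoted in the comment
UNCACHED_OPUS_COST = 235.0          # dollars, Opus 4.6 figure from the post


def cache_write_cost(tokens, price_per_mtok):
    """Dollar cost of cache writes for a raw token count."""
    return tokens / 1_000_000 * price_per_mtok


if __name__ == "__main__":
    extra = cache_write_cost(CACHE_WRITE_TOKENS, CACHE_WRITE_PRICE_PER_MTOK)
    print(f"cache writes: ${extra:.2f}")
    print(f"uncached + cache writes: ${UNCACHED_OPUS_COST + extra:.2f}")
```

This comes out to a bit over $410 for cache writes, consistent with the ~$400 estimate, so the Opus total excluding cached reads is closer to $650 than $235.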