| HN Mirror

	by sibidharan 64 days ago
	I hit the rate limit every other day... 5-hour... weekly... I burn through my 20x weekly quota in 3 days! So I have 2x 20x! If I can harness $10,000 worth of API usage... this is the best time; no idea how long we will get this subsidy! I won't pay $10,000 out of my own pocket to do the same work!


Hiteshjain118 64 days ago

I'm curious what your token/$ usage looks like, if you'd be willing to share :)


sibidharan 64 days ago

TL;DR: ~$22,720 total compute at Opus 4.7 list rates. Without caching, the main thread alone would cost $113,418 (5.9x more). This is just one month on one server... I have 3 more servers like this where I work all the time!

// Generated with Claude

Ran this on my own ~/.claude/projects/ (933 sessions, 93,842 model calls, mix of main thread + spawned subagents). Numbers came out very close to yours in shape, different in scale.

cost.py (Opus 4.7 list rates, main thread only):

  cache reads (re-reading context)   21.69B tok   $10,843   56%
  cache writes (1h)                     678M tok   $6,781   35%
  output (incl. reasoning)             63.5M tok   $1,589    8%
  fresh uncached input                  1.6M tok       $8    0%
  TOTAL                              22.43B tok   $19,221

  if no caching: $113,418 (5.9x more)
  input:output ratio: 353:1
  cache hit rate: 97.0%
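
The arithmetic behind this table can be reproduced with a short sketch. The per-million-token rates below are not quoted from any price list; they are back-derived by dividing each dollar figure above by its token count, so treat them as assumptions. The function names are hypothetical and are not taken from cost.py.

```python
# Assumed rates in USD per million tokens, back-derived from the table above
# (e.g. $10,843 / 21.69B tok is about $0.50 per million). Not an official price list.
RATES_PER_MTOK = {
    "cache_read": 0.50,
    "cache_write_1h": 10.00,
    "output": 25.00,
    "fresh_input": 5.00,
}

# Token counts from the table, in millions of tokens.
TOKENS_MTOK = {
    "cache_read": 21_690.0,
    "cache_write_1h": 678.0,
    "output": 63.5,
    "fresh_input": 1.6,
}

INPUT_KEYS = ("cache_read", "cache_write_1h", "fresh_input")


def cost_breakdown(tokens, rates):
    """Dollar cost per category at the given rates."""
    return {name: tokens[name] * rates[name] for name in tokens}


def no_cache_cost(tokens, rates):
    """Cost if every input token were billed at the fresh-input rate."""
    input_mtok = sum(tokens[name] for name in INPUT_KEYS)
    return input_mtok * rates["fresh_input"] + tokens["output"] * rates["output"]


def cache_hit_rate(tokens):
    """Share of input tokens that were served from cache."""
    input_mtok = sum(tokens[name] for name in INPUT_KEYS)
    return tokens["cache_read"] / input_mtok


costs = cost_breakdown(TOKENS_MTOK, RATES_PER_MTOK)
total = sum(costs.values())
uncached = no_cache_cost(TOKENS_MTOK, RATES_PER_MTOK)

for name, dollars in costs.items():
    print(f"{name:16s} ${dollars:>10,.0f}  {100 * dollars / total:4.0f}%")
print(f"total            ${total:>10,.0f}")
print(f"no caching       ${uncached:>10,.0f}  ({uncached / total:.1f}x)")
print(f"cache hit rate   {100 * cache_hit_rate(TOKENS_MTOK):.1f}%")
```

With these inputs the total lands within a dollar of $19,221 and the no-cache figure within about $20 of $113,418; the small gaps come from rounding in the token counts shown in the table.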

token_time_breakdown.py (179M unique tokens, 166h wall clock):

  activity                      % of tokens    time (% of 166h)
  reasoning (hidden thinking)   29%            102h (61%)
  bash                          1.4%            23h (14%)
  writing tool calls            4%              14h  (8%)
  summaries                     2.5%             9h  (5%)
  reading/searching/web         1.6%             7h  (4%)
  subagents                     0.2%             6h  (4%)
  editing                       0.1%             5h  (3%)
  pasted attachments            25.3%            -
  typed prompt                  34.4%            -
  system+tools                  1.4%             -

reread_breakdown.py (per-activity share of billed input):

  reasoning           59.5%   (~$11.4k of the bill is re-sending old
                               hidden thinking back to the model)
  attachments         22.6%
  tool calls           7.8%
  bash                 3.0%
  reading/web          2.4%
  my prompt            1.6%
  summaries            1.5%
  system+tools         0.8%
  subagents            0.4%

main_vs_sidecar.py:

                          main         sidecar      combined
  sessions/agents          449         484           933
  assistant calls       63,820      30,022        93,842
  cache hit             97.0%        94.4%         96.6%
  turns/agent (mean)       142          62
    main:    median 20, max 11,058 in one session
    sidecar: median 44
  reasoning % of out    82%          51%           77%
  cost @ Opus 4.7    $19,225       $3,495       $22,720

  sidecar = 32% of calls but only 15% of cost. Subagents are doing
  their job (cheap, focused, short context).
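
The shares quoted above follow directly from the table. A minimal sketch of that arithmetic, using only the figures shown; the dictionary layout and function names are hypothetical and not taken from main_vs_sidecar.py:

```python
# Figures copied from the main_vs_sidecar.py table above.
MAIN = {"agents": 449, "calls": 63_820, "cost_usd": 19_225}
SIDECAR = {"agents": 484, "calls": 30_022, "cost_usd": 3_495}


def pct(part, whole):
    """Percentage of whole represented by part."""
    return 100.0 * part / whole


def summarize(main, sidecar):
    """Derive call share, cost share, cost per call, and mean turns per agent."""
    total_calls = main["calls"] + sidecar["calls"]
    total_cost = main["cost_usd"] + sidecar["cost_usd"]
    return {
        "sidecar_call_share_pct": pct(sidecar["calls"], total_calls),
        "sidecar_cost_share_pct": pct(sidecar["cost_usd"], total_cost),
        "main_usd_per_call": main["cost_usd"] / main["calls"],
        "sidecar_usd_per_call": sidecar["cost_usd"] / sidecar["calls"],
        "main_mean_turns": main["calls"] / main["agents"],
        "sidecar_mean_turns": sidecar["calls"] / sidecar["agents"],
    }


summary = summarize(MAIN, SIDECAR)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

On these figures a main-thread call costs roughly $0.30 and a sidecar call roughly $0.12, which is where the gap between 32% of calls and 15% of cost comes from.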

Same shape as yours: re-read dominates, reasoning is the biggest re-read line, caching is the only thing keeping it sane. The one that surprised me was a single 11,058-turn main session - some autonomous loop I forgot to kill. Going to grep for that.
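
One possible way to hunt for a runaway session like that, as a sketch only. It assumes each session is stored as a .jsonl file somewhere under ~/.claude/projects/ and that every model call is a JSON line with "type": "assistant"; both are assumptions about the transcript format, and the function names are hypothetical rather than part of the linked repo.

```python
import json
from pathlib import Path


def count_assistant_turns(lines):
    """Count lines that parse as JSON objects with type == "assistant"."""
    turns = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if isinstance(record, dict) and record.get("type") == "assistant":
            turns += 1
    return turns


def find_long_sessions(root, min_turns=1000):
    """Return (turns, path) pairs for sessions with at least min_turns, longest first."""
    results = []
    for path in Path(root).expanduser().rglob("*.jsonl"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            turns = count_assistant_turns(handle)
        if turns >= min_turns:
            results.append((turns, path))
    return sorted(results, key=lambda item: item[0], reverse=True)
```

Calling find_long_sessions("~/.claude/projects", min_turns=1000) would list the longest sessions first; the threshold is arbitrary and worth tuning against the median of 20 turns reported above.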

Repo: github.com/Coral-Bricks-AI/coral-ai/tree/main/claude-code-token-xray


Hiteshjain118 63 days ago

Thanks for sharing your workload. Impressive! And it's just one person (you) steering all this token usage across the month? I want to double-click on one thing: if prices were to go up, you wouldn't be spending this many tokens. Would you just delay all those projects?


sibidharan 62 days ago

It's just me steering all the tokens... Multiple projects for my bootstrapped Academy, running parallel research on things I wouldn't do myself, and deriving the right patterns for the future. Using different Max subscriptions on the same machine. I have a similar ~$22K on another machine, where I am working in parallel with the same 2x Max subscriptions!

So if prices were to go up, I honestly have no idea at the moment. I might need a rehabilitation centre!!!

But I am mentally prepared that someday this will be gone... So make hay while the sun shines! I am doing the most complex work I want to do... and pushing the limits. Knowing how much it costs in actual tokens gives me some kind of seriousness, like an invisible funding that I must use to grow!! I consider this a launchpad... If any of the projects hit and scale and I get some funding in the future, then I might not need to worry when prices go up. So either this or rehab lol!


Hiteshjain118 62 days ago

More power to you!
