by kingstnap
24 days ago
Anthropic's caching requires you to pay a surcharge of $0.75/MTok for Sonnet and $1.25/MTok for Opus on top of the original input token cost. It's not even automatic. If you are reading ~8 times (8 total back-and-forth tool calls), that means cache reads in some sense cost ~$0.4/MTok (amortizing the write surcharge over all reads). It's really quite ridiculously expensive considering that what you are paying for is some residence in VRAM that sometimes gets offloaded to NVMe.
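To make the amortization concrete, here is a rough sketch of the arithmetic. The $0.75/MTok write surcharge and the 8 reads come from the comment above; the $0.30/MTok Sonnet cache-read price is my assumption (it is not stated above), and the function name is just illustrative.

```python
def amortized_read_cost(read_price: float, write_surcharge: float, reads: int) -> float:
    """Effective $/MTok per cache read, spreading the one-time
    cache-write surcharge evenly over all reads of that cache entry."""
    if reads <= 0:
        raise ValueError("reads must be a positive integer")
    return read_price + write_surcharge / reads


# Assumption, not from the comment: Sonnet cache reads billed at $0.30/MTok.
SONNET_READ_PRICE = 0.30
# From the comment: write surcharge on top of the normal input price.
SONNET_WRITE_SURCHARGE = 0.75

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        cost = amortized_read_cost(SONNET_READ_PRICE, SONNET_WRITE_SURCHARGE, n)
        print(f"{n:>2} reads: ${cost:.3f}/MTok")
```

Under those assumptions, 8 reads work out to roughly $0.39/MTok, which lines up with the ~$0.4 figure. The surcharge term shrinks as the read count grows, so the effective price approaches the bare read price on long sessions.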