Hacker News
by solidasparagus 86 days ago
What do you mean? I believe costs spiked with the introduction of the 1M context window, because the average number of cached input tokens got larger, and cached input tokens dominate cost.
1 comment

Nah, there are apparently a few caching bugs, one involving --resume, plus some noisy tool use. I have a little app that monitors the context window and resets it at 70% usage, based on a 200k-token window, and I'm about to run out of my weekly allowance after just a couple of days. That has never happened before.
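
For anyone curious what a 70% reset on a 200k window works out to, here's a rough sketch of just the threshold check. The function and constant names are made up for illustration, and how the app actually reads token usage or triggers the reset isn't shown, since that depends on the tool.

```python
# Hypothetical names; only the 200k window and the 70% threshold come from the comment above.
CONTEXT_WINDOW_TOKENS = 200_000
RESET_THRESHOLD_PERCENT = 70


def reset_threshold_tokens(window: int = CONTEXT_WINDOW_TOKENS,
                           percent: int = RESET_THRESHOLD_PERCENT) -> int:
    """Token count at which the context would be reset."""
    return window * percent // 100


def should_reset(used_tokens: int,
                 window: int = CONTEXT_WINDOW_TOKENS,
                 percent: int = RESET_THRESHOLD_PERCENT) -> bool:
    """True once usage reaches the threshold (integer math avoids float rounding)."""
    return used_tokens * 100 >= window * percent


if __name__ == "__main__":
    print(reset_threshold_tokens())  # 140000
    print(should_reset(120_000))     # False
    print(should_reset(150_000))     # True
```

So the reset fires at 140k tokens, well short of the 1M window.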