by naaqq 51 days ago
Sorry, I was wrong here. I meant a single long session. And there’s no compression; the 1M context is only half used.

Then where did 200M come from? 200,000 tokens?
Not all read tokens are included in the context; many of them come from cache read hits. I hit the cache many times, so the total grew to 200M. The number came from the API platform.
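
To make the arithmetic concrete, here is a rough sketch of how cumulative cache reads can dwarf the context size. The 800 requests and the linear growth are made-up assumptions for illustration, not numbers from my session; the only figure taken from above is the roughly half-used 1M context (500K tokens).

```python
def cumulative_cache_reads(final_context: int, num_requests: int) -> int:
    """Sum the cache-read tokens over a session.

    Assumes (hypothetically) that the context grows by the same number of
    tokens on every request and that each request re-reads the entire
    existing prefix from the cache.
    """
    step = final_context // num_requests  # new tokens appended per request
    total = 0
    context = 0
    for _ in range(num_requests):
        total += context  # the existing prefix is counted as a cache read
        context += step   # this request's new tokens join the context
    return total


total = cumulative_cache_reads(final_context=500_000, num_requests=800)
print(f"{total:,}")  # 199,750,000
```

So the context never goes past 500K, but the read counter adds up the whole prefix on every request, and that sum lands around 200M under these assumptions.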