| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by mordae 24 days ago

Not OP, but I routinely load 150k tokens into context. A full sub-package to work on, select other files in the monorepo, e.g. front-end visualization and back-end data loader. Then work some 150k tokens, then start again.

At the end, cache hit rate is like 99.5% if Novita is not having issues.

For official DeepSeek API, 99.9% or something.

Custom harness that never compacts or otherwise doctors the history.

1 comments

ctxc 24 days ago

Those numbers make sense to me...120 million input tokens is like 120 sessions of hitting the full context limit, which seems like a lot to me though

link