by mordae
24 days ago
Not OP, but I routinely load 150k tokens into context: the full sub-package I'm working on, plus selected other files from the monorepo, e.g. the front-end visualization and the back-end data loader. Then I work for some 150k tokens and start again. By the end, the cache hit rate is around 99.5% if Novita is not having issues. With the official DeepSeek API it's 99.9% or so. This is with a custom harness that never compacts or otherwise doctors the history.
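
To see why an append-only history ends up with a hit rate that high, here is a rough back-of-the-envelope sketch. It is not the actual harness or any provider's billing logic. The function `cache_hit_rate`, the 500 tokens per turn, the 300 turns, and the "compaction invalidates the whole prefix" rule are all assumptions made up for illustration.

```python
def cache_hit_rate(base_tokens, tokens_per_turn, turns, compact_every=None):
    """Fraction of prompt tokens served from a prefix cache over one session.

    Simplified model (assumption): the provider still holds the previous
    request's prompt, and a request gets cache hits for the longest prefix
    it shares with that prompt.
    """
    prompt_len = base_tokens  # tokens sent in the current request
    cached_prefix = 0         # the first request is a full miss
    total = hits = 0
    for turn in range(turns + 1):
        total += prompt_len
        hits += cached_prefix
        # This request's prompt becomes the cached prefix for the next one.
        cached_prefix = prompt_len
        prompt_len += tokens_per_turn
        if compact_every and (turn + 1) % compact_every == 0:
            # Assumed worst case: compaction rewrites the history from the
            # start, so the next request shares no prefix with the cache.
            prompt_len = base_tokens
            cached_prefix = 0
    return hits / total


# Hypothetical session: 150k tokens loaded up front, then 300 turns of
# about 500 new tokens each (150k tokens of work in total).
append_only = cache_hit_rate(150_000, 500, 300)
compacted = cache_hit_rate(150_000, 500, 300, compact_every=20)
print(f"append-only: {append_only:.2%}, compacting every 20 turns: {compacted:.2%}")
```

Under these assumptions the append-only run lands in the same ballpark as the ~99.5% figure above, because only the first request and each turn's newly appended tokens are misses. Any rewrite of earlier history turns the next request into a large miss, which is why the harness leaves the history alone.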