
> The kv-cache is the internal LLM state after having processed the tokens. It's big, and you do not have it locally.

Yes, and it is generated from the data of the conversation.

Read what I said again. I'm explaining how they regenerate the cache: they run the conversation back through the LLM, which reconstructs the KV cache state.
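To make that concrete, here is a toy sketch, not how any real provider does it. It assumes a single made-up attention layer whose `embed` and `project` functions stand in for learned weights; every name in it is hypothetical. The only point is that the cache is a deterministic function of the token sequence, so whoever holds the conversation and the model can rebuild it.

```python
import math

DIM = 4  # toy hidden size; real models use thousands of dimensions


def embed(token_id, position):
    # Hypothetical deterministic embedding standing in for learned weights.
    return [math.sin((token_id + 1) * (i + 1) + position) for i in range(DIM)]


def project(vec, seed):
    # Hypothetical fixed linear map standing in for the learned W_k / W_v.
    return [
        sum(vec[j] * math.cos(seed + i * DIM + j) for j in range(DIM))
        for i in range(DIM)
    ]


def append_to_cache(cache, token_id):
    # Compute this token's key and value and store them at the next position.
    position = len(cache["k"])
    x = embed(token_id, position)
    cache["k"].append(project(x, seed=1))
    cache["v"].append(project(x, seed=2))


def prefill(token_ids):
    # Rebuild the whole cache from nothing but the token ids.
    cache = {"k": [], "v": []}
    for token_id in token_ids:
        append_to_cache(cache, token_id)
    return cache


conversation = [3, 1, 4, 1, 5, 9]  # made-up token ids for the chat so far

# Cache built up turn by turn while the conversation was live.
original = prefill(conversation[:3])
for token_id in conversation[3:]:
    append_to_cache(original, token_id)

# The cache gets evicted; later it is rebuilt from the conversation alone.
rebuilt = prefill(conversation)

print(rebuilt == original)  # True
```

In a real transformer the deeper layers' keys and values also depend on attention over earlier tokens, but they are still computed from the same tokens and weights, so the same replay works. It just costs a full prefill pass.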