The KV cache is the LLM's internal state after it has processed the tokens. It's big, and you do not have it locally.
Yes - generated from the data of the conversation.
Read what I said again. I'm explaining how they regenerate the KV cache: by running the conversation through the LLM again.
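To make that concrete, here's a toy sketch of the idea. It is not a real model: `extend` and `rebuild` are made-up names, and the "cache entry" is just a rolling hash standing in for the per-token keys and values. The only point is that the cache is a deterministic function of the token sequence, so replaying the stored conversation gives back the same state.

```python
# Toy stand-in for a transformer's KV cache. Each entry depends on the
# current token and on everything before it, like real attention state,
# but here it is only a rolling hash (an assumption for illustration).

MOD = 1_000_003


def extend(cache: list[int], new_tokens: list[str]) -> list[int]:
    """Append one cache entry per new token, reusing the existing cache."""
    cache = list(cache)  # copy so the caller's cache is not mutated
    for tok in new_tokens:
        prev = cache[-1] if cache else 0
        cache.append((prev * 31 + sum(map(ord, tok))) % MOD)
    return cache


def rebuild(tokens: list[str]) -> list[int]:
    """Regenerate the whole cache from scratch by replaying every token."""
    return extend([], tokens)


conversation = ["Hello", ",", " how", " are", " you", "?"]

# While the session is warm, the cache grows turn by turn.
warm = extend(extend([], conversation[:3]), conversation[3:])

# After the cache is dropped, replaying the conversation text restores it.
rebuilt = rebuild(conversation)

assert warm == rebuilt
```

The replay costs one full pass over the conversation, which is why it's slower than a warm cache, but nothing other than the conversation itself is needed to get the state back.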