HN Mirror



	by SEGyges 376 days ago
	Every LLM provider caches its KV cache. It's a publicly documented technique (basically, stash the KV in Redis after each request), and a good engineering team could set it up in a month.

1 comment

Are you saying that if I send the prompt "foo" and a month later another user sends "foo", it retrieves a cached value?

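No. The key-value cache is not a cache of answers. It is the context itself, stored in a form the model can read directly: the attention keys and values already computed for each token, so they don't have to be recomputed when the same context shows up again.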
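
To make the distinction concrete, here is a rough sketch of prefix-style KV reuse, not how any particular provider does it. Everything in it is a made-up stand-in: `compute_kv` fakes the per-token key/value projection with simple arithmetic, `PrefixKVCache` is a hypothetical name, and a plain dict plays the role of whatever external store (Redis or otherwise) a real system would use. The point is only that the cache holds per-token state for a context prefix, never a finished response.

```python
from typing import Dict, List, Tuple

# Stand-in for one token's (key, value) vectors; real ones are per-layer tensors.
KV = Tuple[float, float]


def compute_kv(token: str, position: int) -> KV:
    """Toy replacement for a model's key/value projection (an assumption).

    In a real transformer the K/V at a position depend on every earlier token,
    which is why reuse is only valid for an identical prefix.
    """
    h = float(sum(ord(c) for c in token))
    return (h + position, h * 0.5 + position)


class PrefixKVCache:
    """Hypothetical in-memory store mapping a token prefix to its KV entries."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, ...], List[KV]] = {}
        self.computed = 0  # counts how many tokens actually needed computation

    def kv_for(self, tokens: List[str]) -> List[KV]:
        # Find the longest prefix of this context that is already cached.
        best: Tuple[str, ...] = ()
        for n in range(len(tokens), 0, -1):
            key = tuple(tokens[:n])
            if key in self._store:
                best = key
                break

        # Reuse the cached entries, then compute only the new suffix.
        kv = list(self._store.get(best, []))
        for pos in range(len(best), len(tokens)):
            kv.append(compute_kv(tokens[pos], pos))
            self.computed += 1

        # Save the extended context so a later request can build on it.
        self._store[tuple(tokens)] = kv
        return kv


cache = PrefixKVCache()
first = cache.kv_for(["system", "prompt", "foo"])          # computes 3 tokens
second = cache.kv_for(["system", "prompt", "foo", "bar"])  # computes only "bar"
print(cache.computed)
```

The second call reuses the three cached entries and computes only the new token. The model still has to generate a fresh response each time; what gets skipped is re-reading the shared context.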