

	by walterbell 396 days ago
	> Using an LLM and caching e.g. FAQs can save a lot of token credits

	Do LLM providers use caches for FAQs, without changing the number of tokens billed to the customer?

1 comment

No, why would they? You are supposed to maintain that cache yourself.
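For what it's worth, here is a rough sketch of what maintaining that cache on your own side could look like. Everything in it is an assumption for illustration: `FAQCache`, `normalize`, and `fake_llm` are made-up names, and `fake_llm` just stands in for whatever billed API call you actually make.

```python
import hashlib
from typing import Callable, Dict


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivially different phrasings share a key."""
    return " ".join(question.lower().split())


class FAQCache:
    """In-memory cache of answers, keyed by a hash of the normalized question."""

    def __init__(self, ask_llm: Callable[[str], str]) -> None:
        # Hypothetical hook: any function that sends a prompt to your provider.
        self._ask_llm = ask_llm
        self._answers: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def ask(self, question: str) -> str:
        key = hashlib.sha256(normalize(question).encode("utf-8")).hexdigest()
        if key in self._answers:
            # Cache hit: no API call, so no tokens are billed.
            self.hits += 1
            return self._answers[key]
        # Cache miss: pay for one call and remember the answer.
        self.misses += 1
        answer = self._ask_llm(question)
        self._answers[key] = answer
        return answer


def fake_llm(question: str) -> str:
    """Stand-in for a real (billed) API call."""
    return f"answer to: {normalize(question)}"
```

Note this only catches exact matches after normalization. A paraphrased question would still miss and hit the API; catching those would need something like embedding similarity, which is not shown here.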

What I really want to know is about caching the large prefixes for prompts. Do they let you manage this somehow? What about Llama and DeepSeek?