| HN Mirror



	by vanviegen 156 days ago
	> Furthermore, all of the major LLM APIs reward you for re-sending the same context with only appended data in the form of lower token costs (caching).

	There's a little more flexibility than that. You can strip off some trailing context before appending new context. This lets you keep the 'long-term context' minimal while still making good use of the cache.
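
	A rough sketch of the idea, assuming the cache matches on the longest exact shared prefix between consecutive requests (real providers work at the token level and the details vary). The helper names and the sample messages are made up for illustration; no real API is called.

```python
from typing import List


def cached_prefix_len(previous: List[str], current: List[str]) -> int:
    """Number of leading messages shared by two requests (the reusable prefix)."""
    n = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        n += 1
    return n


def next_request(previous: List[str], keep: int, appended: List[str]) -> List[str]:
    """Keep the first `keep` messages, drop everything after, then append new ones."""
    return previous[:keep] + appended


# Hypothetical conversation: a stable long-term part plus scratch work at the end.
long_term = ["system prompt", "project notes", "user: fix the bug"]
scratch = ["tool: read file A", "tool: read file B"]
first = long_term + scratch

# Strip the trailing scratch work and append only a short summary of it.
second = next_request(first, keep=len(long_term), appended=["summary: bug is in A"])

# Everything up to the strip point is still an exact prefix of the first request.
reused = cached_prefix_len(first, second)
print(f"{reused} of {len(second)} messages match the previous request's prefix")
```

	Only the part after the strip point is new to the cache, so the request stays short without giving up the shared prefix.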