| HN Mirror


	by ghm2199 132 days ago
	And incidentally, prefill is also how caching, say, a system prompt saves you some $ on API usage with LLM providers. They only compute the KV cache for the new tokens after the system prompt.
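To make that concrete, here is a toy sketch of prefix reuse. It is not any provider's actual implementation: `PrefixKVCache`, the `kv(...)` strings, and the whitespace "tokenizer" are all made up for illustration. The one real constraint it mirrors is that a token's KV entries depend on everything before it, so only a shared prefix can be reused.

```python
from typing import List, Tuple


class PrefixKVCache:
    """Toy stand-in for a provider-side prompt cache (hypothetical)."""

    def __init__(self) -> None:
        # Each entry pairs a previously seen prompt with its pretend KV entries.
        self._seen: List[Tuple[List[str], List[str]]] = []

    def prefill(self, tokens: List[str]) -> Tuple[List[str], int]:
        """Return (KV entries for all tokens, number of tokens computed)."""
        best_kv: List[str] = []
        for old_tokens, old_kv in self._seen:
            # Length of the common prefix between this prompt and an old one.
            n = 0
            while n < min(len(old_tokens), len(tokens)) and old_tokens[n] == tokens[n]:
                n += 1
            if n > len(best_kv):
                best_kv = old_kv[:n]
        reused = len(best_kv)
        # Only tokens after the cached prefix need fresh "computation".
        kv = best_kv + [f"kv({tok})" for tok in tokens[reused:]]
        self._seen.append((list(tokens), kv))
        return kv, len(tokens) - reused


cache = PrefixKVCache()
system = "You are a helpful assistant .".split()  # 6 toy tokens

# First request: nothing cached, so every token is prefilled.
_, computed_first = cache.prefill(system + ["Hi"])

# Second request shares the system prompt, so only the new tokens are prefilled.
_, computed_second = cache.prefill(system + ["What", "is", "prefill", "?"])
```

In this sketch the first call computes 7 tokens and the second only 4, since the 6 system-prompt tokens are reused. How providers actually key, expire, and bill cached prefixes varies, so treat the numbers as illustrative only.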