| HN Mirror



	by kang 58 days ago
	You are right, I was wrong in my understanding there. It stemmed from my own implementation: an inference often wrote extra data, such as tool calls, so I was using it to preserve relevant information along with the desired output, so that I could throw away the prompt every time. I realize inference caching is a better way (with its own pros and cons).