| HN Mirror



	by csomar 340 days ago
	My understanding is that caching reduces computation, but the whole input is still processed. I don’t think they are fully disclosing how their cache works. LLMs degrade with long input regardless of caching.