


	by YetAnotherNick 843 days ago
	My calculation of the KV cache size gives about 1 GB per 3,000 tokens at fp16. I am surprised OpenAI's competitors haven't done this. This kind of feature has uses that are not so niche: any workload where the prefix data could be cached.
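
	For anyone who wants to check the arithmetic, here is a rough sketch of the usual KV cache size estimate: two tensors (keys and values) per layer, each holding `n_kv_heads * head_dim` elements per token. The comment does not say which model the figure is for, so the dimensions below (80 layers, 8 KV heads, head dimension 128) are assumptions picked only because they land near 1 GB per 3,000 tokens at fp16. The function name `kv_cache_bytes` is made up for illustration.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Estimate KV cache size in bytes for one sequence.

    bytes_per_elem=2 corresponds to fp16. The leading factor of 2 accounts
    for storing one key vector and one value vector per layer per token.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * n_tokens


# Assumed model dimensions (not given in the comment above).
size = kv_cache_bytes(3000, n_layers=80, n_kv_heads=8, head_dim=128)
print(f"{size / 1e9:.2f} GB")  # prints "0.98 GB"
```

	With different dimensions the number moves a lot. A model without grouped-query attention has as many KV heads as attention heads, so its per-token cost is several times higher for the same layer count.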