| HN Mirror

	by antoniuschan99 128 days ago
	It could turn a 1M-context system into a 4M-context system. TurboQuant-style KV-cache compression makes longer context windows cheaper to serve, though I'm not exactly sure how much of an increase in context size it actually gives.
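
	A rough back-of-the-envelope sketch of where a 1M to 4M figure could come from, assuming the KV cache is the only thing limiting context and that compression takes each cached value from 16 bits down to 4 bits. The model dimensions (`LAYERS`, `KV_HEADS`, `HEAD_DIM`) and both helper functions are hypothetical, and the bit widths are illustrative rather than TurboQuant's actual rate. Per-block quantization metadata (scales, zero points) is ignored, so a real gain would likely be somewhat smaller.

```python
def kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bits_per_value):
    """Bytes of KV cache stored per token (hypothetical helper)."""
    # Factor of 2: one key vector and one value vector per layer and KV head.
    return 2 * n_layers * n_kv_heads * head_dim * bits_per_value / 8


def max_context_tokens(budget_bytes, n_layers, n_kv_heads, head_dim, bits_per_value):
    """Tokens that fit in a fixed KV-cache memory budget (hypothetical helper)."""
    per_token = kv_bytes_per_token(n_layers, n_kv_heads, head_dim, bits_per_value)
    return int(budget_bytes // per_token)


# Assumed model shape, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

# Fix the memory budget at whatever a 1M-token cache costs at 16 bits per value.
budget = kv_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, 16) * 1_000_000

# Same budget, fewer bits per cached value.
for bits in (16, 8, 4):
    tokens = max_context_tokens(budget, LAYERS, KV_HEADS, HEAD_DIM, bits)
    print(f"{bits:>2} bits/value -> {tokens:,} tokens")
```

	Under these assumptions the context that fits in a fixed memory budget scales inversely with bits per value, so 16-bit to 4-bit is a 4x increase. Whether a served model can actually use the extra tokens depends on more than cache memory.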