| HN Mirror

	by GaggiX 49 days ago
	Not to be confused with Flash Attention. What's novel here is the extremely small KV cache memory footprint at long context windows: about 0.77GB at a 512K context, a 90% reduction compared to the already very small KV cache of Deepseek V4 Flash.
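
A rough back-of-the-envelope check of those figures, as a sketch only. It assumes "GB" means 10^9 bytes and "512K" means 512 × 1024 tokens, neither of which the comment specifies, and the function names are made up for illustration. Under those assumptions the cache works out to roughly 1.5 KB per token, and a 90% reduction implies a baseline of about 7.7GB at the same context length.

```python
# Sanity-check the quoted KV cache figures.
# Assumptions (not stated in the comment): 1 GB = 10**9 bytes, 512K = 512 * 1024 tokens.

def kv_bytes_per_token(cache_gb, context_tokens, bytes_per_gb=10**9):
    """Average KV cache bytes stored per token of context."""
    return cache_gb * bytes_per_gb / context_tokens


def implied_baseline_gb(cache_gb, reduction):
    """Cache size before a fractional reduction (0.90 means 90% smaller)."""
    return cache_gb / (1.0 - reduction)


per_token = kv_bytes_per_token(0.77, 512 * 1024)
baseline = implied_baseline_gb(0.77, 0.90)

print(f"{per_token:.0f} bytes of KV cache per token")
print(f"{baseline:.1f} GB implied baseline at the same context length")
```

If the figure was actually in GiB, or 512K meant 512,000 tokens, the per-token number shifts by a few percent but stays in the same ballpark.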