| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by bizer 1 day ago
	Z-ai/GLM’s KV caching technology is truly impressive; the implicit cache hit rate of its official API exceeds 95%, far surpassing other APIs that support implicit caching, such as Gemini and Qwen. I’ve been pondering the architectural design behind this, though I haven't yet formed a fully coherent theory.