|
|
|
|
|
by bizer
1 day ago
|
|
Z-ai/GLM’s KV caching technology is truly impressive; the implicit cache hit rate of its official API exceeds 95%, far surpassing other APIs that support implicit caching, such as Gemini and Qwen. I’ve been pondering the architectural design behind this, though I haven't yet formed a fully coherent theory. |
|