by zozbot234
4 days ago
AIUI the llama.cpp implementation for this model is still quite half-baked because it lacks support for the DSA sparse attention mechanism. As a result the model runs with a different attention mechanism than the one it was trained with, which has been shown to lower both quality and performance. Anyway, I think GLM 5.2 is in many ways not as interesting as the DeepSeek V4 series, which uses an even more advanced attention mechanism and can save a lot of memory on KV cache, especially at larger contexts. That in turn opens up wide batching, especially on consumer platforms. GLM doesn't have that; in some ways it feels broadly similar to Kimi 2.6 wrt. the underlying performance architecture. Both are a bit too heavy to run reasonably at full quality on ordinary hardware.
|
It also has an input image modality, which is a game changer. The cheap Sinofrontier models have generally been lacking in this regard.
Basically, Chinese competition is fierce - DeepSeek set the pricing tier, and the question for each lab now is how to justify charging a little more.
MiMo-2.5-Pro has gone with UltraSpeed, pumping out 1000 t/s for a 3x price hike.
GLM has gone with 5.2, hitting Opus levels of reasoning at a fraction of the cost.
DeepSeek will probably keep their pricing model and just keep getting better and better.
Qwen-3.7 is the dark horse. Rumour has it that Alibaba is making these models simply because they need them internally.
The real question is why this level of innovation and competition isn’t happening in America or Europe. In particular I see no reason Europe doesn’t have a lab competing on these terms.