by throwaway89201
2 hours ago
The point is that running GLM-5.2 takes several orders of magnitude less capital than training a model like Opus or GLM-5.2 from scratch. Inference on GLM-5.2 needs an investment of roughly €300k or less (8x H200 running GLM-5.2 at FP8), which is entirely feasible for a lot of hosting businesses. Even if end users can't run these models at home, there are many more, and more varied, options to choose from, which matters especially for privacy and data protection.

You can apparently also run GLM-5.2 at Q4_K_XL with 2x RTX 3090 and lots of RAM [1], but I don't think that counts as a potential frontier model.

[1] https://news.ycombinator.com/item?id=48639186
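
For anyone who wants to sanity-check the hardware sizing, here is a rough back-of-the-envelope sketch. It only counts weight storage and ignores KV cache and activations, so treat it as a lower bound. The parameter count is a hypothetical placeholder (I am not quoting a published figure for GLM-5.2), the ~4.5 bits per weight for Q4_K_XL is an assumption, and the 80% headroom factor is arbitrary. The per-GPU memory sizes (141 GB for an H200, 24 GB for an RTX 3090) are the vendor specs.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         n_gpus: int, gb_per_gpu: float, headroom: float = 0.8) -> bool:
    """True if the weights fit in the given fraction of total GPU memory."""
    budget_gb = n_gpus * gb_per_gpu * headroom
    return weights_gb(params_billion, bits_per_weight) <= budget_gb


# HYPOTHETICAL parameter count, used only to make the arithmetic concrete.
PARAMS_B = 750.0

# FP8 is 8 bits per weight; 8x H200 at 141 GB each.
print("FP8 weights (GB):", weights_gb(PARAMS_B, 8))
print("fits on 8x H200:", fits(PARAMS_B, 8, n_gpus=8, gb_per_gpu=141))

# ASSUMED ~4.5 bits per weight for a Q4_K_XL-style quant; 2x RTX 3090 at 24 GB each.
print("Q4 weights (GB):", weights_gb(PARAMS_B, 4.5))
print("fits on 2x RTX 3090 VRAM alone:", fits(PARAMS_B, 4.5, n_gpus=2, gb_per_gpu=24))
```

With a placeholder like this, the 4-bit weights are far larger than 48 GB of VRAM, which is consistent with the "lots of RAM" part: most of the model would have to sit in system memory.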