by jplusequalt
120 days ago
> Open models like GLM 5 are very good. Even if companies decide to crank up the costs, the current open models will still be available.

https://apxml.com/models/glm-5

To run GLM-5 you need access to many, many consumer-grade GPUs, or multiple data-center-level GPUs.

> They will likely get cheaper to run over time as well (better hardware).

Unless they magically solve the problem of chip scarcity, I don't see this happening. VRAM is king, and to have more of it you have to pay a lot more. Let's use the RTX 3090 as an example. This card is ~6 years old now, yet it still runs you around $1.3k. If you wanted to run GLM-5 at I4 quantization (the lowest listed in the link above) with a 32k context window, you would need *32 RTX 3090s*. That's about $42k you'd be spending on obsolete silicon. If you wanted to run this on newer hardware, you could reasonably expect to multiply that number by 2.
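For anyone who wants to check the arithmetic, here is a rough Python sketch. The card count (32) and the ~$1.3k price are the figures from the comment above. The 24 GB of VRAM per card is the RTX 3090's published spec. The helper names are made up for illustration, and the 2x multiplier for newer hardware is only the ballpark guess from the comment, not a quoted price.

```python
import math

# Figures taken from the comment above (approximate, not authoritative).
GPU_COUNT = 32               # RTX 3090s quoted for I4 quantization at 32k context
PRICE_PER_CARD_USD = 1_300   # rough street price quoted in the comment
NEWER_HW_MULTIPLIER = 2      # the comment's ballpark for newer hardware

# Published spec for the RTX 3090.
VRAM_PER_CARD_GB = 24


def cards_needed(required_vram_gb: float, vram_per_card_gb: float) -> int:
    """Hypothetical helper: cards needed to reach a VRAM target.

    Ignores per-card overhead and interconnect limits, so treat it as a floor.
    """
    return math.ceil(required_vram_gb / vram_per_card_gb)


def build_cost(card_count: int, price_per_card_usd: float) -> float:
    """Hypothetical helper: GPU-only cost (no chassis, PSU, or power bill)."""
    return card_count * price_per_card_usd


# Aggregate VRAM implied by the quoted card count.
total_vram_gb = GPU_COUNT * VRAM_PER_CARD_GB
cost_usd = build_cost(GPU_COUNT, PRICE_PER_CARD_USD)
newer_cost_usd = cost_usd * NEWER_HW_MULTIPLIER

print(f"Aggregate VRAM: {total_vram_gb} GB")
print(f"GPU-only cost:  ${cost_usd:,.0f}")
print(f"Newer hardware: ${newer_cost_usd:,.0f} (rough guess)")
```

This only covers the cards themselves. Motherboards, risers, power supplies, and electricity for a 32-card rig would push the real number higher.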
Also, how much bang for the buck do those 3090s actually give you compared to enterprise-grade products?