by zyx321
311 days ago
There have been some theories floating around that the 128GB version could be the best value for on-premises LLM inference. The RAM is split between CPU and GPU at a user-configurable ratio. So this might be the holy grail of "good enough GPU" and "over 100GB of VRAM", if the rest of the system can keep up.
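To make the split arithmetic concrete, here is a rough back-of-the-envelope sketch. The function names, the 75%/25% ratios, and the 70B-parameter, 4-bit model are all made up for illustration and don't come from any vendor tool. It also only counts the weights, so KV cache and runtime overhead would eat into the margin.

```python
def gpu_share_gb(total_gb: float, gpu_fraction: float) -> float:
    """Memory (GB) handed to the GPU for a given split ratio (hypothetical helper)."""
    if not 0.0 <= gpu_fraction <= 1.0:
        raise ValueError("gpu_fraction must be between 0 and 1")
    return total_gb * gpu_fraction


def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_in_gpu_share(total_gb: float, gpu_fraction: float,
                      params_billions: float, bits_per_weight: float) -> bool:
    """True if the weights alone fit in the GPU's share (ignores KV cache and overhead)."""
    return weights_gb(params_billions, bits_per_weight) <= gpu_share_gb(total_gb, gpu_fraction)


if __name__ == "__main__":
    # Assumed example: 128 GB machine, 75% given to the GPU, 70B params at 4 bits.
    print(gpu_share_gb(128, 0.75))              # GPU share in GB
    print(weights_gb(70, 4))                    # weight size in GB
    print(fits_in_gpu_share(128, 0.75, 70, 4))  # does it fit?
```

Getting "over 100GB of VRAM" on a 128GB machine would need a GPU fraction above roughly 0.78 under this simple model.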
I believe the fixed split thing is a historical remnant. These days, the OS can allocate memory for the GPU to use on the fly.