|
|
|
|
|
by coder543
173 days ago
|
|
$10k gets you a Mac Studio with 512GB of RAM, which definitely can run GLM-4.7 with normal, production-grade levels of quantization (in contrast to the extreme quantization that some people talk about). The point in this thread is that it would likely be too slow due to prompt processing. (M5 Ultra might fix this with the GPU's new neural accelerators.) |
|
Please do give that a try and report back the prefill and decode speed. Unfortunately, I think again that what I wrote earlier will apply:
> In practice, it'll be incredible slow and you'll quickly regret spending that much money on it
I'd rather place that 10K on a RTX Pro 6000 if I was choosing between them.