by Galanwe
46 days ago
Yes it can, but the experience is not great. A single maxed-out M3 can run Kimi 2.6 at Q2, though that's with heavily degraded perplexity. 2x M3s with RDMA can run a lossless Kimi 2.6 at Q4, but with CPU only you would get okayish decode and horrible (1m+) TTFT, which wouldn't be a great _interactive_ experience.
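For anyone who wants to redo the back-of-envelope math, here is a rough sketch of the two estimates behind this: weight size at a given quantization, and TTFT as prompt length divided by prefill throughput. The parameter count, prompt length, and prefill rate below are placeholders I made up for illustration, not measured or published figures for Kimi 2.6 or the M3; plug in your own. It also ignores KV cache and runtime overhead.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB (no KV cache, no overhead)."""
    return n_params * bits_per_weight / 8 / 2**30


def ttft_seconds(prompt_tokens: int, prefill_tok_per_s: float) -> float:
    """Rough time to first token: the whole prompt must be prefilled first."""
    return prompt_tokens / prefill_tok_per_s


if __name__ == "__main__":
    # ASSUMPTION: placeholder parameter count, not a spec for any real model.
    n_params = 1e12
    for bits in (2, 4):
        print(f"Q{bits}: ~{weight_gib(n_params, bits):.0f} GiB of weights")

    # ASSUMPTION: hypothetical prompt length and CPU prefill rate.
    print(f"TTFT: ~{ttft_seconds(8000, 100.0):.0f} s")
```

The point is only that halving the bits per weight halves the memory needed, and that slow prefill hurts TTFT linearly with prompt length even when decode speed is tolerable.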