by wizee
245 days ago
153 GB/s is not bad at all for a base model; the Nvidia DGX Spark has only 273 GB/s memory bandwidth despite being billed as a desktop "AI supercomputer". Models like Qwen 3 30B-A3B and GPT-OSS 20B, both quite decent, should be able to run at 30+ tokens/sec at typical (4-bit) quantizations.
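
For anyone who wants to check the 30+ tokens/sec figure, here is a rough back-of-envelope sketch. It assumes decoding is purely memory-bandwidth bound and that each generated token reads every active weight exactly once; it ignores KV cache traffic, activations, and compute limits, so it gives a ceiling rather than a prediction. The active parameter counts (about 3.3B for Qwen 3 30B-A3B, about 3.6B for GPT-OSS 20B) are approximate figures I am plugging in, and the function name is just made up for illustration.

```python
def decode_tokens_per_sec_upper_bound(bandwidth_gb_s, active_params_billions, bits_per_weight):
    """Ceiling on decode speed if every active weight is read once per token.

    Assumption: decoding is memory-bandwidth bound. KV cache reads,
    activations, and compute are ignored, so real throughput will be lower.
    """
    # Billions of parameters * bytes per parameter = GB read per token.
    gb_read_per_token = active_params_billions * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    bandwidth_gb_s = 153  # base-model figure quoted above
    # Approximate active (per-token) parameter counts, in billions (assumed).
    models = {"Qwen 3 30B-A3B": 3.3, "GPT-OSS 20B": 3.6}
    for name, active_b in models.items():
        ceiling = decode_tokens_per_sec_upper_bound(bandwidth_gb_s, active_b, bits_per_weight=4)
        print(f"{name}: at most ~{ceiling:.0f} tokens/sec")
```

Under those assumptions the ceiling lands in the high double digits for both models, so 30+ tokens/sec still leaves a lot of room for real-world overhead.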
Neither product actually qualifies for the task IMO, and that doesn't change just because two companies advertised them as such instead of just one. The absolute highest-end Apple Silicon variants tend to be a bit more reasonable, but then the price advantage goes out the window too.