by spwa4 238 days ago
Am I reading this right? I was expecting much more performance. My 64G M1 Max (less than half the price of this machine) gets 40.72 tok/s on ollama/GPT-OSS-20B, and a colleague's M4 Max 128G (though 32G would be enough) gets about 67 tok/s on the same model; apparently the most recent software updates push that to 78 tok/s. The DGX Spark gets 82.74 tok/s.

For comparison, the Ryzen AI Max+ 395 gets you 55 tok/s [1].

[1] https://www.reddit.com/r/LocalLLaMA/comments/1nabcek/anyone_...
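
To put those numbers side by side, here is a quick sketch that computes how much faster the DGX Spark is than each of the other machines. The tok/s figures are the ones quoted above; the dictionary labels and the `speedup_vs` helper are just names I made up for illustration, and prices are left out because none are given here.

```python
# Decode throughput (tok/s) on ollama/GPT-OSS-20B, as quoted in this comment.
# The labels are illustrative; the numbers come from the text above.
TOK_PER_S = {
    "M1 Max 64G": 40.72,
    "M4 Max 128G": 67.0,
    "M4 Max 128G (recent updates)": 78.0,
    "Ryzen AI Max+ 395": 55.0,
    "DGX Spark": 82.74,
}


def speedup_vs(baseline: str, rates: dict[str, float]) -> dict[str, float]:
    """Return how many times faster `baseline` is than each other entry."""
    base = rates[baseline]
    return {name: base / rate for name, rate in rates.items() if name != baseline}


SPEEDUPS = speedup_vs("DGX Spark", TOK_PER_S)

# Print the smallest gap first.
for name, ratio in sorted(SPEEDUPS.items(), key=lambda kv: kv[1]):
    print(f"DGX Spark is {ratio:.2f}x the speed of {name}")
```

On these figures the Spark is roughly 2x the M1 Max but only about 6% ahead of the updated M4 Max.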