by brulard
85 days ago
Did I? Not only are you comparing apples to oranges, you're also giving misleading numbers. A 3090 gets 20-30 tokens a second on dense ~30B models (QwQ 32B, Gemma 3 27B Q4), similar to the M3 Ultra.
If you are talking about Qwen3-Coder 30B (MoE), then both the 3090 and the M3 Ultra are around 70 tok/s. But even if you were right about the speed, which you are not, speed is pointless if you need a large model that won't fit into your VRAM.