by Karupan
533 days ago
They are perfectly fine for certain people. I can run Qwen-2.5-coder 14B on my M2 Max MacBook Pro with 32 GB at ~16 tok/sec. At least in my circle, people are budget conscious and would prefer using existing devices rather than paying for subscriptions where possible. And we know why they won't ship NVLink anymore on prosumer GPUs: they control almost the entire segment, so why give more away for free? Good for the company and investors, bad for us consumers.
Qwen 2.5 32B on OpenRouter is $0.16 per million output tokens. At your 16 tokens per second, 1 million tokens is about 17 hours of continuous output.
OpenRouter will charge you 16 cents for that.
I think you may want to reevaluate which is the real budget choice here.
Edit: elaborating, that extra 16 GB of RAM on the Mac to hold the Qwen model costs $400. At $0.16 per million tokens, $400 buys 2,500 million tokens, which is roughly 1,800 days of continuous output at 16 tok/sec. All of this assumes electricity is free.
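For anyone who wants to plug in their own numbers, here is a rough sketch of the arithmetic in Python. The function names are made up for illustration, and it assumes the API price and local speed stay constant and that electricity costs nothing, same as above.

```python
SECONDS_PER_HOUR = 3600
HOURS_PER_DAY = 24


def hours_to_generate(tokens, tokens_per_second):
    """Hours of continuous local generation needed to emit `tokens`."""
    return tokens / tokens_per_second / SECONDS_PER_HOUR


def breakeven_days(hardware_cost_usd, api_usd_per_million_tokens, tokens_per_second):
    """Days of nonstop local output before API spend would match the hardware cost."""
    tokens_bought = hardware_cost_usd / api_usd_per_million_tokens * 1_000_000
    return hours_to_generate(tokens_bought, tokens_per_second) / HOURS_PER_DAY


# Figures from the comment: 16 tok/sec locally, $0.16 per million tokens, $400 of RAM.
hours_per_million = hours_to_generate(1_000_000, 16)
days = breakeven_days(400, 0.16, 16)

print(f"{hours_per_million:.1f} hours per million tokens")
print(f"{days:.0f} days of continuous output to spend $400")
```

Without rounding the 17 hours, the break-even comes out a little over 1,800 days.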