by omneity
421 days ago
I'm using Qwen3-30B-A3B locally and it's very impressive. It feels like the GPT-4 killer we have been waiting for for two years. I'm getting 70 tok/s on an M3 Max, which pushes it into the "very usable" quadrant. Even more impressive is the 0.6B model, which makes sub-1B models actually useful for non-trivial tasks. Overall I'm very impressed. I am evaluating how it can integrate with my current setup and will probably write up the results somewhere.
Which I find even more impressive, considering that the 3060 is the most-used GPU (on Steam) and that the M4 Air and future SoCs are or will be commonplace too.
(Q4_K_M, file size 18 GB)