by stared
3 hours ago
I really recommend Qwen3.6 27B. I ran some tests, and its 8-bit version runs at 30 tok/s with llama.cpp and MTP on an M5 Max MacBook. I have 128 GB, but 64 GB is plenty.
https://github.com/stared/benching-local-llms-on-apple-silic... On benchmarks, it lands roughly at the level of SotA models from mid-to-late 2025.
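
For a rough sense of why 64 GB is enough, here is a back-of-the-envelope sketch of the weight footprint. It assumes 27e9 parameters (read off the "27B" in the name) and a flat 8 bits per weight; the helper `weight_memory_gb` is made up for illustration and is not taken from the linked repo. Real quantized files carry some extra overhead, and the KV cache and the OS need memory on top of this.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: 27e9 parameters at exactly 8 bits each (no quantization overhead).
weights_gb = weight_memory_gb(27e9, 8)
print(f"~{weights_gb:.0f} GB of weights")
```

So the weights alone should take under half of a 64 GB machine, which leaves room for context and everything else running.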