by freehorse
5 hours ago
> You have MoE models (gemma 26b, qwen 35b, north mini code 30b) who are pretty fast, but make a lot of mistakes

This is sadly also my experience. I wish we had some MoE models with a higher ratio of active to total parameters. My experience is that the newer MoE models that can run on a 64GB laptop have too few active parameters to be useful outside narrower, specific tasks. Mixtral 8x7b was a MoE model with roughly 13b active parameters (about 47b total) from a few years ago, and it was probably the best model one could run in that range for some time, but it is too old now. I have been using the qwen 27b and it is great, but running a dense model like this on a MacBook is a bit suboptimal, and I wish I could run something faster than 15 tok/s.
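
To put rough numbers on the active/total ratio and on why a dense model feels slow, here is a back-of-the-envelope sketch. It assumes decoding is memory-bandwidth bound, so every generated token has to read each active weight once. The function names, the 300 GB/s bandwidth, the 4-bit (0.5 bytes per parameter) quantization, and the 3b-active MoE are all made-up illustration values, not measurements. The Mixtral figures are the approximate published ones.

```python
def active_ratio(active_b: float, total_b: float) -> float:
    """Fraction of a model's parameters that are used for each token."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("need 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


def decode_tok_per_s_ceiling(active_b: float,
                             bytes_per_param: float,
                             bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound model.

    active_b is in billions of parameters, so active_b * bytes_per_param
    is the number of gigabytes read per generated token.
    """
    gb_per_token = active_b * bytes_per_param
    return bandwidth_gb_s / gb_per_token


if __name__ == "__main__":
    # Mixtral 8x7b, approximate published figures (billions of parameters).
    print(f"Mixtral active ratio: {active_ratio(12.9, 46.7):.2f}")

    # Hypothetical machine: 300 GB/s memory bandwidth, 4-bit weights.
    bandwidth, bytes_per_param = 300.0, 0.5
    dense = decode_tok_per_s_ceiling(27.0, bytes_per_param, bandwidth)
    sparse = decode_tok_per_s_ceiling(3.0, bytes_per_param, bandwidth)
    print(f"dense 27b ceiling: {dense:.1f} tok/s")
    print(f"3b-active MoE ceiling: {sparse:.1f} tok/s")
```

Real throughput lands below these ceilings (KV cache reads, compute, scheduling overhead), but the gap between the two shows the tradeoff: fewer active parameters buys speed, and more active parameters per token is what I'd expect to help with the mistakes.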

I'm on a 48GB M5 Pro right now and it's been okay. A lot of my rough experiences have been with MLX, and I'm finding that GGUFs are okay now.