Yes. Make sure you’re not using the Gemma sparse models since they don’t have a small model to use. Also I removed all the image models from the workspace.
All 4 gemma-4-*-it models, regardless whether they are dense models or MoE models, have associated small models for MTP, whose names are obtained by adding the "-assistant" suffix.
I've gotten it to work with other models. They've got to be perfectly aligned usually, in terms of provider, quantization etc. Might be a bit before you can get a matched set.
[1] https://github.com/ml-explore/mlx-lm/pull/990
[2] https://github.com/ggml-org/llama.cpp/pull/22673