I'm getting ~30 tok/s on the A3B model with my 3070 Ti and 32k context.
> Do you feel you could replace the frontier models with it for everyday coding? Would/will you?
Probably not yet, but the A3B is really good at composing shell commands, short scripts, and one-liners. Its web development skills are also markedly better than Qwen's prior models in this parameter range.
I'm trying to use local models whenever possible, but I still need to lean on the frontier models sometimes.