|
i’m running an M4 Pro 48GB right now: omlx + gemma 12b 6-bit + pi. it’s feasible for sure.

MoEs for speed (qwen 35b, cohere 30b, gemma 26b).
Dense for more methodical work (qwen 27b [reigning champ], gemma 31b, gemma 12b).

For MoE i recommend 5-bit or higher; for dense i think 4-bit is okay.

Play with your context size, you don’t really need that much. Have lazy loading for tools and MCPs.

My pi extensions, for anyone looking for a skinny quick setup (i’m using `--no-skills` right now too):
"npm:pi-codex-goal",
"npm:pi-simplify",
"npm:pi-mcp-adapter",
"git:github.com/elpapi42/pi-minimal-subagent",
"npm:@wierdbytes/pi-statusline",
"npm:@aliou/pi-guardrails",
"npm:pi-lens",
"npm:@juicesharp/rpiv-todo",
"npm:pi-hashline-readmap",
"npm:@mrclrchtr/supi-review",
"npm:pi-cmux",
"npm:@mrclrchtr/supi-context",
"npm:pi-tool-search"
think of local models as "zero sugar" models and that's where we're at right now. I think it's crazy how good these models are compared to last year's frontier models |
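
Rough back-of-envelope for why those bit widths are comfortable on 48GB. This is just a sketch: `weight_gb` is a made-up helper, it assumes weights take exactly params × bits / 8 bytes, and it ignores quantization metadata (scales, group sizes), KV cache, and whatever the OS keeps for itself, so real numbers will be somewhat higher.

```python
def weight_gb(params_billion: float, bits: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at exactly `bits` bits,
    with no overhead for quantization scales or embeddings kept
    at higher precision.
    """
    return params_billion * bits / 8


# Model sizes and bit widths taken from the post above.
models = [
    ("gemma 12b dense", 12, 6),
    ("qwen 27b dense", 27, 4),
    ("gemma 31b dense", 31, 4),
    ("qwen 35b MoE", 35, 5),
]

for name, params, bits in models:
    print(f"{name} @ {bits}-bit: ~{weight_gb(params, bits):.1f} GB of weights")
```

Whatever is left over after the weights is what you have for context, which is one more reason to keep the context size modest.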