Hacker News
by pj_mukh 62 days ago
Have you found a model that does this at usable speeds on an M2/M3?
1 comment

On an M4 MBP, ollama's qwen3.5:35b-a3b-coding-nvfp4 runs incredibly fast in the claude/codex harness. M2/M3 should be similar.

It's incomparably faster than any other model (i.e. it's actually usable without cope). Caching makes a huge difference.
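
If you want a number for "usable" on your own machine, here's a rough sketch for measuring generation speed. It assumes a local ollama server on its default port and uses the `eval_count` / `eval_duration` fields that ollama's `/api/generate` response reports; the helper names are mine, not part of ollama or any harness.

```python
import json
import urllib.request

# Assumption: ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(resp: dict) -> float:
    """Generation speed from ollama's response metadata.

    eval_count is the number of generated tokens;
    eval_duration is the generation time in nanoseconds.
    """
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)


def generate(model: str, prompt: str) -> dict:
    """Send one non-streaming request and return the parsed JSON response."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as r:
        return json.load(r)
```

With the model pulled, `tokens_per_second(generate("qwen3.5:35b-a3b-coding-nvfp4", "Reverse a string in Python."))` gives tokens per second for a single request. Run the same prompt twice if you want to see how much of the difference comes from caching.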