That is not what I'm seeing. I've been coding intensively with Claude Code for the last 3 months: 200k lines of Go, 1200+ commits, mostly using Opus. I don't think I could have done this with a local LLM. Maybe on an M5 Pro?
Qwen 3.5 122b is competitive with Opus 4.6, and runs at 35t/s on a Strix Halo. It is my daily driver.
Unlike Opus, I can run abliterated models with the censorship removed, so I can use them for security research, reverse engineering, and whatever else I want, privately and offline.
It makes any hosted model feel like a kid's toy.