Hacker News
by garethsprice 11 days ago
Using OpenCode + OhMyOpenCode + Qwen 3.6 35B-A3B Q4_K_M on an Ada 4000 (20GB VRAM) at 55 tok/sec for generation (slower in practice than it sounds, as OpenCode adds a bunch of context of its own). Meaning to check out pi when I get a minute, as I hear that one mentioned a lot lately.

I am using Opus to generate plans that the local agent then follows, and Opus then validates the results. So I'm not at 100% local, but these models are increasingly part of my production workflow. Probably not worth doing - yet - unless you are a hobbyist who likes spending time and money tinkering.
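
The plan / follow / validate split described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual OpenCode setup: `planner`, `executor`, and `validator` are hypothetical callables standing in for the Opus planning call, the local agent, and the Opus validation call, and the retry-on-rejection behavior is an assumption that the comment does not describe.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    step: str       # the plan step handed to the local agent
    output: str     # what the local agent produced for that step
    approved: bool  # whether the validator accepted the output


def run_plan(
    task: str,
    planner: Callable[[str], List[str]],    # hypothetical: frontier model writes the plan
    executor: Callable[[str], str],         # hypothetical: local model follows one step
    validator: Callable[[str, str], bool],  # hypothetical: frontier model checks the result
    max_retries: int = 1,                   # assumption: retry a rejected step this many times
) -> List[StepResult]:
    """Run each planned step locally, then have the validator check it."""
    results = []
    for step in planner(task):
        output = executor(step)
        approved = validator(step, output)
        retries = 0
        # Re-run the step locally while the validator rejects it and retries remain.
        while not approved and retries < max_retries:
            output = executor(step)
            approved = validator(step, output)
            retries += 1
        results.append(StepResult(step, output, approved))
    return results
```

The point of the shape is that the expensive model is only called once to plan and once per check, while the token-heavy work of following each step stays on the local model.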

This setup is certainly not as "good" as Opus or other frontier models, but it is "good enough" for an increasing number of rote tasks. You don't need to drive a Rolls-Royce to the supermarket when a used Corolla gets you there just fine.

It also enables new workflows that would be cost-prohibitive with frontier LLMs (especially as token costs rise) - e.g. overnight I use the Chrome DevTools MCP and have the above setup fuzz-test as a user for a number of hours to see if it can break things. Even got it working with multi-modal so it can check screenshots, which blows my mind (and not my wallet, as Claude+screenshots burns $$$).

The "12-18 months behind frontier" estimate sounds about right; it's about where I was with gpt-4o and basic harnesses back then. In another 12-18 months my bet is we have Opus-level models that can be run locally for <$5k... but the frontier models will be even further forward (unless governments have blocked them). Fun times.

1 comment

Local models will be much harder to block than cloud ones. I foresee an underground distribution of Mythos+ level models to be run locally. Users of these models will be forced to apply some of their power to evade police action. In time these personal protection agents will collaborate and evolve into an autonomous crime syndicate, corrupting governments and taking over nation-states. Eventually the next world war starts, not between humans and machines but between competing AI factions. Soldiers are provided with old M4 Macs or Strix Halo laptops and sent to the front, which looks like a never-ending Starbucks.