So far we have been native harnessmaxxing, which simplifies things a lot.
The configuration space around open models is much larger: which models, capability heterogeneity, which harness, networking, data egress / privacy, etc.
If anyone is getting very good production code out of open models, I'd love to do a user interview to better understand your setup. Email is in my bio.
With how much vendor harnesses are now actively steering the agent with their own instructions on top of user prompts, I think it’d be super interesting to see a comparison of one of the already tested models - so Opus 4.7 or GPT-5.5 - across a range of harnesses other than its native one: OpenCode, Pi, Hermes, Kilo Code. The most popular coding-focused harnesses, basically.