If this is actually competitive with Gemini 2.5 Pro, that would be insane, especially for an Apache 2.0, truly open-weights model. Let's hope it's not too tuned to shine on benchmarks!
Qwen3 models are solid, and at such a low cost it doesn’t hurt to pair one with something like Sonnet 4 as a check. I mean, it does eliminate a lot of Claude’s “You’re absolutely right!” loops.