To a point. If gpt5 takes 3 minutes to output and qwen3 does it in 10 seconds and the agent can iterate 5 times to finish before gpt5, why do I care if gpt5 one shot it and qwen took 5 iterations
Often all it takes is to reset to a checkpoint or undo and adjust the prompt a bit with additional context and even dumber models can get things right.
I've used grok code fast plenty this week alongside gpt 5 when I need to pull out the big guns and it's refreshing using a fast model for smaller changes or for tasks that are tedious but repetitive during things like refactoring.
Yes fast/dumb models are useful! But that's not what OP said - they said they can be as useful as the large models by iterating them.
Do you use them successfully in cases where you just had to re-run them 5 times to get a good answer, and was that a better experience than going straight to GPT 5?
Not everyone is solving complicated things every time they hit cmd-k in Cursor or use autocomplete, and they can easily switch to a different model when working harder stuff out via longer form chat.