You are fighting nondeterministic behavior with more nondeterministic behavior, or in other words, fighting probability with probability. That doesn't necessarily make things any better.
In my experience, an agent with "fresh eyes", i.e., without the context of being told what to write and writing it, does have a different perspective and is able to be more critical. Chatbots tend to take the entire previous conversational history as a sort of canonical truth, so removing it seems to get rid of any bias the agent has towards the decisions that were made while writing the code.
I know I'm psychologizing the agent. I can't explain it in a different way.
I think of it as they are additive biased. ie "dont think about the pink elephant ". Not only does this not help llms avoid pink elphants instead it guarantees that pink elephant information is now being considered in its inference when it was not before.
I fear thinking about problem solving in this manner to make llms work is damaging to critical thinking skills.
Assigning different agents to have different focuses has worked for me. Especially when you task a code reviewer agent with the goal of critically examining the code. The results will normally be much better than asking the coder agent who will assure you it's "fully tested and production ready"
I know I'm psychologizing the agent. I can't explain it in a different way.