| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by ojo-rojo 148 days ago
	How about a subsequent review where a separate agent analyzes the original issue and resultant code and approves it if the code meets the intent of the issue. The principle being to keep an eye out for manual work that you can describe well enough to offload. Depending on your success rate with agents, you can have one that validates multiple criteria or separate agents for different review criteria.

2 comments

g947o 148 days ago

You are fighting nondeterministic behavior with more nondeterministic behavior, or in other words, fighting probability with probability. That doesn't necessarily make things any better.

link

pyridines 148 days ago

In my experience, an agent with "fresh eyes", i.e., without the context of being told what to write and writing it, does have a different perspective and is able to be more critical. Chatbots tend to take the entire previous conversational history as a sort of canonical truth, so removing it seems to get rid of any bias the agent has towards the decisions that were made while writing the code.

I know I'm psychologizing the agent. I can't explain it in a different way.

link

citizenpaul 148 days ago

I think of it as they are additive biased. ie "dont think about the pink elephant ". Not only does this not help llms avoid pink elphants instead it guarantees that pink elephant information is now being considered in its inference when it was not before.

I fear thinking about problem solving in this manner to make llms work is damaging to critical thinking skills.

link

Foobar8568 148 days ago

Fresh eyes, some contexts and another LLM.

The problem is information fatigue from all the agents+code itself.

link

hex4def6 148 days ago

Aren't human coders also nondeterministic?

Assigning different agents to have different focuses has worked for me. Especially when you task a code reviewer agent with the goal of critically examining the code. The results will normally be much better than asking the coder agent who will assure you it's "fully tested and production ready"

link

samrus 147 days ago

Human coders are far more reliable. The only downside is speed, and therefore cost

link

tbossanova 148 days ago

Probably true

(Sorry.)

link

samrus 147 days ago

Slop on slop. Who watches rhe watchman?

link