|
|
|
|
|
by visarga
110 days ago
|
|
It's our job after all to keep the agent aligned, we should not expect it to self recover when it goes astray or mind its own alignment. Even with humans we hire managers to align the activity of subordinates, keeping intent and work in sync. That said, I find that running judge agents on plans before working and on completed work helps a lot, the judge should start with fresh context to avoid biasing. And here is where having good docs comes in handy, because the judge must know intent not just study the code itself. If your docs encode both work and intent, and you judge work by it, then misalignment is much reduced. My ideal setup has - a planning agent, followed by judge agent, then worker, then code review - and me nudging and directing the whole process on top. Multiple perspectives intersect, each agent has its own context, and I have my own, that helps cover each other's blind spots. |
|
We do this socially too. From a very young age, children teach each other what they like and don't like, and in that way mutually align their behaviour toward pro social play.
> I find that running judge agents on plans before working and on completed work helps a lot
How do you set this up? Do you do this on top of the claude code CLI somehow, or do you have your own custom agent environment with these sort of interactions set up?