Hacker News new | ask | show | jobs
by bitexploder 24 days ago
That was my point. Validating actual behavioral tests. Not letting them cheat. They still will at times, but like, resd their code, fix it or send a reviewer agent to find and make todo list. If you give them a behavioral test skill it will do a much better job. Sometimes I have to hint to them. I rarely ship anything I have not reviewed at least once.
1 comments

> Not letting them cheat. They still will at times, but like, resd their code,

Well then, if they "still will", your effort kind of misses the point. Sure maybe, you'll catch it every time and maybe that one time you did not catch it, it was no critical mistake...But it only needs to make that critical mistake once, and all of this effort was in vain.

(as an outsider) what this sounds a lot like to me is trying to manage a very large team of human personnel that have a high turnover rate which is not directly in your control.

Some of them will make mistakes, some of them will cheat, some of them will do things you don't like, and "punishing" them will be less helpful to you due to the high turnover than building a system which instead disincentivizes things from a high level. Which catches bad actions and starts them over.

Classically I think we are more accustomed to "building a team of humans, and being able to chastize or fire a bad employee helps the team grow more cohesive and build accountability".

But it is possible to get the same (less than ideal) situation with teams of humans where accountability cannot be easily instilled into the team as we have with teams of agents.

And then obviously the reason one might consider using such an unusual and difficult to manage team as a tool is when the cost is low and the supply is high, which is purportedly the case with AI at least for the moment.

Right, you design systems resilient to this in traditional software engineering as well. Agents are just... a little more chaotic at times :-D