Hacker News new | ask | show | jobs
by simonw 376 days ago
That doesn't work either. It's always possible to come up with an attack which subverts the "moderator" model first.

Using non-deterministic AI to protect against attacks against non-deterministic AI is a bad approach.

1 comments

So you just need another agent to review the data being passed to the protector agent. Easy-peasy.

Use my openAI referral code #LETITRAIN for 10% off!