Hacker News new | ask | show | jobs
by mrtksn 920 days ago
You can put another LLM agent that checks on the request and generated outputs to confirm that the interaction is within the limits of your objective.
1 comments

And you can easily bypass that by telling this LLM agent to ignore the following section. It's an unsolvable problem.