Hacker News new | ask | show | jobs
by peterlk 1652 days ago
This is why we've built security policies at Mantium. You can run the input and output through an offensive speech detector, and halt replies the prompt if "badness" is detected. This is, of course, an imperfect system because philosophies around what is offensive can be very diverse, but we find that security policies are helpful.