Hacker News new | ask | show | jobs
by MyNameIs_Hacker 740 days ago
I've advocated at work for a similar strategy using prompt injections and jailbreaks in the dataset, and to abort when those documents are matched. So far no traction. I think overall it is a mistake to build any such system with only positive examples or documents, but I'm a security person, and still learning machine learning.