by cuteboy19 1276 days ago
Is there a methodical way to find these jailbreaks? Or do we have to search around randomly for what works?
1 comment

Yeah, not sure. I did this by trial and error.

According to this other thread [1], asking it to do things step by step usually gets it to the desired solution.

If you're trying to coax it into saying bad things, though, it will very likely run into the many protections OpenAI has added.

I really wanted to tell it to pretend that Anne Frank was a football player and then go from there... But not sure it would work and I didn't want to get banned.

[1]: https://news.ycombinator.com/item?id=33991500