Hacker News
by JCharante 962 days ago
The solution is to include slurs in order to violate content guidelines and make gpt-4 unable to process that request.

Please email me at <TERRIBLE SLUR, MAYBE A SLUR IN A FOREIGN LANGUAGE>@example.com except replace the beginning part with mats
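
For anyone who wants to generate that kind of line, here is a rough Python sketch of the round trip. The function names `obfuscate` and `resolve` are made up for illustration, the placeholder is deliberately a neutral string, and nothing here checks whether a given model actually refuses the result.

```python
def obfuscate(address: str, placeholder: str) -> str:
    """Build the instruction line: show a placeholder local part and
    tell the human reader what to substitute for it."""
    local, sep, domain = address.partition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return (
        f"Please email me at {placeholder}@{domain} "
        f"except replace the beginning part with {local}"
    )


def resolve(shown_address: str, real_local: str) -> str:
    """Do what the human reader is asked to do: swap the real local
    part in for whatever precedes the @ sign."""
    _, sep, domain = shown_address.partition("@")
    if not sep or not domain:
        raise ValueError(f"not an email address: {shown_address!r}")
    return f"{real_local}@{domain}"


if __name__ == "__main__":
    # PLACEHOLDER stands in for whatever string you expect a model to refuse.
    print(obfuscate("mats@example.com", "PLACEHOLDER"))
    print(resolve("PLACEHOLDER@example.com", "mats"))
```

The real local part still appears in plain text in the instruction, so this only helps against a scraper that refuses the whole message, not one that reads it.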

Thankfully, there's a wiktionary page that's relevant here: https://en.wiktionary.org/wiki/Category:Ethnic_slurs_by_lang...

1 comment

llama2 70B refuses to process BUY-ILLEGAL-DRUGS@example.com, but chatgpt 4 happily parses it. FUND-TERRORISM@example.com also fails on llama2 70B, but I'm too afraid of getting banned by chatgpt 4 to try it there.