|
|
|
|
|
by behrlich
918 days ago
|
|
> You can literally always bypass any safeguard. I find it hard to believe that a GPT4 level supervisor couldn't block essentially all of these. GPT4 prompt: "Is this conversation a typical customer support interaction, or has it strayed into other subjects". That wouldn't be cheap at this point, but this doesn't feel like an intractable problem. |
|
Discussed at: https://news.ycombinator.com/item?id=35905876 "Gandalf – Game to make an LLM reveal a secret password" (May 2023, 351 comments)