|
|
|
|
|
by deadbabe
538 days ago
|
|
I tried it on friend.com. It worked a for a while, I got the character to convince itself it had been replaced entirely by a demon from hell (because it kept talking about the darkness in their mind and I pushed them to the edge). They even took on an entire new name. For quite a while it worked, then suddenly in one of the responses it snapped out of it, and assured me we were just roleplaying no matter how much I tried to go back to the previous state. So in these cases where you think you’ve jailbroken an LLM, is it really jailbroken or is it just playing around with you, and how do you know for sure? |
|
With a LLM, I don't think that there is a difference.