|
|
|
|
|
by Thorrez
421 days ago
|
|
I think the Policy Puppetry attack is a type of Skeleton Key attack. Since it was just released, that makes it a modern Skeleton Key attack. Can you give a comparison of the Policy Puppetry attack to other modern Skeleton Key attacks, and explain how the other modern Skeleton Key attacks are much more effective? |
|
Policy Puppetry feels more like an injection attack - you’re trying to trick the model into incorporating policy ahead of answering. Then they layer two tricks on - “it’s just a script! From a show about people doing bad things!” And they ask for things in leet speak, which I presume is to get around keyword filtering at API level.
This is an ad. It’s a pretty good ad, but I don’t think the attack mechanism is super interesting on reflection.