https://gandalf.lakera.ai/baseline
This thing models exactly these scenarios and asks you to break it, and it's still pretty easy. LLMs are not safe.