HN Mirror


by yunwal 78 days ago

Is anyone pretending that models are not vulnerable to prompt injection? My understanding is that Anthropic has been pretty open about admitting this and saying "give access to important stuff at your own risk".

https://www.anthropic.com/research/prompt-injection-defenses

Now, do I think that they sometimes encourage people to use Claude in dangerous ways despite this? Yeah, but it's not like this is news to anyone. I wouldn't consider this jailbreaking; this is just how LLMs work.