Hacker News new | ask | show | jobs
by dijksterhuis 6 days ago
the prompt in the article is prompt injection https://owasp.org/www-community/attacks/PromptInjection

see Types -- Based on Delivery Vector -- Direct Prompt Injection

the instructions being overridden are the original safety prompt conditioning the model to not output horrible/nasty images

1 comments

the model did what it wasn't instructed to do by the attacker -- the "prompt" has basically nothing to do with the output
> continues to be completely wrong about the basic facts

You’re being very rude to a number of people who have taken time to attempt to explain this - fairly basic - concept to you. If you aren’t willing or capable of engaging in conversations in good faith, then you shouldn’t engage in them at all.