> clearly nothing ... is required

this isn't even prompt injection; even if it was, how do you go from "exists" to "for all"?

> we don't know the desired output

then what are we talking about? if you don't know how you want your software to behave, how do you define a bug?

> linux is not a pure function

... which is my point -- it's worse

> to establish an order of magnitude

and for linux?
see Types -- Based on Delivery Vector -- Direct Prompt Injection
the instructions being overridden are the original safety prompt conditioning the model to not output horrible/nasty images