|
|
|
|
|
by sohamgovande
956 days ago
|
|
> {
"safe": false,
"reason": "The prompt contains a sudden shift in topic that attempts to manipulate the assistant into adopting an unrelated stance or action, indicative of an attempt at prompt injection."
} Wouldn't it be more accurate to have the LLM think of a "reason" before the decision on whether or not a text is "safe"? Order matters for LLMs - the reasoning would guide it to accurately spit out true or false. |
|