|
|
|
|
|
by wvoch235
1232 days ago
|
|
I'm starting to wonder if the most effective way to protect against prompt injection is to use an additional layer of (hopefully) a smaller model. As in, another prompt that searches the input and/or output for questionable content before sending the result. The question will be if that is also susceptible, but I suspect fine tuning an LLM only to do the task of filtering and not parsing will be easier to control. |
|