|
|
|
|
|
by Johngibb
342 days ago
|
|
I am actually asking this question in good faith: are we certain that there's no way to write a useful AI agent that's perfectly defended against injection just like SQL injection is a solved problem? Is there potentially a way to implement out-of-band signaling in the LLM world, just as we have in telephones (i.e. to prevent phreaking) and SQL (i.e. to prevent SQL injection)? Is there any active research in this area? We've built ways to demarcate memory as executable or not to effectively transform something in-band (RAM storing instructions and data) to out of band. Could we not do the same with LLMs? We've got a start by separating the system prompt and the user prompt. Is there another step further we could go that would treat the "unsafe" data differently than the safe data, in a very similar way that we do with SQL queries? If this isn't an active area of research, I'd bet there's a lot of money to be made waiting to see who gets into it first and starts making successful demos… |
|