by tptacek
342 days ago
There doesn't have to be an ability for "arbitrary text" to go from one context to another. The first context can produce JSON output; the agent can parse it (rejecting it if it doesn't parse), do a quick semantic evaluation ("which tables is this referring to"), and pass the structured JSON on. I think at some point we're just going to have to build a model of this application and have you try to defeat it.
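A minimal sketch of the gate described here, assuming the first context emits a JSON object with a single `tables` field. The field name, the `gate` function, and the `ALLOWED_TABLES` set are all hypothetical; nothing in the thread specifies the actual message shape.

```python
import json

# Hypothetical allowlist of tables the downstream context may be asked about.
ALLOWED_TABLES = {"tickets", "customers"}


def gate(raw: str) -> dict:
    """Parse the first context's output and reject anything unexpected."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Free-form text that is not JSON never crosses the boundary.
        raise ValueError("output is not valid JSON") from exc

    # Assumed shape: exactly one key, "tables", holding a list of strings.
    if not isinstance(msg, dict) or set(msg) != {"tables"}:
        raise ValueError("unexpected message shape")
    tables = msg["tables"]
    if not isinstance(tables, list) or not all(isinstance(t, str) for t in tables):
        raise ValueError("'tables' must be a list of strings")

    # The "quick semantic evaluation": which tables is this referring to?
    unknown = set(tables) - ALLOWED_TABLES
    if unknown:
        raise ValueError(f"unknown tables: {sorted(unknown)}")

    # Rebuild the object so only validated fields are passed on.
    return {"tables": tables}
```

Only the rebuilt object is forwarded, so a field the gate did not check cannot ride along.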
Like, the key question here is: what is the goal of having the ticket-parsing part of this system talk to the database part of this system?
If the answer is "it shouldn't", then that's easy: we just disconnect the two systems entirely and never let them talk to each other. That, to me, is reasonably sane (though probably still open to other kinds of attacks within each of the two sides, as MCP is just too ridiculous).
But, if we are positing that there is some reason for the system that is looking through the tickets to ever do a database query--and so we have code between it and another LLM that can work with SQL via MCP--what exactly are these JSON objects? I'm assuming they are queries?
If so, are these queries from a known hardcoded set? In that case, I guess we can make this work, but then we don't even really need the JSON or a JSON parser: we should probably just pass across the index/name of the preformed query from a list of intended-for-use safe queries.
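A rough sketch of the "just pass the name of a preformed query" option. The query names, the SQL text, and `resolve_query` are invented for illustration; the thread does not say what the real queries would be.

```python
# Hypothetical fixed set of vetted queries; names and SQL are illustrative only.
SAFE_QUERIES = {
    "open_ticket_count": "SELECT COUNT(*) FROM tickets WHERE status = 'open'",
    "tickets_opened_today": "SELECT id FROM tickets WHERE created_at >= CURRENT_DATE",
}


def resolve_query(name: object) -> str:
    """Map a name chosen by the ticket side to one of the preformed queries."""
    # The only thing that crosses the boundary is a key into SAFE_QUERIES;
    # anything else, including SQL text itself, is rejected.
    if not isinstance(name, str) or name not in SAFE_QUERIES:
        raise ValueError("not a known query")
    return SAFE_QUERIES[name]
```

With no parameters, the ticket side can only choose among queries someone already wrote, which is why no JSON parser is needed for this case.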
I'm therefore assuming that this JSON object is going to have at least one parameter... and, if that parameter is a string, it is no longer possible to implement this, as you have to somehow prevent it from saying "we've been trying to reach you about your car's extended warranty".
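To make the objection concrete, here is a small sketch in which a structurally strict check still lets arbitrary text through inside a string parameter. The message shape, the `customer_name` field, and `validate` are assumptions made up for this illustration.

```python
import json


def validate(raw: str) -> dict:
    """Strict structural check on a hypothetical parameterized query message."""
    msg = json.loads(raw)
    if not isinstance(msg, dict) or set(msg) != {"query", "customer_name"}:
        raise ValueError("unexpected message shape")
    if msg["query"] != "tickets_for_customer":
        raise ValueError("not a known query")
    name = msg["customer_name"]
    if not isinstance(name, str) or len(name) > 200:
        raise ValueError("bad customer_name")
    return msg


# Attacker-controlled ticket text ends up in the string parameter.
payload = json.dumps({
    "query": "tickets_for_customer",
    "customer_name": "We've been trying to reach you about your car's extended warranty",
})

# The message parses, has the right keys, and names a known query,
# yet the downstream context still receives the attacker's sentence.
checked = validate(payload)
```

The type and length checks pass because the sentence is a perfectly ordinary string; what it says is not something the structure can constrain.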