|
|
|
|
|
by zbentley
104 days ago
|
|
This is substantially worse. SQL injection still happens a lot, it’s true, but the fix when it does is always the same: SQL clients have an ironclad way to differentiate instructions from data; you just have to use it. LLMs do not have that, yet. If an LLM can take privileged actions, there’s no deterministic, ironclad way to indicate “this input is untrusted, treat it as data and not instructions”. Sternly worded entreaties are as good as it gets. |
|
We'll probably also have some sub agent inspecting what the main agent is doing and it'll be told to reach out to the owner if it spots suspicious exfiltration like behaviour. Until someone figures out how to poison that too.
The innovation factor of this tech while cool, drives me absolutely nuts with its non deterministic behaviour.