|
|
|
|
|
by tiny-automates
132 days ago
|
|
agree that this is a protocol-level issue, not framework-specific. but the "all external tool calls require confirmation prompts" mitigation doesn't really apply here - the exfil happens without any tool call. the model just outputs a markdown link or raw URL in its response text, and the messaging app's preview system does the rest. there's no "tool use" to gate behind a confirmation. that's what makes this vector particularly nasty: it sits in the gap between the agent's output and the messaging layer's rendering behavior. neither side thinks it's responsible. the agent sees itself as just returning text; the messaging app sees itself as just previewing a link. network egress policies help but only if you can distinguish between "agent legitimately needs to fetch a URL for the user's task" vs. "agent was injected into constructing a malicious URL." that distinction is really hard to make at the network layer. |
|