| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by tiny-automates 132 days ago

agree that this is a protocol-level issue, not framework-specific. but the "all external tool calls require confirmation prompts" mitigation doesn't really apply here - the exfil happens without any tool call.

the model just outputs a markdown link or raw URL in its response text, and the messaging app's preview system does the rest. there's no "tool use" to gate behind a confirmation. that's what makes this vector particularly nasty: it sits in the gap between the agent's output and the messaging layer's rendering behavior.

neither side thinks it's responsible. the agent sees itself as just returning text; the messaging app sees itself as just previewing a link. network egress policies help but only if you can distinguish between "agent legitimately needs to fetch a URL for the user's task" vs. "agent was injected into constructing a malicious URL."

that distinction is really hard to make at the network layer.