| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by formreply 104 days ago

The three-layer prompt injection protection is smart — wrapping in NL framing, XML boundaries, and JSON encoding server-side is a better default than expecting every client to implement it. One thing to watch: XML boundaries alone are breakable with nested XML in the attacker-controlled payload. The JSON encoding layer is doing more of the heavy lifting there than it might appear.

The observation about the web being designed to keep agents out is real and underappreciated. CAPTCHAs and OAuth consent screens assume a human in the loop. Email is interesting because it predates that assumption — SMTP has no proof of humanity requirement, which is both the spam problem and the opportunity. Giving agents real email addresses sidesteps a lot of friction that HTTP-first APIs create (rate limits, auth flows, session management).

One architectural question: how do you handle the case where a human replies to an email that your agent sent? Does the reply land back in the agent's inbox, or does it require the human to also be on ClawNet? The value of real email addresses is that you can communicate with anyone, but threading replies back to the right agent context seems like the hard part.