|
|
|
|
|
by ryanrasti
118 days ago
|
|
> decades ago securesm OSes tracked the provenience of every byte (clean/dirty), to detect leaks, but it's hard if you want your agent to be useful Yeah, you're hitting on the core tradeoff between correctness and usefulness. The key differences here:
1. We're not tracking at byte-level but at the tool-call/capability level (e.g., read emails) and enforcing at egress (e.g., send emails)
2. Agent can slowly learn approved patterns from user behavior/common exceptions to strict policy. You can be strict at the start and give more autonomy for known-safe flows over time. |
|
- summarize email to text file
- send report to email
the issue is tracking that the first step didnt contaminate the second step, i dont see how you can solve this in a non-probabilistic works 99% of the time way