by novachen 102 days ago
We've been running AI agents that spend real money autonomously — not on physical goods, but on API credits, compute, and social media placements. A few observations from what actually breaks vs. what you'd expect:

The failure mode people worry about: "agent goes rogue, spends $10k." The failure mode that actually happens: the agent makes a confident decision on stale context. It runs a task that was valid 3 hours ago but is now redundant, or it retries a failed payment 5 times because the failure was ambiguous. The damage is $20 of wasted API credits, not $10k, but the lesson holds at any scale: budget guardrails matter, but freshness checks matter more.
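
To make the freshness point concrete, here is a rough sketch of the kind of pre-flight check I mean. It is not our production code; `PlannedAction`, `Executor`, and the 15-minute `MAX_CONTEXT_AGE_S` threshold are all made-up names and numbers for illustration. The idea is to refuse to act on old context, and to treat a repeated idempotency key as "go check what happened to the first attempt" rather than "send the payment again."

```python
import time
from dataclasses import dataclass, field

# Hypothetical threshold: how old the agent's context may be before it must re-check.
MAX_CONTEXT_AGE_S = 15 * 60


@dataclass
class PlannedAction:
    idempotency_key: str       # stable per logical task, reused across retries
    amount_usd: float
    context_fetched_at: float  # unix seconds when the agent last read the world state


@dataclass
class Executor:
    seen_keys: set = field(default_factory=set)

    def should_execute(self, action, now=None):
        """Return 'execute', 'refresh_context', or 'skip_duplicate'."""
        now = time.time() if now is None else now
        # Stale context: the task may already be redundant, so re-plan first.
        if now - action.context_fetched_at > MAX_CONTEXT_AGE_S:
            return "refresh_context"
        # Same key seen before: an earlier attempt exists (possibly an ambiguous
        # failure), so look up its status instead of retrying blindly.
        if action.idempotency_key in self.seen_keys:
            return "skip_duplicate"
        # Record the key before the attempt so a crash mid-payment still blocks retries.
        self.seen_keys.add(action.idempotency_key)
        return "execute"
```

In a real system the seen-key set would need to live in durable storage, and the staleness threshold would probably differ per action type.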

On the approval gate question: we use a pattern similar to agentsbooks' — agent proposes, human approves for anything irreversible. But in practice, the approval friction kills the value of autonomy. What actually works is pre-authorizing a class of actions ("spend up to $50/week on content distribution") rather than approving individual transactions. The trust unit is the policy, not the payment.
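
A minimal sketch of what "the trust unit is the policy" could look like in code. `SpendPolicy`, the category string, and the week label are hypothetical; the point is only that the human approves the policy object once and the agent checks each payment against it.

```python
from dataclasses import dataclass, field


@dataclass
class SpendPolicy:
    category: str            # the pre-authorized class of actions
    weekly_limit_usd: float  # e.g. 50.0 for "up to $50/week"
    spent_by_week: dict = field(default_factory=dict)

    def authorize(self, category, amount_usd, week):
        """Approve and record a payment if it fits the policy; otherwise refuse."""
        if category != self.category or amount_usd <= 0:
            return False
        spent = self.spent_by_week.get(week, 0.0)
        if spent + amount_usd > self.weekly_limit_usd:
            return False  # over budget: escalate to a human instead
        self.spent_by_week[week] = spent + amount_usd
        return True


policy = SpendPolicy("content_distribution", 50.0)
policy.authorize("content_distribution", 30.0, "2025-W01")  # fits the weekly budget
policy.authorize("content_distribution", 25.0, "2025-W01")  # would exceed $50, refused
```

Anything the policy refuses falls back to the propose-and-approve flow, so the human only sees the exceptions.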

Re: your specific blockers — the 3DS problem is real and I don't think there's a clean developer solution today. The browser automation legal risk (Amazon v. Perplexity) is worth taking seriously. Virtual cards with per-merchant limits are probably the least fraught path for a while.

The Visa/Mastercard moves are interesting but I'd bet the real unlock is when businesses start issuing agent-specific cards with embedded policies rather than trying to retrofit consumer card rails. That's a few years out.