|
|
|
|
|
by vincentvandeth
120 days ago
|
|
Hard-coded checks before every action, plus a governance layer that separates "what the agent wants to do" from "what it's allowed to do."
The deeper issue: if your agent decides whether to issue a refund, you're solving the wrong problem with prompt guards. A refund is a deterministic business rule — order exists, within return window, amount matches. That decision shouldn't be made by an LLM at all. In my setup, agents propose actions and write structured reports. A deterministic quality advisory then runs — no LLM involved — producing a verdict (approve, hold, redispatch) based on pre-registered rules and open items. The agent can hallucinate all it wants inside its context window, but the only way its work reaches production is through a receipt that links output to a specific git commit, with a quality gate in between. For anything with real consequences (database writes, API calls, refunds), the pattern is: LLM proposes → deterministic validator checks → human approves. The LLM never has direct write access to anything that matters. "Just hoping for the best" works until it doesn't. We tracked every agent decision in an append-only ledger — after a few hundred entries, you start seeing exactly where and how agents fail. That pattern data is more useful than any prompt guard. |
|
I feel like this is the real key. LLMs are good at some things and bad at others. Deterministic logic (e.g. don't ever do "x") is not one of them.