| I've been using Claude Code Max and Codex daily and kept hitting the same problem: AI quickly ships working code with real issues (logic errors, security gaps, subtle regressions). You catch them in review and fix them, but the agent session has already closed. Doesn't it make sense to have the AI fix its own mistakes
while it still knows why it made them? Saguaro is a background daemon that reviews AI-generated code and feeds findings
back to the same agent that wrote it. The agent evaluates the critique, since it knows why it made those decisions in the first place, and self-corrects what's actually wrong. The flow: you tell Claude Code to build something. Claude writes code. Saguaro's
stop hook triggers a background review (the user sees nothing). On the next turn, findings come back to Claude. Claude says "I see some issues with my approach, fixing now" and corrects itself. No human typed anything. No blocking. It uses your existing Claude Code / Codex / Gemini subscription. No API key
needed. No external account. Everything runs locally. The daemon self-spawns on demand and auto-shuts down after 30 minutes of inactivity. There's also a rules engine for teams that want more deterministic enforcement.
You write rules as markdown files with YAML frontmatter, scoped to specific
file globs. But the daemon works out of the box with zero rules. It reviews like a senior staff engineer: bugs, security, regressions, dead code. The rules engine adds more precision for teams or individuals who need it. Setup is "sag init", restart Claude Code, and go back to coding. That's it. Apache-2.0. TypeScript. |
Curious about one tradeoff though: by the time Saguaro catches a bug in the next hook cycle, Claude has already moved on and built more code on top of the broken foundation. Does it handle cascading fixes well? Say Claude wrote a broken function in file A, then imported and used it in files B and C before Saguaro flagged it: does the fix propagate cleanly, or does it sometimes cause a chain reaction of corrections?
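To make the cascading case concrete, here is a minimal sketch of how you could work out which files a fix to file A might ripple into. It assumes you already have an import graph as a plain object; the `ImportGraph` type and the `affectedBy` helper are hypothetical names for illustration, not anything Saguaro is known to expose.

```typescript
// Hypothetical import graph: each file maps to the files it imports.
type ImportGraph = Record<string, string[]>;

// Return every file that directly or transitively imports `changed`,
// i.e. the files a fix to `changed` could ripple into.
function affectedBy(graph: ImportGraph, changed: string): string[] {
  // Invert the graph: file -> files that import it.
  const importers = new Map<string, string[]>();
  for (const file of Object.keys(graph)) {
    for (const dep of graph[file]) {
      const list = importers.get(dep) ?? [];
      list.push(file);
      importers.set(dep, list);
    }
  }

  // Breadth-first walk outward from the changed file.
  const seen = new Set<string>([changed]);
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const importer of importers.get(current) ?? []) {
      if (!seen.has(importer)) {
        seen.add(importer);
        queue.push(importer);
      }
    }
  }

  // The changed file itself is not a dependent, so drop it from the result.
  return Array.from(seen).filter((f) => f !== changed).sort();
}

// Example: B and C import A, and D imports C.
const graph: ImportGraph = {
  "a.ts": [],
  "b.ts": ["a.ts"],
  "c.ts": ["a.ts"],
  "d.ts": ["c.ts"],
  "e.ts": [],
};

console.log(affectedBy(graph, "a.ts")); // [ 'b.ts', 'c.ts', 'd.ts' ]
```

The longer the agent keeps building before a finding comes back, the larger this set tends to get, which is the tradeoff being asked about.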
I've been experimenting with the opposite approach: validating the AST before the write hits disk, so broken code never lands. It catches syntax issues instantly but obviously can't catch logic bugs the way a full review daemon can. Feels like both approaches together would be the ideal setup.
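Roughly, the parse-before-write idea looks like the sketch below. It is a minimal illustration under some loud assumptions: the sources are plain JavaScript scripts (not ES modules or TypeScript, which this parser would reject), Node's built-in `vm.Script` stands in for a real AST parser, and `safeWrite` is a hypothetical helper name rather than part of any tool mentioned here.

```typescript
import { Script } from "node:vm";
import { existsSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface WriteResult {
  written: boolean;
  error?: string;
}

// Hypothetical guard: parse the source first and only write it if it parses.
// `new Script(...)` compiles the code without running it and throws a
// SyntaxError when the source is not valid JavaScript.
function safeWrite(path: string, source: string): WriteResult {
  try {
    new Script(source, { filename: path });
  } catch (err) {
    const e = err as Error;
    if (e.name === "SyntaxError") {
      // Reject the write; nothing touches the disk.
      return { written: false, error: e.message };
    }
    throw err; // Anything else is unexpected, so surface it.
  }
  writeFileSync(path, source, "utf8");
  return { written: true };
}

// Usage: one valid and one broken source, written to a temp directory.
const dir = mkdtempSync(join(tmpdir(), "safe-write-"));
const goodPath = join(dir, "good.js");
const badPath = join(dir, "bad.js");

const good = safeWrite(goodPath, "const x = 1 + 2;");
const bad = safeWrite(badPath, "const x = ;");

console.log(good.written, bad.written); // true false
console.log(existsSync(badPath)); // false
```

A guard like this only proves the file parses. A function that parses fine but returns the wrong value still lands on disk, which is the gap a review pass covers.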