Hi HN, we’re building AxonFlow for teams running LLMs or agents in real production systems. Once agent workflows move past demos, failures are rarely model issues. They tend to show up as execution problems during real runs.

Short 2-minute technical demo showing execution control and auditability in practice: https://youtu.be/FNgnESo9RtI

AxonFlow is a self-hosted, source-available (BSL 1.1) control plane that sits inline in the execution path and governs LLM calls, tool calls, retries, approvals, and policy enforcement step by step. It does not replace your orchestrator and can run alongside LangChain, CrewAI, or custom systems.

The problems we focus on are usually discovered only after going to production:
- retries that accidentally repeat side effects
- partial failures mid-workflow
- permissions that differ per step
- limited ability to inspect or intervene during execution

This is not aimed at early demos or hobby projects. It’s for teams already operating under real production constraints.

GitHub: https://github.com/getaxonflow/axonflow
Docs: https://docs.getaxonflow.com

I’d value feedback from folks running LLM or agent workflows in production.
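To make the first problem in the list concrete (retries that accidentally repeat side effects), here is a minimal sketch of one common mitigation: recording each step's result under an idempotency key so that a retried workflow skips steps that already succeeded. This is not AxonFlow's API. `IdempotentExecutor`, `run_with_retries`, and the step names are hypothetical, and the in-memory dict stands in for whatever durable store a real system would need.

```python
from typing import Callable, Dict, List


class IdempotentExecutor:
    """Runs each step at most once per idempotency key (hypothetical helper)."""

    def __init__(self) -> None:
        # In-memory store for illustration; assumes a durable store in practice.
        self._results: Dict[str, object] = {}

    def run(self, key: str, step: Callable[[], object]) -> object:
        if key in self._results:
            # Step already succeeded in an earlier attempt: reuse its result.
            return self._results[key]
        result = step()  # If this raises, nothing is recorded for the key.
        self._results[key] = result
        return result


def run_with_retries(workflow, executor: IdempotentExecutor, attempts: int = 3):
    """Re-runs the whole workflow on failure; assumes attempts >= 1."""
    last_error = None
    for _ in range(attempts):
        try:
            return workflow(executor)
        except RuntimeError as exc:
            last_error = exc
    raise last_error


charges: List[str] = []   # Records the side effect we must not repeat.
notify_calls = {"n": 0}   # Counts attempts of the flaky step.


def charge() -> str:
    charges.append("charge-42")
    return "receipt"


def flaky_notify() -> str:
    notify_calls["n"] += 1
    if notify_calls["n"] == 1:
        raise RuntimeError("transient failure")
    return "sent"


def workflow(executor: IdempotentExecutor):
    receipt = executor.run("order-42:charge", charge)
    status = executor.run("order-42:notify", flaky_notify)
    return receipt, status


result = run_with_retries(workflow, IdempotentExecutor())
print(result, charges)
```

The notify step fails on the first attempt, so the whole workflow is retried, but the charge step is served from the recorded result rather than executed again.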
I especially liked the ability to start in observe-only mode and progressively enforce policies, and the focus on auditability and permissions per step. That’s the kind of thing teams usually end up building ad-hoc once compliance or reliability becomes non-optional.
A couple of things I’m curious about (and would love your thoughts on):
1. How you think about debugging or replay for long-running/stateful workflows when enforcement decisions affect downstream steps
2. What you’re seeing in practice around latency overhead at scale when AxonFlow is fully in the hot path
Overall, this feels like it’s aimed at the right stage of maturity — not early demos, but teams already feeling production constraints.
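For readers unfamiliar with the observe-only idea mentioned in this comment, a rough sketch of how per-step permissions with an observe/enforce switch could look is below. It is an illustration under assumed names only: `PolicyGate`, `Mode`, and the step and tool names are hypothetical and are not taken from AxonFlow's code or docs.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class Mode(Enum):
    OBSERVE = "observe"   # Log violations but let the call through.
    ENFORCE = "enforce"   # Log violations and block the call.


@dataclass
class PolicyGate:
    """Per-step tool allowlist with an audit trail (hypothetical sketch)."""

    allowed_tools: Dict[str, Set[str]]  # step name -> tools permitted in it
    mode: Mode = Mode.OBSERVE
    audit_log: List[dict] = field(default_factory=list)

    def check(self, step: str, tool: str) -> bool:
        """Returns True if the tool call may proceed under the current mode."""
        allowed = tool in self.allowed_tools.get(step, set())
        # Every decision is recorded, whether or not it is enforced.
        self.audit_log.append(
            {"step": step, "tool": tool, "allowed": allowed, "mode": self.mode.value}
        )
        if not allowed and self.mode is Mode.ENFORCE:
            return False
        return True


gate = PolicyGate(allowed_tools={"summarize": {"read_doc"}})

# Observe-only: the disallowed call proceeds but is flagged in the audit log.
observed = gate.check("summarize", "send_email")

# After reviewing the log, switch to enforcement: the same call is now blocked.
gate.mode = Mode.ENFORCE
enforced = gate.check("summarize", "send_email")

print(observed, enforced, len(gate.audit_log))
```

Starting in observe mode lets a team see which calls would have been blocked before any policy can break a running workflow.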