by mansi_mittal 152 days ago
This resonates with real pain I’ve seen once agent systems leave the demo phase. Most failures aren’t model quality issues — they are retries with side effects, partial execution failures, and lack of visibility/control once things are running live. The idea of a lightweight, inline control plane that doesn’t replace the orchestrator but governs execution step-by-step feels like a pragmatic way to tackle that.

I especially liked the ability to start in observe-only mode and progressively enforce policies, and the focus on auditability and permissions per step. That’s the kind of thing teams usually end up building ad-hoc once compliance or reliability becomes non-optional.

A couple of things I’m curious about (and would love your thoughts on):

1. How you think about debugging or replay for long-running/stateful workflows when enforcement decisions affect downstream steps

2. What you’re seeing in practice around latency overhead at scale when AxonFlow is fully in the hot path

Overall, this feels like it’s aimed at the right stage of maturity — not early demos, but teams already feeling production constraints.

1 comment

Thanks for the thoughtful read. You’ve described exactly the maturity stage we’re targeting: past demos, dealing with retries, partial failures, side effects, and the need for real control once systems are live.

On your questions:

1. Debugging and replay for stateful workflows

We capture step-level execution snapshots across the workflow. Each snapshot records inputs, outputs, duration, tokens, cost, evaluated policies, triggered policies, and the resolution (approved, blocked, overridden).

For enforcement-specific debugging, each snapshot includes which policy matched, what content triggered it, and how it was resolved. When a downstream step fails because an upstream step was blocked or modified, you can trace the execution timeline and see exactly where and how the data flow changed.
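To make the snapshot idea concrete, here is a minimal Go sketch of what a step-level record and a timeline trace could look like. It is illustrative only and not AxonFlow's actual schema: the `StepSnapshot` type, its field names, and the `FirstIntervention` helper are all assumptions, chosen to mirror the fields listed above.

```go
package main

import (
	"fmt"
	"time"
)

// Resolution mirrors the three outcomes named above.
type Resolution string

const (
	Approved   Resolution = "approved"
	Blocked    Resolution = "blocked"
	Overridden Resolution = "overridden"
)

// StepSnapshot is a hypothetical record shape; field names are illustrative.
type StepSnapshot struct {
	StepID            string
	Inputs            map[string]any
	Outputs           map[string]any
	Duration          time.Duration
	Tokens            int
	CostUSD           float64
	EvaluatedPolicies []string
	TriggeredPolicies []string
	Resolution        Resolution
}

// FirstIntervention walks the timeline in execution order and returns the
// index of the earliest step where a policy triggered, or -1 if none did.
func FirstIntervention(timeline []StepSnapshot) int {
	for i, s := range timeline {
		if len(s.TriggeredPolicies) > 0 {
			return i
		}
	}
	return -1
}

func main() {
	timeline := []StepSnapshot{
		{StepID: "fetch", EvaluatedPolicies: []string{"pii"}, Resolution: Approved},
		{StepID: "summarize", EvaluatedPolicies: []string{"pii"},
			TriggeredPolicies: []string{"pii"}, Resolution: Blocked},
	}
	if i := FirstIntervention(timeline); i >= 0 {
		s := timeline[i]
		fmt.Printf("step %q: %s by %v\n", s.StepID, s.Resolution, s.TriggeredPolicies)
	}
}
```

When a downstream step fails, scanning for the earliest snapshot with a triggered policy is one simple way to find where the data flow first diverged.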

We also support human-in-the-loop pause and resume. A step can be paused for approval and later resumed, with the decision and rationale recorded as part of the execution history.

This is not yet full deterministic replay (re-running a workflow with identical LLM outputs), but it provides enough visibility to answer “what happened” and “why” in production, which covers most real debugging scenarios.

2. Latency overhead at scale

We operate in two modes depending on requirements:

- Compliance mode: policy violations and blocked requests are written synchronously before returning. This adds a few milliseconds for violation cases, but guarantees the audit record exists before the caller sees the result.

- Performance mode: audit writes are queued asynchronously. Policy evaluation still happens inline, since it may block execution, but persistence is decoupled using bounded queues and worker goroutines.
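The difference between the two modes can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not AxonFlow's implementation: `AuditSink`, `Auditor`, and `NewAuditor` are hypothetical names, the sink is an in-memory stand-in for durable storage, and the choice to block the caller when the queue is full (backpressure) is an assumption, since the behavior on a full queue is not described above.

```go
package main

import (
	"fmt"
	"sync"
)

// AuditRecord is a hypothetical audit entry.
type AuditRecord struct {
	StepID   string
	Decision string
}

// AuditSink is an in-memory stand-in for durable storage.
type AuditSink struct {
	mu      sync.Mutex
	records []AuditRecord
}

func (s *AuditSink) Write(r AuditRecord) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.records = append(s.records, r)
}

func (s *AuditSink) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.records)
}

// Auditor writes synchronously when queue is nil, asynchronously otherwise.
type Auditor struct {
	sink  *AuditSink
	queue chan AuditRecord
	wg    sync.WaitGroup
}

// NewAuditor with workers == 0 selects synchronous (compliance-style) writes.
// With workers > 0 it starts that many goroutines draining a bounded queue.
func NewAuditor(sink *AuditSink, queueSize, workers int) *Auditor {
	a := &Auditor{sink: sink}
	if workers > 0 {
		a.queue = make(chan AuditRecord, queueSize)
		for i := 0; i < workers; i++ {
			a.wg.Add(1)
			go func() {
				defer a.wg.Done()
				for r := range a.queue {
					sink.Write(r)
				}
			}()
		}
	}
	return a
}

// Record returns after the write in synchronous mode. In asynchronous mode it
// only enqueues, blocking if the bounded queue is full (assumed backpressure).
func (a *Auditor) Record(r AuditRecord) {
	if a.queue == nil {
		a.sink.Write(r)
		return
	}
	a.queue <- r
}

// Close drains the queue and waits for the workers to finish.
func (a *Auditor) Close() {
	if a.queue != nil {
		close(a.queue)
		a.wg.Wait()
	}
}

func main() {
	sink := &AuditSink{}
	async := NewAuditor(sink, 64, 2) // bounded queue of 64, two workers
	for i := 0; i < 5; i++ {
		async.Record(AuditRecord{StepID: fmt.Sprintf("step-%d", i), Decision: "approved"})
	}
	async.Close() // drain before reading the count
	fmt.Println("records persisted:", sink.Len())
}
```

The trade-off shows up in `Record`: the synchronous path guarantees the record exists before the caller continues, while the queued path returns as soon as there is room in the channel.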

Most policies are rule-based pattern matching rather than LLM calls. In practice, teams see single-digit millisecond overhead per request for typical policy sets. Heavier redaction or more complex policies can increase this, but the behavior remains predictable.

Observe-only mode adds essentially no latency beyond the audit write, since no blocking decisions are made.
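For a sense of why rule-based evaluation stays cheap, here is a small Go sketch using precompiled regular expressions, with an observe-only flag that reports matches without blocking. The `Policy` and `Decision` types, `Evaluate`, `DefaultPolicies`, and the two example patterns are all hypothetical and are not taken from AxonFlow.

```go
package main

import (
	"fmt"
	"regexp"
)

// Policy is a hypothetical rule: a name plus a precompiled pattern.
type Policy struct {
	Name    string
	Pattern *regexp.Regexp
}

// Decision reports which policies matched and whether the step may proceed.
type Decision struct {
	Allowed   bool
	Triggered []string
}

// DefaultPolicies returns two illustrative patterns (not a real policy set).
func DefaultPolicies() []Policy {
	return []Policy{
		{Name: "us-ssn", Pattern: regexp.MustCompile(`\b\d{3}-\d{2}-\d{4}\b`)},
		{Name: "aws-access-key", Pattern: regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`)},
	}
}

// Evaluate runs every policy against content. In observe-only mode, matches
// are still reported (so they can be audited) but never block execution.
func Evaluate(policies []Policy, content string, observeOnly bool) Decision {
	d := Decision{Allowed: true}
	for _, p := range policies {
		if p.Pattern.MatchString(content) {
			d.Triggered = append(d.Triggered, p.Name)
		}
	}
	if len(d.Triggered) > 0 && !observeOnly {
		d.Allowed = false
	}
	return d
}

func main() {
	text := "customer SSN is 123-45-6789"
	observed := Evaluate(DefaultPolicies(), text, true)
	enforced := Evaluate(DefaultPolicies(), text, false)
	fmt.Println("observe-only:", observed.Allowed, observed.Triggered)
	fmt.Println("enforced:", enforced.Allowed, enforced.Triggered)
}
```

Starting with `observeOnly` set to true lets a team see what would have triggered before flipping the same policies to enforcement.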

On orchestration boundaries:

AxonFlow does not require replacing your existing orchestrator. Most teams keep LangChain, LangGraph, or CrewAI for stateful workflow execution and use AxonFlow as a step-level control plane, adding policy gates before each step runs.
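Conceptually, a step-level gate is a wrapper around whatever function the orchestrator already calls. The Go sketch below shows the shape of that pattern only; `Step`, `Check`, `Gated`, and `NoDropTable` are hypothetical names and do not represent the AxonFlow SDK or any orchestrator's API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Step is whatever unit of work the existing orchestrator executes.
type Step func(input string) (string, error)

// Check is a policy gate; a non-nil error means the step must not run.
type Check func(stepName, input string) error

// ErrBlocked signals that a policy refused the step.
var ErrBlocked = errors.New("blocked by policy")

// Gated wraps next so that check runs first; next is skipped on a violation.
func Gated(name string, check Check, next Step) Step {
	return func(input string) (string, error) {
		if err := check(name, input); err != nil {
			return "", fmt.Errorf("step %q: %w", name, err)
		}
		return next(input)
	}
}

// NoDropTable is a toy check that blocks inputs containing "DROP TABLE".
func NoDropTable(_ string, input string) error {
	if strings.Contains(strings.ToUpper(input), "DROP TABLE") {
		return ErrBlocked
	}
	return nil
}

func main() {
	runSQL := func(q string) (string, error) { return "ran: " + q, nil }
	step := Gated("run_sql", NoDropTable, runSQL)

	for _, q := range []string{"SELECT 1", "drop table users"} {
		out, err := step(q)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

Because the wrapper has the same signature as the step it guards, the orchestrator's own control flow does not need to change.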

For teams building from scratch or wanting tighter integration, AxonFlow can also handle orchestration end to end with governance built in. In practice, most start by adding governance to existing workflows and only consider deeper orchestration later.

For related discussion on how we think about the observability to enforcement gap, there’s a deeper thread here that may be relevant: https://news.ycombinator.com/item?id=46603800

Happy to go deeper on any of this if useful.