by verdverm 98 days ago
If you can't trace across agents (the same way you would across services), then you haven't set up OTEL completely.
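For reference, a rough sketch of what tracing across agents means: the parent agent hands its trace context to each subagent, the same way a service passes a W3C `traceparent` value downstream. This is stdlib-only Python and does not use the OpenTelemetry SDK. `new_traceparent`, `child_traceparent`, `subagent`, and `orchestrator` are hypothetical names for illustration only.

```python
import secrets

def new_traceparent() -> str:
    """Start a trace: version 00, random 16-byte trace id, 8-byte span id, sampled flag 01."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"

def child_traceparent(parent: str) -> str:
    """Keep the parent's trace id and flags, mint a fresh span id for the child."""
    version, trace_id, _parent_span_id, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"

def subagent(task: str, carrier: dict) -> dict:
    """Hypothetical subagent: continues the trace it was handed instead of starting a new one."""
    incoming = carrier["traceparent"]
    return {"task": task, "traceparent": child_traceparent(incoming), "parent": incoming}

def orchestrator(tasks: list) -> tuple:
    """Hypothetical parent agent: one root context, passed explicitly to every subagent call."""
    root = new_traceparent()
    spans = [subagent(task, {"traceparent": root}) for task in tasks]
    return root, spans
```

If every subagent span carries the root's trace id, a trace backend can stitch the whole run into one trace. If a subagent starts its own trace instead, you get the "can't trace across agents" symptom.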

What counts as a hard fail is decided at a different layer of control. That's separate from OP's question, which is about just seeing what happens so you can design those control systems. That layer is more guards, validators, and the like (more subagents).
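A minimal sketch of that separate control layer, assuming a validator is just a function that inspects a step's output and returns an error string or `None`. `HardFail`, `guarded`, the two validators, and `echo_agent` are hypothetical names, not from any framework.

```python
from typing import Callable, List, Optional

# A validator returns a problem description, or None if the output is acceptable.
Validator = Callable[[str], Optional[str]]

class HardFail(Exception):
    """Raised when a step's output fails a validator; the caller decides what happens next."""

def guarded(step: Callable[[str], str], validators: List[Validator]) -> Callable[[str], str]:
    """Wrap an agent step so every output passes through the validators before it is returned."""
    def run(prompt: str) -> str:
        output = step(prompt)
        for check in validators:
            problem = check(output)
            if problem is not None:
                raise HardFail(f"{check.__name__}: {problem}")
        return output
    return run

def not_empty(output: str) -> Optional[str]:
    return "empty output" if not output.strip() else None

def max_length(output: str) -> Optional[str]:
    return "output too long" if len(output) > 2000 else None

def echo_agent(prompt: str) -> str:
    """Stand-in for a real agent call."""
    return prompt.upper()
```

Usage would look like `safe = guarded(echo_agent, [not_empty, max_length])`. The guard sits outside the agent, so tracing can stay purely about visibility while this layer owns the fail policy.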

I stay more human in the loop because these things are not ready for prime time the way you describe using them. Running them that autonomously burns tokens on average, imo.

1 comment

That makes sense. Sounds like a lot of this is handled at the framework and design level in your setup.

In practice, when something does go wrong in a multi-step workflow, do you typically rely on tracing + manual debugging, or do you have built-in mechanisms for partial replay / recovery?