Hacker News new | ask | show | jobs
by zambelli 25 days ago
Nice symmetry with tool call failures being sent to LLM that made the call without bugging the user. The artifact-generating entity gets the error back, effectively.

100% correct, and stackable. Could have topic refusal in LLM training itself, forge in tool call alter, and sdlc gates at the workflow level.

1 comments

Definitely stacks. The thing that made it clear for me was being explicit about the stages, and where/what you can verify with a guardrail, or gate. I wrote up the framework I use here: https://michael.roth.rocks/research/trust-topology/

Being explicit about the space between the stages is critical, because that's your enforcement point.

This is a really neat writeup, and the empirical data for coding agents is super useful. Will take a closer read and see if there's anything I easily lift into my harness!
Thanks, glad you find it useful! Feel free to ping me if you have any questions.