| HN Mirror

	by zambelli 25 days ago
	Nice symmetry with tool call failures being sent back to the LLM that made the call, without bugging the user. The artifact-generating entity gets the error back, effectively. 100% correct, and stackable. You could have topic refusal in the LLM training itself, forge in the tool call layer, and SDLC gates at the workflow level.
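
To make the error-routing idea concrete, here is a minimal Python sketch of a harness step that catches a tool failure and hands it back to the model as a message instead of surfacing it to the user. The names `run_tool_call`, `TOOLS`, and the message shape are hypothetical, chosen for illustration only; they are not forge's API or any particular framework's.

```python
from typing import Any, Callable, Dict, List


def run_tool_call(tools: Dict[str, Callable[..., Any]], name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Run one tool call; on failure, return the error as a message for the model."""
    try:
        result = tools[name](**args)  # an unknown tool name raises KeyError here
        return {"role": "tool", "name": name, "ok": True, "content": str(result)}
    except Exception as exc:
        # The error goes back to the entity that produced the call, not to the user.
        return {"role": "tool", "name": name, "ok": False,
                "content": f"{type(exc).__name__}: {exc}"}


def divide(a: float, b: float) -> float:
    return a / b


TOOLS: Dict[str, Callable[..., Any]] = {"divide": divide}  # hypothetical tool registry

# The history is what the model would see on its next turn.
history: List[Dict[str, Any]] = []
history.append(run_tool_call(TOOLS, "divide", {"a": 1, "b": 0}))  # fails
history.append(run_tool_call(TOOLS, "divide", {"a": 1, "b": 2}))  # corrected retry
```

In a real loop the second call would come from the model after it reads the first error; here it is hard-coded to keep the sketch self-contained.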


mrothroc 24 days ago

Definitely stacks. The thing that made it clear for me was being explicit about the stages, and where/what you can verify with a guardrail, or gate. I wrote up the framework I use here: https://michael.roth.rocks/research/trust-topology/

Being explicit about the space between the stages is critical, because that's your enforcement point.
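
One way to picture the space between stages as the enforcement point is the Python sketch below. It is an assumption-laden illustration, not the framework from the linked writeup: `run_pipeline`, `Stage`, and `Gate` are hypothetical names, and each gate is just a predicate that decides whether a stage's output may move on.

```python
from typing import Callable, List, Tuple

Stage = Callable[[str], str]   # transforms an artifact
Gate = Callable[[str], bool]   # verifies the artifact before the next stage


def run_pipeline(artifact: str, steps: List[Tuple[Stage, Gate]]) -> Tuple[str, int]:
    """Run stages in order; return (last accepted artifact, number of stages passed)."""
    passed = 0
    for stage, gate in steps:
        candidate = stage(artifact)
        if not gate(candidate):
            # Enforcement happens between stages: a rejected output never propagates.
            return artifact, passed
        artifact = candidate
        passed += 1
    return artifact, passed


steps: List[Tuple[Stage, Gate]] = [
    (lambda s: s.strip(), lambda s: len(s) > 0),   # stage 1: trim; gate: non-empty
    (lambda s: s.upper(), lambda s: s.islower()),  # stage 2: gate rejects this output
]
result = run_pipeline("  patch  ", steps)
```

Here the second gate rejects the uppercased output, so the pipeline stops with the last artifact that passed verification.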

zambelli 24 days ago

This is a really neat writeup, and the empirical data for coding agents is super useful. Will take a closer read and see if there's anything I can easily lift into my harness!

mrothroc 23 days ago

Thanks, glad you find it useful! Feel free to ping me if you have any questions.
