| HN Mirror


	by timeleft-- 150 days ago
	Thanks for the great observations. I haven't done a formal benchmark, but I built this because I saw this failure mode in several production agentic systems. The trust-gate won't be a one-and-done prompt: its quality will have to be iterated on, and it will most likely be application-specific.