
	by warwickmcintosh 78 days ago
	The sanitised optimism problem mentioned upthread is the real gap here. Event stream logging tells you what tools were called and in what order, but it doesn't tell you whether the agent's self-reported outcome matches reality.
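
	To make the gap concrete, here is a rough sketch of what closing it might look like: take the agent's self-reported outcome and compare it against an independent check of the real world. Everything here is hypothetical and for illustration only (`AgentReport`, `Verdict`, `verify`, and the file-exists check are made-up names, not part of any existing logging tool), and a real check would depend entirely on the task.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Callable


@dataclass
class AgentReport:
    """What the agent says happened (hypothetical shape, not a real API)."""
    task: str
    claimed_success: bool
    summary: str


@dataclass
class Verdict:
    """The agent's claim next to an independently observed result."""
    task: str
    claimed_success: bool
    verified_success: bool

    @property
    def sanitised(self) -> bool:
        # The case the event stream cannot show you: the agent reports
        # success, but the independent check disagrees.
        return self.claimed_success and not self.verified_success


def verify(report: AgentReport, check: Callable[[], bool]) -> Verdict:
    """Run a check that does not rely on the agent's own account."""
    try:
        observed = bool(check())
    except Exception:
        # A check that blows up is treated as "not verified", never as success.
        observed = False
    return Verdict(report.task, report.claimed_success, observed)


if __name__ == "__main__":
    # The agent claims it wrote a file; we look on disk instead of trusting it.
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "config.yaml")
    report = AgentReport("write config", True, "Wrote config.yaml successfully")

    verdict = verify(report, lambda: os.path.exists(target))
    print(verdict.sanitised)  # the file was never created, so the claim is flagged
```

	The point of the design is that the check is supplied from outside the agent, so the event log and the self-report can both look clean while the verdict still flags the mismatch.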