| HN Mirror

	by kestiny 32 days ago
	A good harness should not only make agents better at completing tasks, but also make their output much easier to review. For example, a good harness constrains the action surface, the context, and the task boundaries. An agent doesn’t always fail by “writing incorrect code”; it can also fail by “doing things it wasn’t supposed to do.” Tests and lints can verify part of correctness, but they often can’t validate task scope. A well-designed harness should shift review from “reading the entire diff” to “verifying that the changes stay within the defined task boundaries.”
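
	To make the scope check concrete, here is a minimal sketch of what "verifying task boundaries" could look like. It assumes the harness already knows which files the agent changed and that each task declares a list of allowed path patterns. The function name `out_of_scope`, the patterns, and the file paths are all hypothetical, not taken from any particular harness.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_paths, allowed_patterns):
    """Return the changed paths that match none of the allowed patterns.

    Patterns are shell-style globs. Note that fnmatch's `*` also matches
    `/`, so "src/parser/*" covers nested directories as well.
    """
    return [
        path
        for path in changed_paths
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]


# Hypothetical task: "fix the date parser". Only these paths are in scope.
allowed = ["src/parser/*", "tests/parser/*"]

# Hypothetical list of files the agent touched.
changed = ["src/parser/dates.py", "tests/parser/test_dates.py", "ci/deploy.yml"]

violations = out_of_scope(changed, allowed)
print(violations)  # ['ci/deploy.yml']
```

	With a check like this, the reviewer looks at the violation list first, and an empty list means the diff at least stayed inside the task. It says nothing about whether the in-scope changes are correct; that is still the job of tests, lints, and a human.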