Hacker News
by ryanshrott 56 days ago
One practical approach that works is separating the capture layer from the promotion layer. Agents can draft freely, but anything that gets promoted to trusted status needs a human review. Some teams use a voting scheme where multiple agents independently summarize the same source, and you only promote it when they converge. The confidently wrong problem gets worse over time because bad entries get cited by other agents, and that's how you end up with a knowledge base full of confident BS.
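To make the voting idea concrete, here is a minimal sketch of a convergence gate. It assumes each agent returns a plain-text summary and uses token overlap (Jaccard similarity) as a crude stand-in for "they converge"; the function names, the status strings, and the 0.6 threshold are all hypothetical. Agreement is not the same as correctness, so in this sketch convergence only makes an entry eligible for human review and never marks it trusted on its own.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two summaries, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty summaries are trivially identical
    return len(ta & tb) / len(ta | tb)


def converged(summaries: list[str], threshold: float = 0.6) -> bool:
    """True only if every pair of independent summaries overlaps enough."""
    if len(summaries) < 2:
        return False  # a single draft cannot corroborate itself
    return all(
        jaccard(a, b) >= threshold for a, b in combinations(summaries, 2)
    )


def next_status(summaries: list[str], threshold: float = 0.6) -> str:
    """Drafts stay drafts unless the agents agree; agreement only queues
    the entry for human review (hypothetical status names)."""
    if converged(summaries, threshold):
        return "ready_for_human_review"
    return "draft"
```

In practice you would probably swap the token overlap for an embedding or claim-level comparison, since two summaries can share most of their words and still disagree on the one fact that matters.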
1 comment

The "draft freely, promote on approval" method is the only thing I think works. Anything else is open to way too many forms of context poisoning. You're either buried in writing safeguards and adding review layers, or you're praying you don't hit edge cases.

You don't have to trust the capture layer. Put a reviewer agent on top with memory of what's been approved and rejected, and keep a human in the loop on the close calls. Over time the reviewer gets calibrated and the human review queue shrinks.
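Roughly what I mean, as a toy sketch rather than a real implementation. The `Reviewer` class, the `margin` parameter, and the similarity measure are all made up for illustration; a real reviewer would be a model call with retrieval over past decisions, not token overlap. The shape is the point: decide automatically only when past decisions point clearly one way, and send everything else to a person.

```python
from dataclasses import dataclass, field


def similarity(a: str, b: str) -> float:
    """Token-set overlap in [0, 1]; a placeholder for a real comparison."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class Reviewer:
    """Hypothetical reviewer that remembers past approvals and rejections."""

    margin: float = 0.3  # how decisive the memory must be to skip the human
    approved: list[str] = field(default_factory=list)
    rejected: list[str] = field(default_factory=list)

    def triage(self, draft: str) -> str:
        """Return 'approve', 'reject', or 'human' for a close call."""
        best_ok = max((similarity(draft, e) for e in self.approved), default=0.0)
        best_bad = max((similarity(draft, e) for e in self.rejected), default=0.0)
        if best_ok - best_bad >= self.margin:
            return "approve"
        if best_bad - best_ok >= self.margin:
            return "reject"
        return "human"  # memory is not decisive, so a person decides

    def record(self, draft: str, approved: bool) -> None:
        """Store a decision (human or automatic) so later triage can use it."""
        (self.approved if approved else self.rejected).append(draft)
```

With an empty memory every draft goes to the human, which is the behavior you want on day one. As decisions get recorded, more drafts clear the margin and the queue shrinks.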