by bisonbear 76 days ago
PRs for AGENTS.md are necessary but not sufficient, precisely because of non-determinism. You can LGTM the AGENTS.md change, but it's hard to know what downstream behavioral effects it has. I feel like the only way to really know is to build a benchmark on your repo and actually A/B test the AGENTS.md change. I'm building something in this space, happy to share if that sounds interesting to you.
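
To make the A/B idea concrete, here's a rough Python sketch of what I mean. Everything in it is hypothetical: `run_task` stands in for whatever harness launches the agent against one benchmark task under a given AGENTS.md variant and reports pass/fail, and the variant names are placeholders. Because runs are non-deterministic, each task is repeated several times and the pass rates are compared with a plain two-proportion z-test, which is just one reasonable choice of comparison.

```python
import math
from typing import Callable

# Hypothetical harness signature: (variant_name, task_id) -> did the agent pass?
RunTask = Callable[[str, str], bool]


def pass_rate(run_task: RunTask, variant: str, tasks: list[str], trials: int) -> tuple[int, int]:
    """Run every task `trials` times under one variant; return (passes, total runs)."""
    passes = sum(run_task(variant, task) for task in tasks for _ in range(trials))
    return passes, len(tasks) * trials


def two_proportion_z(pass_a: int, n_a: int, pass_b: int, n_b: int) -> float:
    """z-statistic for the difference in pass rates (B minus A), pooled variance."""
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both variants passed (or failed) every run: no measurable difference.
        return 0.0
    return (pass_b / n_b - pass_a / n_a) / se


def compare(run_task: RunTask, tasks: list[str], trials: int = 5) -> float:
    """A/B the old AGENTS.md ("baseline") against the changed one ("candidate")."""
    a_pass, a_n = pass_rate(run_task, "baseline", tasks, trials)
    b_pass, b_n = pass_rate(run_task, "candidate", tasks, trials)
    return two_proportion_z(a_pass, a_n, b_pass, b_n)
```

A |z| around 2 or more would suggest the change moved the pass rate beyond run-to-run noise, though with a small benchmark you need a lot of trials per task before that means much.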