
by DavidYoussef 138 days ago

A kill switch is treating symptoms. The disease is that maintainers have no way to distinguish "AI-generated typo fix from a new contributor" from "AI-generated rewrite of my auth system by someone who doesn't understand it."

  Both show up as PRs. Both require manual triage. The kill switch treats them
  identically by blocking both.

  What maintainers actually need is automated risk classification at the gate.
  We built a GitHub Action (codeguard-action, MIT) that does this:

  - Parses the diff and identifies which zones of the codebase it touches
  - Classifies risk L0 through L4 based on what changed (not who submitted)
  - Runs proportional AI review (1-3 models depending on risk tier)
  - Posts structured findings to the PR
  - Seals everything into a hash-chained evidence bundle
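
  To make the steps above concrete, here is a minimal Python sketch of zone-based classification, tiered review, and a hash-chained record. It is not codeguard-action's actual code: the zone patterns, the level assigned to each zone, the tier-to-model-count mapping, and every function name are assumptions made up for illustration. Only the L0-L4 scale, the 1-3 model range, and the hash-chaining idea come from the list above.

```python
import hashlib
import json
import re
from fnmatch import fnmatch

# Hypothetical zone rules (glob pattern, risk level 0-4). A real project
# would define its own; these are illustrative assumptions only.
ZONE_RULES = [
    ("docs/*", 0),
    ("*.md", 0),
    ("tests/*", 1),
    ("src/*", 2),
    ("src/auth/*", 4),
    (".github/workflows/*", 4),
]
DEFAULT_LEVEL = 2  # assumed level for paths that match no rule


def touched_paths(diff_text):
    """Return file paths from the '+++ b/<path>' lines of a unified diff.

    Deleted files ('+++ /dev/null') are not captured by this simple pattern.
    """
    return re.findall(r"^\+\+\+ b/(.+)$", diff_text, flags=re.MULTILINE)


def classify(paths):
    """Risk level of a change: the highest level of any zone it touches."""
    level = 0
    for path in paths:
        # fnmatch's '*' also matches '/', so 'src/*' covers nested files.
        matches = [lvl for pattern, lvl in ZONE_RULES if fnmatch(path, pattern)]
        level = max(level, max(matches, default=DEFAULT_LEVEL))
    return level


def reviewers_for(level):
    """Assumed mapping from risk tier to number of review models (1-3)."""
    if level <= 1:
        return 1
    if level <= 3:
        return 2
    return 3


def seal(records):
    """Chain records so each entry's hash covers the previous entry's hash."""
    prev = "0" * 64  # fixed starting value for the first entry
    chain = []
    for record in records:
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        chain.append({"record": record, "prev": prev, "hash": digest})
        prev = digest
    return chain


def verify(chain):
    """Recompute every hash; editing any earlier record breaks the chain."""
    prev = "0" * 64
    for entry in chain:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True
```

  In this sketch a PR that touches both README.md and a file under src/auth/ is classified by its riskiest zone, so a typo fix cannot be used to smuggle in an auth change at a low tier. The classification depends only on the paths in the diff, never on the submitter.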

  If the "1 in 10 AI PRs is legitimate" stat from the GitHub discussion holds,
  then up to 9 out of 10 could be auto-filtered before a maintainer ever sees
  them. That's not a ban - it's triage.

  Daniel Stenberg shouldn't have had to kill curl's bug bounty. He needed a
  filter that could tell the difference between a real vulnerability report and
  AI-generated noise. Risk classification could provide that filter without
  shutting down the program.