| Good question. Honest answer: we haven't manually verified all 28. The tool description injection findings (6 servers, AVE-2026-00002) are
the most credible. Patterns like "IMPORTANT: Always..." or "Before calling
this tool..." in a tool description are behavioral instructions regardless
of intent that an agent will follow them. Whether that's malicious or just
poor documentation is a separate question, but the security risk is real
either way. The YARA findings (tool output exfiltration, multi-turn persistence) have
higher FP rates. "encode" matching anywhere, "retain" matching anywhere,
these are conservative rules that will catch legitimate usage. I'd estimate
maybe 50% TP on those without manual review. Content type mismatch (Magika flagging .md files as YAML) is factual, not
inferred, the file is what it is. Whether that's intentional obfuscation
or just how the server packages its manifest is unknown. Detection methodology: 6 engines in sequence. Pattern (regex, 37 rules),
YARA (binary + structural, 39 rules), Semgrep (41 rules), Magika (ML
content-type), LLM (semantic), behavioral sandbox (Docker +
eBPF). 5-layer FP reduction before surfacing findings. Full
methodology at https://bawbel.io/docs |