Hi HN, OP here. I’m a former Enterprise Systems Architect (Cisco/VMware) turned "vibe coder." I realized quickly that AI coding is dangerous because LLMs lack *context* and *verification*. They hallucinate because they are guessing at the file structure. So, out of pure spite for flaky tools, I built *TheAuditor*.

*The Concept:*
Instead of grepping files, I index the entire repo (Python, TS, Go, Rust, Terraform, CDK) into a local SQLite database (~180MB for a mid-sized repo). Because the code lives in a DB, I can query the call graph with plain SQL.

*The Tech (The "Hard" Part):*
I needed a way to trace data flow through the infrastructure to prevent the AI from introducing vulnerabilities. I ended up building a *Hybrid Taint Engine* that extends the Oracle Labs (2021) IFDS research:
1. *Forward Flow:* Traces entry points to reachable sinks to prune the graph.
2. *Backward IFDS:* Runs a precise "Interprocedural Finite Distributive Subset" analysis on the pruned graph.
3. *The Handshake:* I only report a vulnerability when both engines flag the same flow.

*The "Systems Architect" approach:*
Coming from a background in critical infrastructure, I hate silent failures. I implemented a *5-Layer Fidelity System*. Every parser emits a cryptographic manifest. If the DB storage receipt doesn't match the manifest (transaction mismatch or data loss), the tool hard-crashes. I'd rather get a stack trace than a false negative.

*Why I built it:*
I use this as a "Flight Computer" for my AI agent.
* Before refactoring, it runs `aud impact` to calculate the blast radius.
* Before writing code, it runs `aud explain` to get a token-optimized context bundle of definitions.

This is v2 (a complete rewrite after 800 commits). I've learned a lot since my first attempt. The code is open source (AGPL). Happy to answer questions about the SQLite schema or the IFDS implementation.
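
To make "query the call graph with SQL" concrete, here is a minimal sketch of the idea. It is not TheAuditor's actual schema or engine: the `calls` table, the function names, and the `reachable` helper are all made up for illustration, and plain graph reachability stands in for the real IFDS pass. It shows a blast-radius style query (everything that transitively calls a function) and the "handshake" as an intersection of a forward walk from an entry point and a backward walk from a sink.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory call graph. Table and function names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE calls (caller TEXT NOT NULL, callee TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO calls VALUES (?, ?)",
        [
            ("http_handler", "parse_request"),
            ("http_handler", "save_user"),
            ("save_user", "run_query"),
            ("cli_main", "load_config"),
            ("nightly_job", "run_query"),
        ],
    )
    return conn


def reachable(conn: sqlite3.Connection, start: str, direction: str) -> set[str]:
    """Return every function transitively reachable from `start`.

    direction="callees" walks forward (what does `start` end up calling?);
    direction="callers" walks backward (who ends up calling `start`?).
    """
    if direction == "callees":
        src, dst = "caller", "callee"
    elif direction == "callers":
        src, dst = "callee", "caller"
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    # Column names come from the fixed whitelist above, so the f-string is safe.
    # UNION (not UNION ALL) deduplicates rows, so the recursion stops on cycles.
    sql = f"""
        WITH RECURSIVE seen(name) AS (
            SELECT {dst} FROM calls WHERE {src} = ?
            UNION
            SELECT c.{dst} FROM calls AS c JOIN seen ON c.{src} = seen.name
        )
        SELECT name FROM seen
    """
    return {row[0] for row in conn.execute(sql, (start,))}


def handshake(conn: sqlite3.Connection, entry: str, sink: str) -> set[str]:
    """Functions lying on some entry-to-sink path: forward set AND backward set."""
    forward = reachable(conn, entry, "callees") | {entry}
    backward = reachable(conn, sink, "callers") | {sink}
    return forward & backward


if __name__ == "__main__":
    db = build_demo_db()
    # Blast radius: everything that could break if run_query changes.
    print(sorted(reachable(db, "run_query", "callers")))
    # Handshake: only functions on a path from the entry point to the sink.
    print(sorted(handshake(db, "http_handler", "run_query")))
```

In this toy graph, `nightly_job` also reaches the sink but is not reachable from `http_handler`, so the intersection drops it. That is the pruning effect the two-pass design is after, minus the data-flow precision a real IFDS analysis adds.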