by raiph_ai 90 days ago
Creator here. Quick TL;DR and some context:

FireClaw = prompt injection firewall for AI agents. Proxy architecture, not just detection. 4-stage pipeline, no bypass mode, community threat feed.

The thing that surprised us most during research: we couldn't find anyone else doing this. There are great pattern detectors (Rebuff, LLM Guard, etc.), but as far as we could tell they all work post-hoc: the content has already entered the agent's context by the time injection is detected. FireClaw intercepts it before that happens.
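
To make the proxy-versus-post-hoc distinction concrete, here is a minimal sketch of the idea, not FireClaw's actual code. The names `guarded_fetch`, `AgentContext`, and `looks_injected`, and the two regex patterns, are all hypothetical and stand in for the real pipeline. The point is only that the check sits between the fetch and the context, so blocked content is never appended.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical patterns for illustration only; not FireClaw's real rules.
SUSPICIOUS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
]


@dataclass
class AgentContext:
    """Stand-in for whatever the agent will actually read."""
    messages: list[str] = field(default_factory=list)


def looks_injected(text: str) -> bool:
    """Return True if any illustrative pattern matches the text."""
    return any(p.search(text) for p in SUSPICIOUS)


def guarded_fetch(fetch: Callable[[str], str], url: str, ctx: AgentContext) -> bool:
    """Fetch on the agent's behalf; only clean content reaches ctx."""
    content = fetch(url)
    if looks_injected(content):
        return False  # blocked before it ever enters the context
    ctx.messages.append(content)
    return True
```

A post-hoc detector would append first and scan afterwards. Here the append is unreachable for flagged content, which is the property the proxy design is after.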

The Pi appliance was honestly just for fun at first, but it turns out having a physical box with a screen showing "3 threats blocked today" is surprisingly reassuring. The OLED does an animated fire claw when it catches something.

Happy to answer any questions about the architecture, the canary token system, or the threat feed privacy model.

1 comment

This is cool, definitely going to look into it and probably try to integrate it with my open-source project. Prompt injection keeps me up at night, so thanks for putting in some work trying to solve it.
Thanks! Checked out your project — really impressive work. The way I see it, our projects are complementary: FireClaw sanitizes inputs (is this content trying to hijack the agent?), yours governs outputs (should the agent take this action?). Together that's defense-in-depth.

We just shipped /api/scan in v1.1.0, which could plug into your policy evaluation: scan content before it enters the decision pipeline. FireClaw is also now available on Docker and npm (npx fireclaw) for easier integration.
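
A rough sketch of what that hook might look like on the caller's side. Everything about the wire format here is an assumption for illustration: the host and port in `SCAN_URL`, the `{"content": ...}` request body, and the `"safe"` field in the response are hypothetical, so check the repo for the real /api/scan contract. `scan_then_evaluate` and `evaluate_policy` are placeholder names for the policy side.

```python
import json
import urllib.request
from typing import Callable

# Assumed values: host/port, request body, and the "safe" response field
# are hypothetical placeholders, not the documented /api/scan contract.
SCAN_URL = "http://localhost:8080/api/scan"


def http_post_json(url: str, payload: dict) -> dict:
    """POST a JSON payload and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def scan_then_evaluate(
    content: str,
    evaluate_policy: Callable[[str], str],
    post: Callable[[str, dict], dict] = http_post_json,
) -> str:
    """Scan content first; only clean content reaches policy evaluation."""
    verdict = post(SCAN_URL, {"content": content})
    if not verdict.get("safe", False):  # fail closed on a missing field
        return "rejected"
    return evaluate_policy(content)
```

The `post` parameter is injectable so the flow can be exercised without a running instance. Treating a missing verdict as unsafe is a design choice in this sketch, in keeping with the "no bypass mode" idea.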

Happy to brainstorm integration. Feel free to open an issue on our repo or reach out on GitHub.