| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by NicolaiS 428 days ago

Sorry, but this will never work very well.

The tool contains a bunch of "denylist regexes", i.e.

    `user (should not|must not|cannot) see`

But these can easily be bypassed. Any real security tool should use allowlists, but that is ofc much harder with natural languages.

MCP-Shield can also analyse using Claude, but that code contains an easy to exploit prompt injection: https://github.com/riseandignite/mcp-shield/blob/19de96efe5e...