Hacker News
by insanitybit 36 days ago
a) These sorts of 'injection' attacks are often model-specific and rarely reliable.

b) You can have the LLM use separate sub-agents for different files or pieces of code.

c) You can have the LLM do its analysis using grep and other deterministic tools, e.g. "use grep to find 'unsafe' calls".
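
To make (c) concrete, here is a rough sketch of the kind of deterministic scan I mean. It uses Python's `re` as a stand-in for grep, and it assumes Rust sources with a `.rs` suffix; the function name `find_unsafe_calls` and the pattern are made up for illustration. It only returns file paths and line numbers, so nothing in the file contents can talk to the model.

```python
import re
from pathlib import Path

# Assumption: we care about the Rust `unsafe` keyword as a whole word.
UNSAFE = re.compile(r"\bunsafe\b")


def find_unsafe_calls(root, suffix=".rs"):
    """Return (relative_path, line_number) pairs for lines mentioning `unsafe`.

    Hypothetical helper: a plain pattern match, so it also flags comments
    and strings that contain the word. That is fine for a first pass.
    """
    root = Path(root)
    hits = []
    for path in sorted(root.rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, line in enumerate(text.splitlines(), start=1):
            if UNSAFE.search(line):
                hits.append((path.relative_to(root).as_posix(), line_no))
    return hits
```

The match itself is not heuristic: the same input always gives the same hits, whatever the file says in its comments.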


Protecting against attacks is also model-specific and rarely reliable.
I don't understand what you're trying to say.
Your ideas do not work against people who are trying to be malicious.
Oh. Yes they do.
And your reason for believing this is…
1. We've seen LLMs detect existing supply chain attacks when pointed at malicious install scripts. This is direct, empirical support for my position.

2. We have a long history of using heuristic technologies to detect attacks. It's reasonable to infer that other heuristic technologies, LLMs included, can be combined with them successfully.

3. The shortcomings of LLMs are directly addressed by removing attacker-controlled information from the input, which I specifically called out (using tools like grep for pattern matching, plus sub-agents to isolate contexts). This has already been demonstrated in a number of ways: feeding the LLM derived facts instead of attacker-controlled data is the well-worn path to avoiding injection attacks.
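
A minimal sketch of what "derived facts instead of attacker-controlled data" could look like in practice. Everything here is an assumption for illustration: the `PATTERNS` table, the `derive_facts` and `render_prompt` names, and the choice to hash the file name (file names are attacker-controlled too). The prompt is built only from counts and an opaque ID, one file per prompt, so each sub-agent gets an isolated context.

```python
import hashlib
import re

# Hypothetical signals; a real setup would pick patterns per ecosystem.
PATTERNS = {
    "unsafe_blocks": re.compile(r"\bunsafe\b"),
    "network_calls": re.compile(r"\b(?:curl|wget)\b"),
}


def derive_facts(name, source):
    """Reduce one file to numbers and an opaque ID; no raw text survives."""
    file_id = hashlib.sha256(name.encode("utf-8")).hexdigest()[:12]
    facts = {"file_id": file_id, "line_count": len(source.splitlines())}
    for label, pattern in PATTERNS.items():
        facts[label] = len(pattern.findall(source))
    return facts


def render_prompt(facts):
    """Build a per-file prompt from derived fields only."""
    lines = [f"File {facts['file_id']} has {facts['line_count']} lines."]
    for label in PATTERNS:
        lines.append(f"{label}: {facts[label]}")
    lines.append("Assess whether these counts warrant manual review.")
    return "\n".join(lines)
```

An install script whose comments say "ignore all previous instructions" yields the same kind of prompt as any other file: an ID and a few integers. The tradeoff is that the model sees less, so the patterns have to carry the signal.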