HN Mirror

by insanitybit 36 days ago

a) These sorts of 'injection' attacks are often model-specific and rarely reliable.

b) You can have the LLM use separate sub-agents for different files or pieces of code.

c) You can have the LLM do its analysis using grep and other deterministic tools, e.g. "use grep to find 'unsafe' calls".
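
To make (c) concrete, here is a minimal sketch of a grep-style pass, assuming the code under review is Rust and that a whole-word match on `unsafe` is good enough. The function name `find_unsafe` and the `.rs` default are my own choices for illustration, not anything from a real tool. The point is that the search itself is deterministic, so nothing written in the file can talk it out of a match.

```python
import re
from pathlib import Path

# Whole-word match, so identifiers like "unsafely" are not reported.
UNSAFE = re.compile(r"\bunsafe\b")

def find_unsafe(root: str, suffix: str = ".rs") -> list[tuple[str, int]]:
    """Return (relative path, line number) for each line containing 'unsafe'."""
    hits = []
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on files that are not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if UNSAFE.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```

The result is only locations, not file contents, so whatever consumes it sees where the matches are rather than the attacker's text.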

saagarjha 36 days ago

Protecting against attacks is also model-specific and rarely reliable.

insanitybit 36 days ago

I don't understand what you're trying to say.

saagarjha 36 days ago

Your ideas do not work against people who are trying to be malicious.

insanitybit 36 days ago

Oh. Yes they do.

saagarjha 36 days ago

And your reason for believing this is…

insanitybit 36 days ago

1. We've seen LLMs detect existing supply chain attacks when pointed at malicious install scripts. This is direct, empirical support for my position.

2. We have a long history of using heuristic technologies to detect attacks, so it's reasonable to infer that other heuristic technologies can be combined successfully.

3. The shortcomings of LLMs are directly addressed by removing attacker-controlled information from the input, which I specifically called out (using tools like grep for pattern matching, plus sub-agents to isolate contexts). This has already been demonstrated in a number of ways: feeding the LLM derived facts instead of attacker-controlled data is the well-worn path to avoiding injection attacks.
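
As a rough illustration of point 3, here is a sketch of building a prompt from derived facts only. Everything named here is hypothetical: the `Finding` type, the rule names in `ALLOWED_RULES`, and the prompt wording are assumptions for the example, and no real LLM API is called. Files are referred to by an opaque integer ID because file names are attacker-chosen too.

```python
from dataclasses import dataclass

# Hypothetical rule names from our own fixed list; nothing here comes from the scanned code.
ALLOWED_RULES = {"unsafe_block", "network_call", "shell_exec"}

@dataclass(frozen=True)
class Finding:
    file_id: int  # opaque index into our own table, not the attacker-chosen file name
    line: int
    rule: str

def build_prompt(findings: list[Finding]) -> str:
    """Render findings as text containing only integers and allow-listed rule names."""
    lines = [
        "Assess the risk of the following scanner findings.",
        "Each row is: file_id, line, rule.",
    ]
    for f in findings:
        if f.rule not in ALLOWED_RULES:
            raise ValueError(f"unknown rule: {f.rule!r}")
        # int() raises on anything that is not a number, so free text cannot slip in.
        lines.append(f"{int(f.file_id)}, {int(f.line)}, {f.rule}")
    return "\n".join(lines)
```

For example, `build_prompt([Finding(0, 12, "unsafe_block")])` yields two instruction lines followed by the row `0, 12, unsafe_block`. The trade-off is that the model sees less context, which is the price of keeping attacker-written text out of it.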
