Hacker News
by qsort 1158 days ago
The thing is that security is binary. One input out of a billion causes bad behavior and you're fucked, exactly like eval, execvpe, SQL injection and all their relatives.

The point isn't that you can't use LLM output, it's that you should always consider LLM output as potentially hostile. You can somewhat mitigate this by pairing an LLM with a deterministic system that only allows a predictable subset of behavior, but it's a tricky problem to eliminate completely.
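
To make the "deterministic system" idea concrete, here is a minimal sketch, not a definitive implementation. It assumes the LLM is asked to reply with a JSON object of the form {"action": ..., "args": {...}}; the action names, field names, and the `parse_llm_action` helper are all hypothetical and chosen only for illustration. Anything not on the allowlist is rejected before it can reach a real system.

```python
import json
import re

# Hypothetical allowlist for illustration: action name -> {field name: validator}.
# The regex checks only the YYYY-MM-DD shape, not whether the date is real.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def _is_date(value):
    return isinstance(value, str) and DATE_RE.fullmatch(value) is not None


def _is_short_text(value):
    return isinstance(value, str) and 0 < len(value) <= 100


ALLOWED_ACTIONS = {
    "add_appointment": {"date": _is_date, "title": _is_short_text},
    "list_appointments": {"date": _is_date},
}


def parse_llm_action(raw):
    """Treat raw LLM output as hostile: return (action, args) or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError("action is not on the allowlist")

    schema = ALLOWED_ACTIONS[action]
    args = data.get("args")
    # Require exactly the expected fields: no missing ones, no extras.
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError("unexpected or missing arguments")
    for name, is_valid in schema.items():
        if not is_valid(args[name]):
            raise ValueError(f"invalid value for {name!r}")
    return action, args
```

This only narrows the blast radius. A validated "title" is still attacker-influenced text, so whatever consumes it has to keep treating it as data (parameterized queries, escaping on display), which is why the problem is hard to remove completely.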


> you should always consider LLM output as potentially hostile

Sure, agreed. How is that different from human output?

"Human output" isn't automated nor connected to your production systems. Would you let any random user run arbitrary SQL against your production DB?

Not a random user, but an employee called or emailed by a random social engineer, yes. Notably, most real "hacking" is social engineering, and LLM prompt exploitation seems more like an extension of SE than technical hacking.

Is there a reason why most hacking is through social engineering? Possibly because that's often the weakest part of the entire security chain, specifically because humans are involved, and thus it's nearly always the lowest-hanging fruit for an attacker to target?

Is that a pattern we should be expanding? For sure, make the comparison when using GPT to aid with human tasks that can't be automated through any other means; but if you have a task that can be done just with a computer and without getting a human involved, it seems like a strict downgrade in security to put an LLM in the middle of it.

It's really good for security and reliability that there isn't a second human involved on top of me that I need to go through to add a calendar appointment to my phone.

> "Human output" isn't automated nor connected to your production systems.

Err... what?

How do you think businesses work?