by TeMPOraL
348 days ago
Fair.

> You just design the system to assume the LLM output isn't predictable, come up with invariants you can reason with, and drop all the outputs that don't fit the invariants.

Yes, this is what you do, but it also happens to defeat the whole reason people want to involve LLMs in a system in the first place.

People don't seem to get that the security problems are the flip side of the very features they want. That's why I'm in favor of anthropomorphising LLMs in this context - once you view the LLM not as a program, but as something akin to a naive, inexperienced human, the failure modes become immediately apparent.

You can't fix prompt injection like you'd fix SQL injection, for more or less the same reason you can't stop someone from making a bad but allowed choice when they delegate making that choice to an assistant, especially one with questionable intelligence or loyalties.
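For concreteness, the quoted approach (assume the output is unpredictable, define invariants, drop whatever fails them) might look roughly like the sketch below. Everything in it is an assumption made up for illustration: the JSON shape, the `ALLOWED_ACTIONS` allow-list, the `MAX_TEXT_LEN` limit, and the `check_invariants` name are all hypothetical and not taken from any real system.

```python
import json

# Hypothetical allow-list: the only actions the surrounding system will act on.
ALLOWED_ACTIONS = {"summarize", "translate", "tag"}
# Hypothetical size limit on the free-text field.
MAX_TEXT_LEN = 500


def check_invariants(raw_output: str):
    """Return the parsed output if it satisfies every invariant, else None."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # not valid JSON: drop
    if not isinstance(data, dict):
        return None  # valid JSON but not an object: drop
    if set(data) != {"action", "text"}:
        return None  # missing or unexpected keys: drop
    action = data["action"]
    text = data["text"]
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return None  # action outside the allow-list: drop
    if not isinstance(text, str) or len(text) > MAX_TEXT_LEN:
        return None  # wrong type or oversized text: drop
    return data


# An output that fits the invariants passes through unchanged.
print(check_invariants('{"action": "tag", "text": "hello"}'))
# An output asking for anything off the allow-list is silently dropped.
print(check_invariants('{"action": "delete_all", "text": "x"}'))
```

Note what the sketch does to the trade-off: the stricter the allow-list, the less open-ended behaviour is left for the LLM to provide, and the `text` field that still passes is free-form content the checks say nothing about.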
|
Everyone who's worked in big tech dev got this the first time their security org told them "No."
Some features are just bad security and should never be implemented.