by Terr_
608 days ago
> The key is doing it in a defensible manner, assuming the worst possible exploit at every angle. Red-team thinking, constantly. Principle of least privilege etc.

My rule-of-thumb is to imagine all LLMs are client-side programs running on the computer of a maybe-attacker, like JavaScript in the browser. It's a fairly familiar situation which summarizes the threat model pretty well:

1. It can't be trusted to keep any secrets that were in its training data.
2. It can't be trusted to keep the prompt-code secret.
3. With effort, a user can cause it to return whatever result they want.
4. If you shift it to another computer, it might be "poisoned" by anything left behind by an earlier user.
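
To make point 3 concrete, here is a minimal sketch of what "treat the LLM like browser JavaScript" could look like on the server side. Everything in it is hypothetical and only for illustration: the `ALLOWED_ACTIONS` allowlist, the `User` type, the `authorize_llm_action` helper, and the assumption that the model emits a JSON object with `action` and `order_id` fields. The idea is just that model output gets the same validation and authorization checks as any request body from an untrusted client.

```python
import json
from dataclasses import dataclass

# Hypothetical allowlist: the only actions the server will ever perform
# on the model's suggestion, no matter what the model outputs.
ALLOWED_ACTIONS = {"read_order", "cancel_order"}


@dataclass(frozen=True)
class User:
    """Hypothetical authenticated user; order_ids are the orders they own."""
    user_id: str
    order_ids: frozenset


def authorize_llm_action(raw_output: str, user: User) -> dict:
    """Validate model output the way you would validate a browser request.

    The model's text is assumed to be attacker-controlled, so authorization
    is decided from the authenticated user's privileges, never from the text.
    """
    try:
        action = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc

    if not isinstance(action, dict):
        raise ValueError("model output must be a JSON object")

    name = action.get("action")
    if not isinstance(name, str) or name not in ALLOWED_ACTIONS:
        raise PermissionError(f"action not allowed: {name!r}")

    order_id = action.get("order_id")
    if not isinstance(order_id, str) or order_id not in user.order_ids:
        raise PermissionError("order does not belong to the requesting user")

    # Return a freshly built dict so no unexpected extra fields pass through.
    return {"action": name, "order_id": order_id}
```

With a check like this in front of every side effect, a user who talks the model into emitting a request for someone else's order gets the same rejection they would get by hand-editing a request in the browser's dev tools.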