|
|
|
|
|
by rdli
167 days ago
|
|
Securing LLMs is just structurally different. The attack space is "the entirety of the human written language" which is effectively infinite. Wrapping your head around this is something we're only now starting to appreciate. In general, treating LLM outputs (no matter where) as untrusted, and ensuring classic cybersecurity guardrails (sandboxing, data permissioning, logging) is the current SOTA on mitigation. It'll be interesting to see how approaches evolve as we figure out more. |
|
[...]It may be illuminating to try to imagine what would have happened if, right from the start our native tongue would have been the only vehicle for the input into and the output from our information processing equipment. My considered guess is that history would, in a sense, have repeated itself, and that computer science would consist mainly of the indeed black art how to bootstrap from there to a sufficiently well-defined formal system. We would need all the intellect in the world to get the interface narrow enough to be usable,[...]
If only we had a way to tell a computer precisely what we want it to do...
https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...