|
|
|
|
|
by parhamn
299 days ago
|
|
With regards to llm injection, we sorta need the cat and mouse games to play out a bit, no? I have my concerns but I'm not ready to throw out the baby with the bathwater. You could never release an OS if "no zero days" was a requirement. Every piece of software we use has and will have its vulnerabilities (see Apple's recent RCE), we play the arms race and things look asymptotically fine. This seems to be the case in llms too. They're getting better and better (with a lot of research) at avoiding doing the bad things. I don't see why its fundamentally intractable to fence system/user/assistant/tool messages to prevent steering from non-trusted inputs, and building new fences for cases we want the steering. Why is this piece of software particularly different? |
|
But even ignoring that, the gulf between zero days and plain-text LLM prompt injection is miles wide.
Zero days require intensive research to find, and expertise to exploit.
LLM prompt injections obviously exist a priori, and exploiting them requires only the ability to write.