by arw0n
263 days ago
The biggest and hardest-to-mitigate attack vector is indirect prompt injection.[0] So far, most case studies have injected malicious prompts at inference time, but there is good reason to believe this can be done effectively at different stages of training as well.[1] Layering obfuscation techniques makes these attacks very hard to detect.

[0] https://arxiv.org/abs/2302.12173

[1] https://arxiv.org/html/2410.14827v3