Hacker News new | ask | show | jobs
by calpaterson 81 days ago
Yes, I of course link to this post, which I think is great. But I think actually it understates the case. All three parts of the trifecta (untrusted content, private data and external comms) are not necessary. Really, the key problem is just untrusted content in the context window. Access to private data and the ability to communicate externally are just modalities in which damage can occur.

For example: imagine having just untrusted content and private data (2/3 parts of the trifecta). The untrusted content can use a "Disregard that!" attack to cause the LLM to falsely modify the private data. So I think the whole "trifecta" is not necessary and the key thing is that you simply can't have untrusted stuff in your context window at any point.

1 comments

Oh yeah. I think simonw has created good vocab to talk about attacks but the trifecta is just one way to attack.

The difecta is:

* LLM can do something you'd rather it not.

* LLM reads untrusted text.