|
|
|
|
|
by Terr_
533 days ago
|
|
To use a narrow interpretation of "prompt injection", it comes from how all data is one undifferentiated stream. The LLM [0] isn't designed to detect self/other, let alone higher-level constructs like truth/untruth, consistent/contradictory, a theory-of mind for other entities, or whether you trust the motives of those entities. So I'd say the human equivalent of LLM prompt injection is whispering in the ear of a dreaming person to try to influence what they dream about. That said, I take some solace in the idea that humans have been trying to hack other humans for thousands of years, so it's not as novel a problem as it first appears. [0] Importantly, this is not be confused with characters that human readers may perceive inside LLM output, where we can read all sort of qualities including ones we know the author-LLM does not possess. |
|