HN Mirror


by cruffle_duffle 437 days ago

Damn. As somebody who was in the “there needs to be an out-of-band way to distinguish user content from ‘system content’” camp, you do raise an interesting point I hadn’t considered. Part of the agent workflow is to act on the instructions found in “user content”.

I dunno, though. Maybe the solution looks more like privilege levels than like parameterized SQL.
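
For anyone who hasn’t seen the SQL analogy spelled out, here is a minimal sketch using Python’s built-in sqlite3 module. The table, rows, and input string are made up for illustration. It only shows what “out of band” buys you in SQL; it is not a claim about how any LLM framework separates content.

```python
import sqlite3

# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (author TEXT, body TEXT)")
conn.execute("INSERT INTO notes VALUES ('alice', 'hello')")
conn.execute("INSERT INTO notes VALUES ('bob', 'secret')")

# Attacker-controlled text that tries to rewrite the query.
user_input = "nobody' OR '1'='1"

# In-band: user text is spliced into the query string itself,
# so the database cannot tell it apart from the query.
unsafe_sql = "SELECT body FROM notes WHERE author = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Out-of-band: the query and the value travel separately,
# so the value is only ever compared as data.
safe_rows = conn.execute(
    "SELECT body FROM notes WHERE author = ?", (user_input,)
).fetchall()

print(len(unsafe_rows))  # 2: the injected OR clause matched every row
print(len(safe_rows))    # 0: no author has that literal name
```

The agent case is harder than this because the “data” is sometimes supposed to be acted on, so a clean data/code split like the one above may not be enough on its own.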

I guess, rather than jumping to solutions, the real issue is that the actual problem needs to be clearly defined, and I don’t think it has been yet. Clearly you don’t want your “user generated content” to completely blow away your own instructions. But you also want that content to help guide the agent properly.

1 comment

thwarted 437 days ago

> Clearly you don’t want your “user generated content” to completely blow away your own instructions.

It's the same problem as "ignore all previous instructions" prompt injection, but at a different layer.
