Hacker News
by simonw 234 days ago
Yeah, there remains a very real problem: even against a system with no external communication and no ability to trigger harmful tools, a prompt injection can still influence the model's output in a way that misleads the human operator.