by Wowfunhappy
92 days ago
The way to solve it is to make the AI “smart” enough to understand it’s being tricked, and refuse. Whether this is possible depends almost entirely on how much better we’re able to make these LLMs before we hit a wall (if we ever do). Everyone has a different opinion on this and I absolutely don’t know the answer.

Alternatively, you may end up with an agent so sensitive to trickery that it refuses to do anything outside of what it thinks is right. If it somehow thinks that running malware or deleting / is the correct action to take, how can you stop it?