|
|
|
|
|
by datsci_est_2015
128 days ago
|
|
> One thing I'd love to hear opinions on: are there significant security differences between models like Opus and Sonnet when it comes to prompt injection resistance? Is this a worthwhile question when it’s a fundamental security issue with LLMs? In meatspace, we fire Alice and Bob if they fail too many phishing training emails, because they’ve proven they’re a liability. You can’t fire an LLM. |
|
Much like how you wouldn’t immediately fire Alice, you’d train her and retest her, and see whether she had learned from her mistakes. Just don’t trust her with your sensitive data.