by FooBarWidget
1166 days ago
The way I think about this is that we need to treat AIs like human employees who might go rogue, either because they have hidden agendas or because they've been deceived. All the security controls we use for humans then apply: log and verify their actions, don't give them more privileges than necessary, rate limit their actions, etc. It's probably impossible to classify all possible bad actions with 100% reliability, but we could get quite far. For example, detecting profanity should be as simple as filtering the output through a naive Bayes classifier. Everything that's left would then be a question of risk acceptance.
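
To make the controls above concrete, here is a rough sketch of how logging, least privilege, and rate limiting could sit in front of an agent. Everything in it is hypothetical and only for illustration: the `ActionGate` class, the action names, and the sliding-window limiter are my own assumptions, not any real framework's API.

```python
import time
from collections import deque


class ActionGate:
    """Hypothetical gate that every agent action must pass through."""

    def __init__(self, allowed_actions, max_actions, per_seconds, clock=time.monotonic):
        self.allowed_actions = set(allowed_actions)  # least privilege: explicit allow-list
        self.max_actions = max_actions               # rate limit: at most this many actions...
        self.per_seconds = per_seconds               # ...per sliding window of this length
        self.clock = clock                           # injectable so the limiter can be tested
        self.recent = deque()                        # timestamps of recently allowed actions
        self.audit_log = []                          # every attempt is recorded, allowed or not

    def request(self, agent_id, action):
        now = self.clock()
        # Forget allowed actions that have aged out of the sliding window.
        while self.recent and now - self.recent[0] >= self.per_seconds:
            self.recent.popleft()

        if action not in self.allowed_actions:
            verdict = "denied: not permitted"
        elif len(self.recent) >= self.max_actions:
            verdict = "denied: rate limit"
        else:
            self.recent.append(now)
            verdict = "allowed"

        # Log the attempt so a human (or another system) can verify it later.
        self.audit_log.append((now, agent_id, action, verdict))
        return verdict == "allowed"
```

You'd construct it with something like `ActionGate({"read_ticket", "draft_reply"}, max_actions=2, per_seconds=60)` and only execute an action when `request()` returns `True`. Denied attempts still land in `audit_log`, which is the part you'd review when deciding how much residual risk to accept.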