by nmca
2891 days ago
One approach being considered is "AI Safety Via Debate" [0], which hopes to prevent deception by carefully constructing games in which a superhuman agent's best strategy is honesty. Note that this is the goal; there is still much work to be done!

[0] https://arxiv.org/abs/1805.00899
I have wondered whether formalized, incentive-based design could be a workable field: one that ensures even a complete sociopath would find acting in a beneficial way to be the best option.