by sanxiyn
3670 days ago
I find this comment thread disappointing because no one seems to comment on the paper, which is quite technical. From the abstract: "We provide a formal definition of safe interruptibility and prove that Q-learning is already safely interruptible, and Sarsa is not but can easily be made so."
|
Basically, the paper discusses ways in which learning agents "will not learn to prevent (or seek!) being interrupted by the environment or a human operator. We provide a formal definition of safe interruptibility and exploit the off-policy learning property to prove that either some agents are already safely interruptible, like Q-learning, or can easily be made so, like Sarsa."[1]
It's an interesting result, and can probably be extended to other less hype-worthy scenarios.
[1] http://intelligence.org/files/Interruptibility.pdf
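For anyone unsure what the "off-policy learning property" buys you, here is a rough sketch using the textbook Q-learning and Sarsa update targets. This is not the paper's formalism; the function names, the toy Q-table, and the "go"/"stop" actions are all made up for illustration. The point is only that Sarsa's target depends on the action actually taken next (which an interruption can force), while Q-learning's target does not.

```python
# Toy illustration only: standard textbook update targets, hypothetical names.

def q_learning_target(Q, reward, next_state, gamma=0.9):
    # Off-policy: bootstraps from the best action in next_state,
    # regardless of which action is actually taken there.
    return reward + gamma * max(Q[next_state].values())

def sarsa_target(Q, reward, next_state, next_action, gamma=0.9):
    # On-policy: bootstraps from the action actually taken next,
    # so a forced (interrupted) action changes the target.
    return reward + gamma * Q[next_state][next_action]

# Hypothetical Q-table with a single state and two actions.
Q = {"s1": {"go": 1.0, "stop": 0.0}}

sarsa_normal = sarsa_target(Q, 0.0, "s1", "go")         # agent follows its own choice
sarsa_interrupted = sarsa_target(Q, 0.0, "s1", "stop")  # operator forces "stop"
q_either_way = q_learning_target(Q, 0.0, "s1")          # same value in both cases

print(sarsa_normal, sarsa_interrupted, q_either_way)
```

In this toy case the Sarsa target drops when the interruption forces "stop", so the learned values absorb the interruptions; the Q-learning target is unchanged. The paper's actual argument involves more than this, so treat it as intuition rather than a proof.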