|
|
|
|
|
by peripitea
2403 days ago
|
|
I'm not super familiar with AI/ML/RL at all, so I'm sure this is a naive question, but isn't it obvious that the answer is to just build in costs to the utility function for behaviors you want to avoid (what they seem to refer to as constrained RL in the article)? That seems both the simplest way to handle it, and most elegant in terms of mapping to the real world domain. Like are there alternate solutions that are even remotely competitive with this? I'm sure I must be oversimplifying and I assume that there's some nuance I'm missing. E.g. is this more about how you design those constraints to minimize the overall loss in learning efficiency, or something like that? |
|
Check out how Concrete Problems in AI Safety (Section 6 in particular is about safe exploration)
https://arxiv.org/pdf/1606.06565.pdf
Quote:
In practice, real world RL projects can often avoid these issues by simply hard-coding an avoidance of catastrophic behaviors. For instance, an RL-based robot helicopter might be programmed to override its policy with a hard-coded collision avoidance sequence (such as spinning its propellers to gain altitude) whenever it’s too close to the ground. This approach works well when there are only a few things that could go wrong, and the designers know all of them ahead of time. But as agents become more autonomous and act in more complex domains, it may become harder and harder to anticipate every possible catastrophic failure. The space of failure modes for an agent running a power grid or a search-and-rescue operation could be quite large. Hard-coding against every possible failure is unlikely to be feasible in these cases, so a more principled approach to preventing harmful exploration seems essential. Even in simple cases like the robot helicopter, a principled approach would simplify system design and reduce the need for domain-specific engineering