by jefft255
2399 days ago
That works, but in the setting you describe, the agent can only learn to avoid these "bad" things by first making those mistakes and learning from them. There are mistakes we don't want the agent to make, ever. That's what safe RL is about.