by est31
2399 days ago
The approach you describe is what the article calls "normal RL". Constrained RL is a more advanced variant in which you get direct control over how often a given safety constraint may be violated. In essence, constrained RL automates the step where you would otherwise hand-tune the "normal RL" penalties until they satisfy your constraints.
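
To make the "automating the penalty tuning" point concrete, here is a minimal sketch of one common way to do it, a Lagrangian-style update of the penalty weight. It is not taken from the article. The two-action toy problem, all the numbers, and the names `risky_probability`, `violation_rate` and `tune_penalty` are assumptions made up for illustration, and the "policy" is just a softmax best response rather than a trained agent.

```python
import math

# Hypothetical toy problem: a risky action pays more but sometimes
# violates the safety constraint; a safe action never does.
REWARD_RISKY, REWARD_SAFE = 1.0, 0.5
VIOLATION_PROB_RISKY = 0.5
TEMPERATURE = 0.1  # softness of the policy's choice (assumed value)


def risky_probability(penalty):
    """Softmax policy over the penalized rewards ("normal RL" with a fixed penalty)."""
    advantage = (REWARD_RISKY - penalty * VIOLATION_PROB_RISKY) - REWARD_SAFE
    return 1.0 / (1.0 + math.exp(-advantage / TEMPERATURE))


def violation_rate(penalty):
    """Expected fraction of steps on which the constraint is violated."""
    return risky_probability(penalty) * VIOLATION_PROB_RISKY


def tune_penalty(limit, steps=2000, lr=0.5):
    """Raise the penalty while violations exceed the limit, lower it otherwise."""
    penalty = 0.0
    for _ in range(steps):
        penalty = max(0.0, penalty + lr * (violation_rate(penalty) - limit))
    return penalty


if __name__ == "__main__":
    limit = 0.1  # tolerated violation rate, chosen directly by the user
    penalty = tune_penalty(limit)
    print(f"penalty={penalty:.3f}, violation rate={violation_rate(penalty):.3f}")
```

The only thing you specify is `limit`. The penalty that "normal RL" would need you to guess falls out of the loop, and a stricter limit ends up with a larger penalty without any manual adjustment.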