by klipt
2078 days ago
> empirically demonstrate improved goal-reaching performance and robustness over current RL algorithms

It's interesting that their choice of current algorithms includes PPO but not, e.g., DeepMind's Rainbow agent, which achieved state-of-the-art performance on many measures: https://arxiv.org/abs/1710.02298