by tastroder
2078 days ago
They mention Rainbow in the related work section of the third paper listed there (Kumar, A., Peng, X. B., & Levine, S. (2019). Reward-Conditioned Policies. arXiv:1912.13465), as part of this remark: "they are also known to be notoriously challenging to use effectively, due to sensitivity to hyper parameters, high sample complexity, and a range of important and delicate implementation choices that have a large effect on performance [5, 6, 12, 15, 23, 24, 46]."