Hacker News
by tastroder 2078 days ago
They mention Rainbow in the related work section of the third paper listed there (Kumar, A., Peng, X. B., & Levine, S. (2019). Reward-Conditioned Policies. arXiv:1912.13465), as part of this remark: "they are also known to be notoriously challenging to use effectively, due to sensitivity to hyper parameters, high sample complexity, and a range of important and delicate implementation choices that have a large effect on performance [5, 6, 12, 15, 23, 24, 46]."