Hacker News
by sirthomasjames 2643 days ago
I also found the under-the-hood implementation extremely hard to understand and extend. I couldn't grasp how concerns were separated among the different classes. The documentation lacks examples showing how any one of the algorithms (e.g. IMPALA, SAC, PPO) was built from scratch.