by MichaelRazum
364 days ago
TL;DR: just use PPO? I've always found it confusing that, on paper, SAC and other algorithms seem much more sample-efficient, but in practice, as the author mentions, they often don't work. PS: I'm not sure where to put algorithms like TD-MPC or DreamerV3, since they sit somewhere in between, right?