by inverse_pi
2852 days ago
Here's a two-sentence summary of the SOTA:

- Model-free methods have seen great success at learning high-dimensional tasks, but they are sample inefficient. In other words, they take too long to train on real robots. Examples of these methods are TRPO, PPO, ES, etc.
- Model-based methods are roughly an order of magnitude more sample efficient, and thus more practical on real-world robots. However, these methods have high bias, and most working models are simple in terms of representational power, e.g. Gaussian processes (GPs), time-varying linear models, or mixtures of Gaussians. Examples are PILCO, GPS, PETS, etc.

Of course, the SOTA is a lot more complicated, but that's a short explanation for your observation.
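To make the sample-efficiency point concrete, here is a minimal toy sketch of the model-based idea. It is not PILCO, GPS, or PETS; everything in it (the 1-D system `true_step`, its coefficients, the helper names) is a hypothetical stand-in I made up for illustration. It fits a linear dynamics model from five noise-free transitions and then plans one step with the learned model.

```python
import random

def true_step(x, u):
    # Hypothetical "real robot": 1-D linear dynamics, unknown to the learner.
    return 0.9 * x + 0.5 * u

def collect(n, seed=0):
    # Gather n random transitions (state, action, next_state) from the system.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, u = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append((x, u, true_step(x, u)))
    return data

def fit_linear_model(transitions):
    # Least-squares fit of x' ~ a*x + b*u via the 2x2 normal equations.
    sxx = sum(x * x for x, u, y in transitions)
    sxu = sum(x * u for x, u, y in transitions)
    suu = sum(u * u for x, u, y in transitions)
    sxy = sum(x * y for x, u, y in transitions)
    suy = sum(u * y for x, u, y in transitions)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b

# Five interactions with the system are enough to recover the model here.
a_hat, b_hat = fit_linear_model(collect(5))

# One-step plan under the learned model: pick u so the predicted next state is 0.
x = 1.0
u = -a_hat * x / b_hat
x_next = true_step(x, u)
print(a_hat, b_hat, x_next)
```

The flip side is the bias mentioned above: if the real dynamics were nonlinear, this linear model would be wrong no matter how much data it saw, and the plan would inherit that error. A model-free method makes no such structural assumption, which is part of why it typically needs far more interaction to get anywhere.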