by gajomi
2075 days ago
It seems to me that they are basically describing a variational formulation of the "optimization perspective" of reinforcement learning, which is cool, but I am confused... where is the supervised learning? Like, what is the input and what is the output?