by vinn124
3082 days ago
that is true: popular rl techniques (e.g. policy gradients) are very similar to "vanilla" supervised learning techniques and architectures, but they are unsupervised in the sense that they require zero human input. alphago zero is the canonical example of tabula rasa machine learning.
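
to make the similarity concrete, here is a minimal sketch, assuming a softmax over a handful of discrete outputs. the function names (`log_likelihood_grad`, `supervised_grad`, `policy_gradient`) are made up for illustration and do not come from any library. it shows that a REINFORCE-style policy gradient is the usual log-likelihood gradient scaled by a reward, with the sampled action standing in for the human-provided label.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood_grad(logits, index):
    # Gradient of log softmax(logits)[index] with respect to the logits:
    # one_hot(index) - probs.
    probs = softmax(logits)
    return [(1.0 if i == index else 0.0) - p for i, p in enumerate(probs)]

def supervised_grad(logits, label):
    # Supervised learning: the target index is a label supplied from outside.
    return log_likelihood_grad(logits, label)

def policy_gradient(logits, action, reward):
    # REINFORCE-style estimate: same gradient, but the index is an action the
    # policy chose itself, and the gradient is scaled by the observed reward.
    return [reward * g for g in log_likelihood_grad(logits, action)]

logits = [0.5, -1.0, 2.0]  # hypothetical scores for three outputs/actions
sup = supervised_grad(logits, 1)
pg_unit = policy_gradient(logits, 1, 1.0)
pg_double = policy_gradient(logits, 1, 2.0)
```

with a reward of 1 the two gradients coincide, and any other reward only rescales the update. what differs is where the target and the weight come from: a labeled dataset in one case, the agent's own play and its outcome in the other.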