by vinn124 3082 days ago
that is true: popular rl techniques (e.g. policy gradients) are very similar to "vanilla" supervised learning techniques and architectures, but they are unsupervised in the sense that they require zero human input.
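
to make the similarity concrete, here is a rough sketch (my own illustration, not taken from any particular library): for a softmax policy, the basic reinforce-style policy gradient loss is just the supervised cross-entropy loss on the sampled action, scaled by the return. the function names and the single-step setup are assumptions made for the example.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_loss(logits, label):
    # Standard supervised loss: -log p(label).
    return -math.log(softmax(logits)[label])

def cross_entropy_grad(logits, label):
    # Gradient of the loss above w.r.t. the logits: softmax(logits) - onehot(label).
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]

def reinforce_loss(logits, action, ret):
    # Hypothetical single-step surrogate loss: -ret * log pi(action).
    # The sampled action plays the role of the label; the return `ret`
    # comes from the environment rather than from a human annotator.
    return ret * cross_entropy_loss(logits, action)

def reinforce_grad(logits, action, ret):
    # Same gradient as the supervised case, scaled by the return.
    return [ret * g for g in cross_entropy_grad(logits, action)]
```

with `ret = 1.0` the two gradients are identical, so the only real difference is where the "label" and its weight come from: the agent's own sampled action and the reward signal, not a labeled dataset.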

alphago zero is the canonical example of tabula rasa machine learning.