by pigscantfly
3628 days ago
I work on a few algorithms that could be classified as RL given an open mind. Most of them learn distributions from streaming data via some kind of online EM. I know that people in the ad-serving, porn-serving, and website optimization (A/B stuff) sectors use RL pretty extensively as well, but I'm not one of them at the moment.
That's unsupervised learning afaik: clustering, manifolds, etc. Where is the "reinforcement" part there (agent, environment, reward)?