by 5olidor
3575 days ago
It seems like reinforcement learning would be a natural fit here. At a high level, a recommendation policy has to balance exploration (trying riskier recommendations) against exploitation (showing recommendations it already expects will get clicks), using click-throughs, watch time, etc. as reward signals. Does anyone know whether RL is used for recommendation in practical settings, and if so, what the current state of the art is?
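To make the explore/exploit trade-off concrete, here is a minimal sketch of an epsilon-greedy bandit, which is about the simplest form of the idea and not a claim about what any production recommender does. Everything in it is an assumption for illustration: the function name `epsilon_greedy_recommend`, the fixed per-video click probabilities in `click_rates`, and the use of a binary click as the only reward.

```python
import random


def epsilon_greedy_recommend(click_rates, epsilon=0.1, rounds=5000, seed=0):
    """Simulate epsilon-greedy recommendations over a fixed set of videos.

    click_rates: hypothetical true click probabilities (unknown to the policy).
    Returns (estimated click rate per video, times each video was shown).
    """
    rng = random.Random(seed)
    n = len(click_rates)
    shows = [0] * n          # how often each video was recommended
    estimates = [0.0] * n    # running mean of observed clicks per video

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: recommend a random video, even a "risky" one.
            arm = rng.randrange(n)
        else:
            # Exploit: recommend the video with the best estimate so far
            # (ties go to the lowest index).
            arm = max(range(n), key=lambda i: estimates[i])

        # Simulated reward: 1 if the user clicks, 0 otherwise.
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0

        # Incremental update of the running mean for the chosen video.
        shows[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / shows[arm]

    return estimates, shows


if __name__ == "__main__":
    estimates, shows = epsilon_greedy_recommend([0.05, 0.2, 0.6])
    print("estimated click rates:", estimates)
    print("times shown:", shows)
```

With `epsilon=0` the policy only ever exploits. In this sketch that means it keeps showing the first video forever if that video never gets a click, and never finds out that another one is better, which is the failure mode exploration is meant to avoid.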