Hacker News
by 5olidor 3575 days ago
It seems like reinforcement learning would be useful here. At a high level, forming a recommendation policy requires balancing exploration (trying riskier recommendations) against exploitation (showing recommendations it already knows are likely to get clicks), with click-throughs, time spent watching the video, etc. as reward signals.
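To make the exploration/exploitation trade-off concrete, here is a rough sketch using an epsilon-greedy bandit, which is a much simpler stand-in for full RL and not a claim about what any production recommender does. The class name, the `simulate` helper, and the per-item click probabilities are all made up for illustration. The reward is just 1 for a click and 0 otherwise; watch time could be swapped in as a different reward.

```python
import random


class EpsilonGreedyRecommender:
    """Epsilon-greedy bandit over a fixed set of items (illustrative only)."""

    def __init__(self, n_items, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_items    # how many times each item was shown
        self.values = [0.0] * n_items  # running mean reward per item
        self.rng = random.Random(seed)

    def recommend(self):
        # Explore: with probability epsilon, show a uniformly random item.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        # Exploit: otherwise show the item with the best estimated reward
        # (ties go to the lowest index).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, item, reward):
        # Incremental mean update: v <- v + (r - v) / n
        self.counts[item] += 1
        self.values[item] += (reward - self.values[item]) / self.counts[item]


def simulate(click_rates, steps=5000, epsilon=0.1, seed=0):
    """Run the recommender against hypothetical per-item click probabilities."""
    rec = EpsilonGreedyRecommender(len(click_rates), epsilon, seed)
    env = random.Random(seed + 1)  # separate RNG for the simulated user
    for _ in range(steps):
        item = rec.recommend()
        reward = 1.0 if env.random() < click_rates[item] else 0.0
        rec.update(item, reward)
    return rec


if __name__ == "__main__":
    rec = simulate([0.02, 0.05, 0.30])
    print("times shown:", rec.counts)
    print("estimated click rates:", [round(v, 3) for v in rec.values])
```

A larger `epsilon` spends more impressions on exploration and finds good items faster, at the cost of showing more poor recommendations; `epsilon=0` never explores and can stay stuck on whichever item it happened to try first.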

Does anyone know whether RL is used for recommendation in practical settings, and if so what is the current state of the art?

1 comment

This is a very natural avenue and an active area of research at Google/DeepMind. Stay tuned...