Hacker News
by smu3l 1828 days ago
Multi-armed bandits and contextual bandits are essentially causal inference with a cooler name. You can formulate both within a potential outcomes/counterfactual framework, and contextual bandits are typically presented that way.

(Bandits are often presented as a loop where you control the policy collecting data and update it frequently, but that does not have to be the case.)
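To make the "no loop required" point concrete, here is a minimal sketch of off-policy evaluation with inverse propensity scoring (IPS): data logged once under one policy is reused to estimate how a different policy would have done. The two-arm simulation, the reward rates, and every name below (`ips_value`, `simulate_logs`, `always_arm_1`) are illustrative assumptions, not something from the comment above.

```python
import random


def ips_value(logs, target_policy):
    """IPS estimate of the mean reward a target policy would have earned.

    logs: iterable of (context, action, reward, propensity) tuples, where
    propensity is the probability the logging policy gave the logged action.
    target_policy(context, action): probability the new policy picks action.
    """
    total = 0.0
    n = 0
    for context, action, reward, propensity in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action.
        total += reward * target_policy(context, action) / propensity
        n += 1
    return total / n


def simulate_logs(n, seed=0):
    """Hypothetical logged data: two arms, uniform-random logging policy."""
    rng = random.Random(seed)
    true_mean = {0: 0.2, 1: 0.6}  # assumed click rates, for illustration only
    logs = []
    for _ in range(n):
        action = rng.randrange(2)  # each arm logged with probability 0.5
        reward = 1.0 if rng.random() < true_mean[action] else 0.0
        logs.append((None, action, reward, 0.5))
    return logs


def always_arm_1(context, action):
    """Target policy that deterministically plays arm 1."""
    return 1.0 if action == 1 else 0.0


logs = simulate_logs(20000)
estimate = ips_value(logs, always_arm_1)
print(f"IPS estimate for always playing arm 1: {estimate:.3f}")
```

The estimate should land near the assumed 0.6 rate for arm 1, even though the target policy never collected any data itself. That is the counterfactual question ("what would have happened under a different policy?") answered from a fixed log.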

Recommender systems and search/counterfactual learning-to-rank can be thought of as extensions of counterfactual bandits as well.

1 comment

Hmmm. I don't see any connection between recommender systems and causal inference.

If it helps, one connection I see is that recommender systems often involve causal questions like: “How will user behavior change if we change the order these results appear in, or if we change which results appear on the first page of results?” Additionally, since we can only show one ranked set of answers for each query, counterfactual questions also rapidly arise about what would have happened if we had answered past queries differently.
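As a rough illustration of the ranking case, the sketch below assumes a simple position-based click model: a click happens only if the user examines the position and finds the item relevant, and the examination probability per position is known. Under that assumption, weighting each click by the inverse examination probability corrects for the fact that lower-ranked results are seen less often. The function name, the log format, and the numbers are all hypothetical.

```python
from collections import defaultdict


def debiased_relevance(click_logs, examine_prob):
    """Position-debiased relevance estimate per item.

    click_logs: iterable of (item, position, clicked) tuples.
    examine_prob: examine_prob[position] is the assumed probability that
    a user looks at that position at all.
    """
    weighted_clicks = defaultdict(float)
    impressions = defaultdict(int)
    for item, position, clicked in click_logs:
        impressions[item] += 1
        if clicked:
            # A click at a rarely examined position counts for more.
            weighted_clicks[item] += 1.0 / examine_prob[position]
    return {item: weighted_clicks[item] / impressions[item] for item in impressions}


# Hypothetical log: "a" was always shown first, "b" always second.
examine_prob = [1.0, 0.5]
click_logs = [
    ("a", 0, True), ("a", 0, True), ("a", 0, False), ("a", 0, False),
    ("b", 1, True), ("b", 1, False), ("b", 1, False), ("b", 1, False),
]

scores = debiased_relevance(click_logs, examine_prob)
print(scores)
```

Raw click-through rate would rank "a" (2 of 4) above "b" (1 of 4), but once the position effect is accounted for the two items look equally relevant. In other words, the log alone cannot say what would have happened if "b" had been shown first without some causal assumption about how position affects clicks.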