by smu3l
1828 days ago
Multi-armed bandits and contextual bandits are essentially causal inference with a cooler name. You can formulate both in a potential outcomes/counterfactual framework, and contextual bandits are typically presented that way. (Bandits are often presented as a loop in which you control the policy that collects the data and update it frequently, but that does not have to be the case.) Recommender systems and search/counterfactual learning-to-rank can be thought of as extensions of counterfactual bandits as well.
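
To make the potential-outcomes reading concrete, here is a minimal sketch of the offline case, where the data were logged by one policy and you want the value of another. It uses inverse propensity scoring (IPS), a standard causal-inference estimator. Everything in it is an assumption for illustration: the reward probabilities, the logging policy, the target policy, and the function names are all made up.

```python
import random

# Hypothetical potential-outcome means: P(reward = 1 | context, action).
REWARD_PROB = {(0, 0): 0.2, (0, 1): 0.6, (1, 0): 0.7, (1, 1): 0.3}
LOGGING_P1 = 0.3  # assumed logging policy: picks action 1 with prob 0.3


def simulate_logged_data(n, seed=0):
    """Log (context, action, reward, propensity) under the logging policy."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        x = rng.randint(0, 1)  # binary context, uniform
        a = 1 if rng.random() < LOGGING_P1 else 0
        propensity = LOGGING_P1 if a == 1 else 1.0 - LOGGING_P1
        # Only the potential outcome for the chosen action is observed.
        r = 1.0 if rng.random() < REWARD_PROB[(x, a)] else 0.0
        logs.append((x, a, r, propensity))
    return logs


def target_policy(x):
    """Hypothetical deterministic policy we want to evaluate offline."""
    return 1 if x == 0 else 0


def ips_value(logs, policy):
    """IPS estimate of the policy's expected reward from logged data."""
    total = 0.0
    for x, a, r, propensity in logs:
        if policy(x) == a:  # keep only rows where the policies agree
            total += r / propensity  # reweight by 1 / P(logged action)
    return total / len(logs)


logs = simulate_logged_data(100_000)
estimate = ips_value(logs, target_policy)
# Ground truth is known here only because the data are simulated.
true_value = 0.5 * REWARD_PROB[(0, 1)] + 0.5 * REWARD_PROB[(1, 0)]
print(f"IPS estimate: {estimate:.3f}, true value: {true_value:.3f}")
```

The reweighting step is the same idea as propensity weighting for treatment effects: the logged action plays the role of the treatment, and the target policy's value is a counterfactual quantity.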