|
|
|
|
|
by smokel
770 days ago
|
|
For those not in the know, "MAB" is short for Multi-Armed Bandit [1], which is a decision-making framework that is often discussed in the broader context of reinforcement learning. In my limited understanding, MAB problems are simpler than those tackled by Deep Reinforcement Learning (DRL), because typically there is no state involved in bandit problems. However, I have no idea about their scale in practical applications, and would love to know more about said business problem. [1] https://en.wikipedia.org/wiki/Multi-armed_bandit |
|