by jmward01
770 days ago
I modeled part of my company's business problem as a MAB problem and cut our biggest cost by 10%. Just as important, it gave us an automated truth signal that helped us understand what was, and wasn't, working in several of our features. Like all tools, finding the right place to use RL concepts is a big deal. I think one thing that is often missed in a classroom setting is real-world examples of where powerful ideas can be used. Talking about optimal policies is great, but if you don't help people understand where those ideas can be applied, then it is just a bunch of fun math (which is often a good enough reason on its own :).
In my limited understanding, MAB problems [1] are simpler than those tackled by Deep Reinforcement Learning (DRL), because typically there is no state involved in bandit problems. However, I have no idea about their scale in practical applications, and would love to know more about said business problem.
[1] https://en.wikipedia.org/wiki/Multi-armed_bandit
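
To make the "no state" point concrete, here is a rough sketch of a bandit loop in Python. It uses epsilon-greedy, which is just one common strategy picked for illustration, not necessarily what the parent commenter used. The function name `run_epsilon_greedy`, the `true_rates` payout probabilities, and the default parameters are all made up for the example. Note that the only thing the learner tracks is a running average per arm; there is no environment state that an action can change.

```python
import random


def run_epsilon_greedy(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy learner on Bernoulli arms.

    true_rates is a hypothetical list of per-arm success probabilities.
    Returns (pull counts per arm, estimated value per arm, total reward).
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    n_arms = len(true_rates)
    counts = [0] * n_arms              # how often each arm was pulled
    values = [0.0] * n_arms            # running mean reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: values[a])

        # The reward depends only on the chosen arm, not on any state.
        reward = 1 if rng.random() < true_rates[arm] else 0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


if __name__ == "__main__":
    counts, values, total = run_epsilon_greedy([0.02, 0.05, 0.04])
    print("pulls per arm:", counts)
    print("estimated rates:", [round(v, 3) for v in values])
    print("total reward:", total)
```

In a full RL setting the reward and the next situation would also depend on a state that earlier actions helped shape, which is roughly where the extra difficulty comes from.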