by alexgmcm
2894 days ago
A policy learning approach is better imho, but getting people to switch to a multi-armed bandit when they are used to A/B testing can be difficult. People don't seem to trust the system to make the right decisions, even though you can run simulations and have the mathematics to show it is correct.
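For anyone wondering what such a simulation might look like, here is a rough sketch. It compares a fixed, even A/B split with one common bandit policy, Thompson sampling on Bernoulli arms. The choice of Thompson sampling, the `simulate` function, and the conversion rates are all assumptions made up for illustration, not anything specific to a real system.

```python
import random


def simulate(true_rates, n_rounds, policy, seed=0):
    """Run one simulated experiment and return (total_reward, pulls_per_arm).

    true_rates: hypothetical conversion probability of each variant.
    policy: "ab" for a fixed even split, "thompson" for Thompson sampling.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    wins = [0] * k    # observed conversions per arm
    losses = [0] * k  # observed non-conversions per arm
    total_reward = 0

    for t in range(n_rounds):
        if policy == "ab":
            # Classic A/B test: rotate through the arms, ignoring results so far.
            arm = t % k
        elif policy == "thompson":
            # Draw a plausible rate for each arm from its Beta posterior
            # (uniform Beta(1, 1) prior) and play the arm with the best draw.
            draws = [rng.betavariate(wins[i] + 1, losses[i] + 1) for i in range(k)]
            arm = max(range(k), key=lambda i: draws[i])
        else:
            raise ValueError(f"unknown policy: {policy}")

        # Simulate whether this visitor converts.
        reward = 1 if rng.random() < true_rates[arm] else 0
        wins[arm] += reward
        losses[arm] += 1 - reward
        total_reward += reward

    pulls = [wins[i] + losses[i] for i in range(k)]
    return total_reward, pulls


if __name__ == "__main__":
    rates = [0.05, 0.15]  # made-up conversion rates for variants A and B
    for policy in ("ab", "thompson"):
        reward, pulls = simulate(rates, 5000, policy)
        print(policy, "conversions:", reward, "traffic per arm:", pulls)
```

Running it over several seeds and showing how the bandit shifts traffic toward the better variant while the even split keeps sending half the visitors to the worse one is often more persuasive to sceptics than the maths alone.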