|
|
|
|
|
by fjdjshsh
265 days ago
|
|
>Things also become very difficult to reason about because their is state in the bandit stats that are being used to optimize things. You can often think of that as a black box, but sometimes you need to look inside and it can be very difficult. One way to peak into the state is to use bayesian models to represent the "belief" state of the bandits. For example, the arm's "utility" can be a linear function of the features of the arm. At each period, you can inspect the coefficients (and their distribution) for each arm. See this package: https://github.com/bayesianbandits/bayesianbandits |
|