by orasis 515 days ago
Having significant experience with bandits in production, I strongly recommend using them only when feedback is immediate. If the reward is at all disconnected from the action, you likely won’t be happy with the results.
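To make the concern concrete, here is a minimal simulation sketch, not a production setup: an epsilon-greedy Bernoulli bandit whose rewards only reach the estimates `delay` steps after the action. Everything in it (the `run_bandit` name, the `delay` and `epsilon` parameters, the arm rates) is a hypothetical choice for illustration, and it models only delayed feedback, not noisy attribution.

```python
import random
from collections import deque


def run_bandit(true_rates, steps, delay, epsilon=0.1, seed=0):
    """Epsilon-greedy Bernoulli bandit with rewards arriving `delay` steps late.

    Returns (fraction of pulls on the best arm, number of rewards credited).
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms       # credited observations per arm
    totals = [0.0] * n_arms    # credited reward sum per arm
    pending = deque()          # (step at which reward becomes visible, arm, reward)
    best_arm = max(range(n_arms), key=lambda a: true_rates[a])
    best_pulls = 0
    credited = 0

    def estimate(a):
        # Arms with no credited feedback yet are valued at 0.0.
        return totals[a] / pulls[a] if pulls[a] else 0.0

    for t in range(steps):
        # Explore with probability epsilon, or while no feedback has arrived.
        if credited == 0 or rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=estimate)

        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pending.append((t + delay, arm, reward))
        if arm == best_arm:
            best_pulls += 1

        # Credit every reward whose delay has elapsed (immediately if delay == 0).
        while pending and pending[0][0] <= t:
            _, past_arm, past_reward = pending.popleft()
            pulls[past_arm] += 1
            totals[past_arm] += past_reward
            credited += 1

    return best_pulls / steps, credited


if __name__ == "__main__":
    for d in (0, 50, 500):
        frac, seen = run_bandit([0.2, 0.5], steps=1000, delay=d)
        print(f"delay={d}: best-arm fraction={frac:.2f}, rewards credited={seen}")
```

With `delay=0` every pull updates the estimates before the next choice; with a larger `delay` the policy keeps acting on stale estimates, and the last `delay` rewards are never seen at all. Comparing the best-arm fraction across delays and seeds gives a rough feel for the cost, under these assumptions only.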