Hacker News
by brockf 3473 days ago
Great points. It's definitely more challenging than learning to play a simple arcade game or something, where feedback is invariant and often instantaneous. To address these challenges, we use a combination of (1) heuristics that tailor our RL algorithms to the problem at hand and (2) many converging sources of feedback. Most importantly, as with any machine learning implementation, the real test is whether it works in practice, and it does: our AI-driven campaigns beat randomized control conditions!