Hacker News
by maskil 613 days ago
You will still need clear benchmarks to serve as the reward for RL. With chess, the rules are simple, but you may not have a clear loss function for a complicated architectural challenge.
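
To make that concrete, here's a rough sketch of the contrast. Everything in it is made up for illustration: the function names, the metric names (`latency_ms`, `monthly_cost_usd`), the two designs, and the weights are all hypothetical. The chess reward falls straight out of the game result, while the "architecture" reward depends entirely on weights somebody had to pick.

```python
def chess_reward(result: str) -> float:
    """Reward from White's side, given a standard result string."""
    return {"1-0": 1.0, "0-1": -1.0, "1/2-1/2": 0.0}[result]


def architecture_reward(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Hypothetical proxy reward: a weighted sum of hand-picked benchmarks."""
    return sum(weights[name] * metrics[name] for name in weights)


# Two made-up designs measured on two made-up benchmarks.
design_a = {"latency_ms": 100.0, "monthly_cost_usd": 1000.0}
design_b = {"latency_ms": 200.0, "monthly_cost_usd": 500.0}

# Two weightings that both look reasonable (negative because lower is better).
favor_speed = {"latency_ms": -1.0, "monthly_cost_usd": -0.01}
favor_cost = {"latency_ms": -0.01, "monthly_cost_usd": -1.0}

for label, weights in [("favor_speed", favor_speed), ("favor_cost", favor_cost)]:
    score_a = architecture_reward(design_a, weights)
    score_b = architecture_reward(design_b, weights)
    winner = "design_a" if score_a > score_b else "design_b"
    print(f"{label}: {winner} scores higher")
```

Under these assumed numbers the preferred design flips depending on which weighting you choose, which is the kind of ambiguity a chess result never has.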