by hooande
2386 days ago
They (basically) applied the ideas from a bot that plays poker to another game. It's interesting work, though perhaps not groundbreaking. This idea of self-play + counterfactual regret minimization does seem to be the superior way to solve game-theoretic problems. Identifying valuable game-theoretic problems remains a challenge...
The most surprising takeaway is just how effective search was. People were viewing Hanabi as a reinforcement learning challenge, but we showed that adding even a simple search algorithm can lead to larger gains than any existing deep RL algorithm could achieve. Of course, search and RL are completely compatible, so you can combine them to get the best of both worlds, but I think a lot of researchers underestimated the value of search.