by jakobnicolaus 2698 days ago
Thanks for your summary of Hanabi! You can find an example of your hypothetical AI in our recent paper: https://arxiv.org/abs/1811.01458. Note, though, that all the conventions and rules are learned rather than hand-coded, and your 'giant lookup table' is just a feedforward neural network. We also link to 300 games in case you are curious to see how it works.
2 comments

I've only Hanabbed it up a handful of times (got 25 with strict play, no mistakes, with my brother over Christmas!), but I'm convinced that there is too much emphasis on these "conventions," which require either a pregame explanation and agreement or, more often, in-game argument (which is against the rules). I prefer an emphasis on card holding that constantly displays your knowledge of your own hand. To me this isn't against the communication rule, because the cards have to be on display. They can be held sideways, higher or lower, or upside down (the cards are orientable on both faces; people don't notice this!), and I hold them between certain fingers to remember number values. While that's just another convention, one can observe it develop from the beginning game state and see how a player rearranges their hand every time they receive a clue. In summary, card holding is very effective and information-rich, can be deduced without words, and seems to me perfectly natural to the game. It also relieves a lot of the cognition devoted to remembering what other people know about their own hands. When my brother and I discussed a UI for a digital version of Hanabi, we talked mostly about how customizable the card holding would be, which would be reflected exactly to the other players.
Thanks for the paper, that is really interesting.

For those who read my previous comment, an example of pre-game communication collusion that the AI in the paper invented was to decide that any hint involving red or yellow _also_ means that your most recently acquired tile is immediately playable.

"Roughly 40% of the information is obtained through conventions rather than through the grounded information and card counting"

(what I have been describing as "collusion" the paper describes as "conventions" but it amounts to the same thing -- you can pass a lot more information than the hints imply if you can plan ahead)
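To make the red/yellow example concrete, here is a rough Python sketch of how a hint can carry both a literal meaning and a conventional one. This is only an illustration: the names (`Card`, `Hint`, `grounded_info`, `convention_info`, `SIGNAL_COLORS`) are made up for this comment, the "newest card is last in the hand" ordering is an assumption, and in the paper the convention is learned by a neural network rather than written as a rule like this.

```python
from dataclasses import dataclass

# Assumption: these are the colors that carry the extra meaning,
# taken from the red/yellow example described in the comment above.
SIGNAL_COLORS = {"red", "yellow"}


@dataclass
class Card:
    color: str
    rank: int


@dataclass
class Hint:
    kind: str      # "color" or "rank"
    value: object  # e.g. "red" or 3


def grounded_info(hand, hint):
    """Positions the hint literally touches (what the rules say it means)."""
    if hint.kind == "color":
        return [i for i, card in enumerate(hand) if card.color == hint.value]
    return [i for i, card in enumerate(hand) if card.rank == hint.value]


def convention_info(hand, hint):
    """Extra inference under the hypothetical convention.

    A red or yellow color hint also signals that the newest card is
    playable. Assumption: the newest card is stored last in the hand.
    Returns the position of that card, or None if no signal is sent.
    """
    if hint.kind == "color" and hint.value in SIGNAL_COLORS:
        return len(hand) - 1
    return None


hand = [Card("red", 1), Card("blue", 3), Card("green", 1)]
red_hint = Hint("color", "red")

literal = grounded_info(hand, red_hint)     # positions that are red
implied = convention_info(hand, red_hint)   # position signaled as playable
```

The point of the sketch is that the same hint yields two pieces of information: the positions it touches under the rules, and a second position that only means something if both players agreed on (or learned) the convention beforehand.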

That is super fascinating.

Yes, this was the focus of our method: allowing agents to interpret the actions of others, while also learning to be interpretable when observed by other agents.