by cepth 2447 days ago

As an addendum to this very good comment, that Nature article had a couple of big asterisks.

* Bet sizes were restricted. E.g. humans and the bot could only bet fixed sizes, like 1/4 pot, 1/2 pot, full pot etc. Creative bet sizing is one of the skills that distinguishes top pros.

* Stack sizes were reset after every hand. I.e. every player in the hand was given the same number of chips at the start of every hand. How you performed previously in the session thus did not matter. Anyone who has played poker knows that this is highly unrealistic. Larger stacks confer an ability to bully smaller ones, and stack sizes greatly affect what range of hands you can reasonably play.

The point being, even a supercomputer running the most efficient heuristic-based poker decision-making programs has not yet been able to beat humans in a game that resembles a real 6- or 9-person table.

---

Just for reference, on a four-year-old quad-core/8-thread Intel i7-based desktop with 32 GB of RAM, solving a SINGLE hand in PioSolver (the most popular poker solver) from the flop through the river takes my machine about 7 minutes. The game tree alone takes up 4 GB of RAM, and in this scenario there are only two players, each restricted to 3 bet sizes.

The idea that this kind of computation can be done on a phone is ludicrous.

kevinwang 2446 days ago

Hmmm I could be wrong but I believe it's not true that humans could only bet fixed sizes. Instead, the AI was only pretrained with fixed sizes and had to do some kind of live search algorithm for any size outside of those values, which could be what you're referring to.

Stack sizes were reset to keep the research minimally scoped; taking stack sizes into account likely does not require a quantum leap in research.

This is getting pretty off topic, but the computation could be done online.

cepth 2446 days ago

Yeah, seems like you're correct here.

I went back and re-read the pre-print here (https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.p...). On page 2:

> To reduce the complexity of forming a strategy, Pluribus only considers a few different bet sizes at any given decision point. The exact number of bets it considers varies between one and 14 depending on the situation. Although Pluribus can limit itself to only betting one of a few different sizes between $100 and $10,000, when actually playing no-limit poker, the opponents are not constrained to those few options. What happens if an opponent bets $150 while Pluribus has only been trained to consider bets of $100 or $200? Generally, Pluribus will rely on its search algorithm, described in a later section, to compute a response in real time to such “off-tree” actions.
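
To make the "off-tree" distinction in that passage concrete, here's a toy sketch. It only checks whether an observed bet is one of the sizes the strategy was built around; it says nothing about how the real-time search itself works. The function name and the set of sizes are hypothetical, taken from the $100/$200/$150 example in the quote, and this is not Pluribus code.

```python
def classify_bet(bet, considered_sizes):
    """Label a bet 'on-tree' if it is one of the considered sizes, else 'off-tree'.

    Hypothetical helper for illustration only; not from the paper.
    """
    return "on-tree" if bet in considered_sizes else "off-tree"


# Sizes from the example in the quoted passage (assumed for illustration).
considered = {100, 200}

for bet in (100, 150, 200):
    print(bet, classify_bet(bet, considered))
```

Per the quote, the $150 case is the one that would be handed to the search algorithm rather than looked up directly.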

Good catch, and thanks for the correction.

Regarding the effect of stack sizes, I'm not certain about this, but my intuition is that there is some effect on perceived ranges of the other 5 players at the table if stack sizes vary. Since Facebook AI will not be releasing Pluribus code or pre-trained models/weights, we can't be certain, but things like stack-to-pot ratio (SPR) would seem to matter.
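
For anyone unfamiliar with SPR, it is just the effective stack divided by the pot. A rough sketch for the heads-up case, where the effective stack is the smaller of the two stacks (the chip counts below are made up for illustration, and the function name is my own):

```python
def stack_to_pot_ratio(stack_a, stack_b, pot):
    """SPR for a two-player pot: the smaller stack divided by the pot size."""
    if pot <= 0:
        raise ValueError("pot must be positive")
    effective_stack = min(stack_a, stack_b)
    return effective_stack / pot


# Same pot, different stacks (illustrative numbers only).
print(stack_to_pot_ratio(10_000, 10_000, 500))  # both players deep
print(stack_to_pot_ratio(10_000, 1_500, 500))   # one player short
```

With equal resets every hand, the first situation is the only one that ever comes up at the start of a hand; with carried-over stacks, the second kind shows up too, and a low SPR generally means fewer decisions before someone is all-in.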

Of course, you could always make the argument that human players in a cash game can re-up/refill to the maximum buy-in whenever they're short, but that's another discussion altogether.