by Cybiote 2533 days ago
Adaptability is certainly not necessary (almost by definition) if you're playing a near-equilibrium strategy, but adaptability is a useful skill to have in a general non-stationary world.

That said, for this bot, I wouldn't say it's playing completely independently of the other players' interior state. Pluribus must infer its opponents' strategy profile and, according to the paper, maintains a distribution over possible hole cards and updates its belief according to observed actions. This is part of playing in a minimally exploitable way in such a large space for an imperfect-information game.

1 comments

> Pluribus must infer its opponents' strategy profile

This is what interests me. It doesn’t do this. In fact, because it played only against itself, it should be assumed that the only strategy profile it considers is its own.

You're right that it uses itself as a prototype for decisions, but the fact that it also maintains a probability distribution over possible hole cards and updates it according to observed actions is already richer than the local-decision-only approach taken by most other bots. This is sort of forced by the simplicity of poker's action space combined with the large search space and imperfect information. Here, the simplicity ends up making things more difficult! They also use multiple play styles as "continuation strategies", so it's a bit more robust. And to be fair, I suspect many human players also use themselves and their own experience as a stand-in for their opponents.
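
To make the belief-update point concrete, here is a rough sketch of a Bayes-rule update over an opponent's possible hole cards, using the bot's own strategy as the stand-in model of the opponent. This is not Pluribus's actual code: the function name `update_belief`, the three-hand toy range, and all the probabilities are made up for illustration.

```python
def update_belief(prior, assumed_strategy, observed_action):
    """Bayes' rule: P(hand | action) is proportional to P(action | hand) * P(hand).

    prior: dict mapping hand -> prior probability
    assumed_strategy: dict mapping hand -> {action: probability}; here it plays
        the role of "what I would do with that hand" (an assumption)
    """
    unnormalized = {
        hand: p * assumed_strategy[hand].get(observed_action, 0.0)
        for hand, p in prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0:
        # The action is impossible under the assumed strategy; keep the prior.
        return dict(prior)
    return {hand: p / total for hand, p in unnormalized.items()}


# Toy range and strategy (hypothetical numbers, not from the paper).
prior = {"AA": 0.25, "KQ": 0.25, "72": 0.50}
strategy = {
    "AA": {"raise": 0.9, "call": 0.1, "fold": 0.0},
    "KQ": {"raise": 0.4, "call": 0.5, "fold": 0.1},
    "72": {"raise": 0.1, "call": 0.2, "fold": 0.7},
}

posterior = update_belief(prior, strategy, "raise")
for hand, p in posterior.items():
    print(f"{hand}: {p:.3f}")
```

After seeing a raise, the weight shifts toward the strong hand even though the bot never modeled this particular opponent; it only asked how often it would raise with each hand itself.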