by zardo
3761 days ago
Or, rather than multiple policies, use one policy that takes a player vector as input along with the board position. Players that you predict will make the same move from a given board have their vectors adjusted toward each other and away from a random sample of other player vectors. If it works, you would be able to perform player vector math à la word2vec. (No idea if it will work.)
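To make the vector-adjustment idea concrete, here is a rough sketch of one update step, assuming a word2vec-style negative-sampling loss on dot products. The loss, the function name `update_player_vectors`, and the learning rate are my assumptions for illustration; nothing above pins down the actual objective, and this is untested as a way to model players.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def update_player_vectors(vecs, a, b, negatives, lr=0.1):
    """One hypothetical negative-sampling step, modifying vecs in place.

    vecs      : (num_players, dim) float array of player vectors
    a, b      : indices of two players predicted to make the same move
    negatives : indices of randomly sampled other players
                (assumed not to include a or b)
    """
    va = vecs[a].copy()
    vb = vecs[b].copy()
    grad_a = np.zeros_like(va)

    # Positive pair: raise the similarity (dot product) of a and b.
    g = 1.0 - sigmoid(va @ vb)
    grad_a += g * vb
    vecs[b] += lr * g * va

    # Negative samples: lower the similarity between a and each sample.
    for n in negatives:
        vn = vecs[n].copy()
        g = -sigmoid(va @ vn)
        grad_a += g * vn
        vecs[n] += lr * g * va

    vecs[a] += lr * grad_a
    return vecs

# Toy usage: players 0 and 1 agreed on a move, player 2 is a random negative.
vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
update_player_vectors(vecs, 0, 1, negatives=[2])
```

In a full system the same player vector would also be fed to the policy network next to the board position, so the vectors would get gradient from move prediction as well. That part is left out here.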