by s-casci
911 days ago
The policy function outputs a probability for every possible action, legal or illegal. Once you have fixed a way of indexing those actions, the policy and the game must agree on it: a given index has to refer to the same action on both sides.
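To make the shared indexing concrete, here is a minimal sketch using tic-tac-toe as a stand-in game. Everything in it is an assumption for illustration: the 3x3 board, the row-major index, the helper names (`action_to_index`, `index_to_action`, `legal_action_mask`, `mask_policy`), and the choice to zero out illegal moves and renormalize, which is one common way to handle them but not something stated above.

```python
BOARD_SIZE = 3  # assumed 3x3 board, so the policy has 9 outputs


def action_to_index(row, col):
    """Map a board cell to its slot in the policy vector (row-major)."""
    return row * BOARD_SIZE + col


def index_to_action(index):
    """Inverse mapping, used by the game to decode a policy index."""
    return divmod(index, BOARD_SIZE)


def legal_action_mask(board):
    """True for every empty cell, flattened in the same row-major order."""
    return [cell == 0 for row in board for cell in row]


def mask_policy(policy, mask):
    """Zero out illegal actions and renormalize the remaining probabilities."""
    masked = [p if ok else 0.0 for p, ok in zip(policy, mask)]
    total = sum(masked)
    if total == 0.0:
        # All probability mass was on illegal moves: fall back to uniform.
        n_legal = sum(mask)
        return [1.0 / n_legal if ok else 0.0 for ok in mask]
    return [p / total for p in masked]


# 0 = empty, 1 and -1 = the two players (hypothetical encoding)
board = [[1, 0, 0],
         [0, -1, 0],
         [0, 0, 0]]
policy = [1.0 / 9] * 9  # stand-in for a network's output
masked = mask_policy(policy, legal_action_mask(board))
```

The point is only that `legal_action_mask` and `action_to_index` flatten the board in the same order, so index `k` in the policy vector and index `k` in the game always mean the same cell.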