In a game state where your opponent will be choosing the next move, you should select the next state to expand based on a UCB formula involving your opponent's expected payout, not your own.
But this requires storing and backpropagating that information for every other player, which is something I haven't seen in any examples (nor in this article). We also can't assume the game is always zero-sum, which is the only case where this extra information would not be needed, since the opponent's payout could then be derived from your own.
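To make the idea concrete, here is a minimal sketch of what I mean, not something taken from the article. It assumes each simulation returns one reward per player, and all the names (`Node`, `total_rewards`, `backpropagate`, `select_child`) are my own hypothetical ones. Each node keeps a reward total per player, and selection uses the payout of whichever player moves at the parent.

```python
import math


class Node:
    """Tree node that tracks a separate reward total for every player (hypothetical layout)."""

    def __init__(self, player_to_move, num_players, parent=None):
        self.player_to_move = player_to_move  # index of the player choosing at this node
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_rewards = [0.0] * num_players  # one running total per player

    def add_child(self, player_to_move):
        child = Node(player_to_move, len(self.total_rewards), parent=self)
        self.children.append(child)
        return child


def backpropagate(node, rewards):
    """Add the full per-player reward vector to every node on the path to the root."""
    while node is not None:
        node.visits += 1
        for player, reward in enumerate(rewards):
            node.total_rewards[player] += reward
        node = node.parent


def ucb_score(parent, child, c=math.sqrt(2)):
    """UCB1 score of child, judged by the player who moves at parent."""
    if child.visits == 0:
        return math.inf  # always try unvisited children first
    mover = parent.player_to_move
    exploit = child.total_rewards[mover] / child.visits
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return exploit + explore


def select_child(parent, c=math.sqrt(2)):
    """Pick the child that looks best to the player moving at parent."""
    return max(parent.children, key=lambda ch: ucb_score(parent, ch, c))
```

With this layout, a node where the opponent moves picks the child with the highest opponent payout, and no zero-sum assumption is needed because the opponent's reward is never inferred by negating your own.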