by 0binbrain 3447 days ago
I believe that the key point here is that as adversarial networks compete against each other repeatedly they eventually will converge on a Nash Equilibrium.

Nash Equilibrium from Wikipedia: "Amy is making the best decision she can, taking into account Phil's decision while Phil's decision remains unchanged, and Phil is making the best decision he can, taking into account Amy's decision while Amy's decision remains unchanged."
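
To make that definition concrete, here is a minimal sketch that checks every pair of actions in a small two-player game and keeps the pairs where neither Amy nor Phil can do better by changing only their own choice. The payoff tables are hypothetical (a standard prisoner's dilemma), and the function name is made up for illustration.

```python
from itertools import product

def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all (row, col) pairs where neither player gains by deviating alone."""
    rows = range(len(payoff_a))
    cols = range(len(payoff_a[0]))
    equilibria = []
    for r, c in product(rows, cols):
        # Amy (row player) cannot do better while Phil's column c stays fixed.
        amy_best = all(payoff_a[r][c] >= payoff_a[alt][c] for alt in rows)
        # Phil (column player) cannot do better while Amy's row r stays fixed.
        phil_best = all(payoff_b[r][c] >= payoff_b[r][alt] for alt in cols)
        if amy_best and phil_best:
            equilibria.append((r, c))
    return equilibria

# Hypothetical payoffs: action 0 = cooperate, action 1 = defect.
# AMY[r][c] is Amy's payoff when Amy plays r and Phil plays c; PHIL likewise.
AMY = [[-1, -3], [0, -2]]
PHIL = [[-1, 0], [-3, -2]]

print(pure_nash_equilibria(AMY, PHIL))  # [(1, 1)]
```

Both players defecting is the only cell where each decision is a best response to the other's unchanged decision, which is exactly the condition in the quote.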

At the point this happens, a stable solution exists, which is almost like saying that we have made the most rational decision given the information we had. Nash's novel contribution was the idea that rational participants would settle on an equilibrium rather than get lost in endless recursion. Endless recursion being something like participants endlessly guessing the opponent's move... If Amy knows Phil knows Amy knows Phil knows Amy knows Phil... Obviously it would be problematic for a deep learning system to plunge into infinite recursion.

I like to think of the Nash Equilibrium as a balanced teeter-totter. When the teeter-totter becomes unbalanced, rational systems will relearn to balance the odds again based on the properties of risk (what Nash referred to as utility, I believe). Once they do so, they once again converge on an equilibrium.

2 comments

The problem with the statement "as adversarial networks compete against each other repeatedly they eventually will converge on a Nash Equilibrium" is that they often won't, and a major difficulty in training GANs is ensuring that they converge.
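
A toy illustration of non-convergence, not a GAN: take the two-player game f(x, y) = x * y, where one player lowers f by adjusting x and the other raises it by adjusting y. The equilibrium is at (0, 0), yet plain simultaneous gradient steps move away from it. The function name, learning rate and starting point below are arbitrary choices for the sketch.

```python
import math

def simultaneous_gda(x, y, lr=0.1, steps=100):
    """Simultaneous gradient descent (on x) and ascent (on y) for f(x, y) = x * y.

    Returns the distance from the equilibrium (0, 0) after every step.
    """
    distances = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x, grad_y = y, x  # df/dx = y, df/dy = x
        # Both players update at the same time, using the old values.
        x, y = x - lr * grad_x, y + lr * grad_y
        distances.append(math.hypot(x, y))
    return distances

distances = simultaneous_gda(1.0, 1.0)
print(f"start: {distances[0]:.3f}  end: {distances[-1]:.3f}")
```

Each update multiplies the squared distance from the origin by (1 + lr^2), so the two players spiral outward instead of settling, even though an equilibrium exists.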

Intuitively, what can happen is that one of the problems is simpler to learn than the other. When one of the "players" becomes overwhelmingly good, the other part of the network stops receiving useful feedback, cannot find a direction for improvement, and learning stops. A "sufficiently smart" method would be able to reach a Nash equilibrium even in this case, but current GAN methods cannot, so you need to take steps to ensure that it doesn't happen, e.g. extra normalization or less training for the "simpler to learn" component.
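
One way to see the "stops receiving useful feedback" part numerically, assuming the original minimax generator loss log(1 - D(G(z))) with a sigmoid discriminator output: the gradient of that loss with respect to the discriminator's logit on a fake sample is -sigmoid(logit), which shrinks toward zero as the discriminator becomes confident that the sample is fake. This is a sketch of that one term only, with made-up function names, not a training loop.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def saturating_generator_grad(logit):
    """d/d(logit) of log(1 - sigmoid(logit)), which simplifies to -sigmoid(logit)."""
    return -sigmoid(logit)

# More negative logit = discriminator more certain that the sample is fake.
for logit in (0.0, -2.0, -5.0, -10.0):
    d_fake = sigmoid(logit)
    grad = saturating_generator_grad(logit)
    print(f"D(fake)={d_fake:.5f}  grad={grad:.5f}")
```

With an undecided discriminator (logit 0) the generator gets a gradient of -0.5; once the discriminator is overwhelmingly good, the signal that reaches the generator is close to zero.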

It's not enough to train the "adversarial" networks to compete during each inference; on a meta-level you must ensure a form of cooperation so that they are effective teachers for each other during the learning process. For a real-life analogy, there's a difference between effective behavior during combat and effective behavior during sparring.

Yeah, makes sense. It seems like when a network stops providing useful adversarial feedback, that in and of itself is a learning experience for the smarter network. While it didn't reach an equilibrium, it now knows it's at least smarter than the other guy. I feel like it should be able to use that experience of winning to beat the next guy.

It seems like for AI a perfect equilibrium wouldn't be easy to reach; it's often hard for the human brain to reach one, after all. It'd be the fact that an equilibrium exists, though, and that it's trained to find it, that generates the knowledge along the way. Kinda like a "journey, not the destination" learning method. I'm just theorizing though.

Doesn't the caveat at the end of your citation from Wikipedia prevent endless recursion? Both players assume that the other player will not change their decision.