by kijin 4053 days ago

All the complications you mentioned can be, and usually are, modeled as additional constraints on PD.

You can add arbitrary constraints on PD, such as a 50% chance that a third party will punish you for defecting. More importantly, you can play PD many times in a row and have each round's incentive structure depend on the result of previous rounds, sort of like encrypting in CBC mode.
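To make the third-party punishment idea concrete, here is a rough sketch. The payoff numbers (5, 3, 1, 0), the size of the fine, and the function names are all my own assumptions for illustration; only the 50% chance comes from the example above.

```python
# Assumed one-shot PD payoffs: (my_move, their_move) -> my payoff.
# "C" = cooperate, "D" = defect. The numbers are illustrative only.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def expected_payoff(mine, theirs, punish_prob=0.5, fine=6):
    """Expected payoff when a third party punishes defection.

    punish_prob and fine are hypothetical parameters: with probability
    punish_prob, a defector loses `fine` on top of the base payoff.
    """
    base = PAYOFF[(mine, theirs)]
    if mine == "D":
        base -= punish_prob * fine
    return base


def best_reply(theirs, **kwargs):
    """Return the move with the higher expected payoff against `theirs`."""
    return max("CD", key=lambda m: expected_payoff(m, theirs, **kwargs))


print(best_reply("C", punish_prob=0.0))  # D (plain PD: defection dominates)
print(best_reply("C"))                   # C (expected fine of 3 flips it)
print(best_reply("D"))                   # C
```

With these made-up numbers the expected fine is 3, which is enough to make cooperation the better reply in both cases. A smaller fine or a lower chance of punishment would leave defection dominant.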

The rewards and penalties don't need to be jail time, either. You can gamble with money, your life, or anything else you value. Usually it's done with some representation of money, because money is easy to measure and more intuitive to people who've never been in a prison.

The iterated (many rounds) variant is extremely powerful, as it allows researchers to simulate all sorts of complicated constraints. For example, other players might become more likely to defect on you if you defect on them three times in a row (three strikes). Certain players (the mob boss) might be much more interested in your performance than others, and defect on you much more severely when disappointed. You might be given an opportunity to reset your records (rehab or pardon) after a certain pattern of defection and cooperation, or maybe it will be game over (death sentence) after a different pattern. Iterate a few million times, and you've got a pretty damn accurate picture of how effective each policy will be in discouraging defection (crime).
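As a toy version of the three-strikes rule, something like the following would do. It is a deterministic sketch, not how any actual study models it: the payoff table, the strategy names, and the "defect forever after three defections in a row" reading of three strikes are my assumptions.

```python
# Assumed payoffs: (move_a, move_b) -> (payoff_a, payoff_b).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def always_defect(own, other):
    return "D"


def always_cooperate(own, other):
    return "C"


def occasional_defector(own, other):
    # Defects every third round, so never three times in a row.
    return "D" if len(own) % 3 == 2 else "C"


def three_strikes(own, other):
    # Cooperates until the opponent has defected three times in a row
    # at any point in the past, then defects for the rest of the game.
    return "D" if "DDD" in "".join(other) else "C"


def play(a, b, rounds=10):
    """Play `rounds` of iterated PD and return (score_a, score_b)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy sees its own history and the opponent's history.
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(always_cooperate, three_strikes))     # (30, 30)
print(play(always_defect, three_strikes))        # (22, 7)
print(play(occasional_defector, three_strikes))  # (36, 21)
```

In this toy setup the rule punishes the constant defector (22 versus 30 for cooperating), but the player who defects every third round never trips it and comes out ahead. That is the kind of gap in a policy you would want the simulation to expose.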

Iterated PD also allows researchers to study whether a given incentive structure is stable, i.e. doesn't change much over thousands of iterations. According to the article, the incentives that give rise to police corruption are stable, but tweaking the constraints in a certain way can disrupt them.

More information on iterated PD:

https://en.wikipedia.org/wiki/Prisoner%27s_dilemma#The_itera...