
Game theory

PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sun, 23 Sep 2012 19:24:29 UTC

Contents
Articles
Fair division
Prisoner's dilemma
Shapley value

References
Article Sources and Contributors
Image Sources, Licenses and Contributors

Article Licenses
License

Fair division
Fair division, also known as the cake-cutting problem, is the problem of dividing a resource in such a way that all recipients believe they have received a fair amount. The problem is easier when recipients have different measures of value for the parts of the resource: in the "cake cutting" version, one recipient may like marzipan while another prefers cherries, and so on. Then, and only then, can the n recipients each receive more than one n-th of the value of the "cake" by their own measure. On the other hand, the presence of different measures opens a vast potential for many challenging questions and directions of further research.

There are a number of variants of the problem. The definition of 'fair' may simply mean that each recipient gets at least their fair proportion, or harder requirements like envy-freeness may also need to be satisfied. The theoretical algorithms mainly deal with goods that can be divided without losing value. The division of indivisible goods, as for instance in a divorce, is a major practical problem. Chore division is a variant where the goods are undesirable.

Fair division is often used to refer to just the simplest variant, referred to here as proportional division or simple fair division. Most of what is normally called a fair division is not considered so by the theory because of the use of arbitration. This kind of situation happens quite often with mathematical theories named after real-life problems. The decisions in the Talmud on entitlement when an estate is bankrupt reflect some quite complex ideas about fairness,[1] and most people would consider them fair. However, they are the result of legal debates by rabbis rather than divisions according to the valuations of the claimants.

Assumptions
Fair division is a mathematical theory based on an idealization of a real-life problem: dividing goods or resources fairly between people, the 'players', who have an entitlement to them. The central tenet of fair division is that such a division should be performed by the players themselves, perhaps using a mediator but certainly not an arbiter, as only the players really know how they value the goods. The theory of fair division provides explicit criteria for various types of fairness. Its aim is to provide procedures (algorithms) to achieve a fair division, or prove their impossibility, and to study the properties of such divisions both in theory and in real life.

The assumptions about the valuation of the goods or resources are:

- Each player has their own opinion of the value of each part of the goods or resources.
- The value to a player of any allocation is the sum of their valuations of each part. Often just requiring the valuations to be weakly additive is enough.
- In the basic theory, the goods can be divided into parts with arbitrarily small value.

Indivisible parts make the theory much more complex. An example would be where a car and a motorcycle have to be shared. This is also an example of where the values may not add up nicely, as either can be used as transport. The use of money can make such problems much easier.

The criteria of a fair division are stated in terms of a player's valuations, their level of entitlement, and the results of a fair division procedure. The valuations of the other players are not involved in the criteria. Differing entitlements can normally be represented by having a different number of proxy players for each player, but sometimes the criteria specify something different.
[Image: Berlin divided by the Potsdam Conference]

In the real world, of course, people sometimes have a very accurate idea of how the other players value the goods, and they may care very much about it. The case where they have complete knowledge of each other's valuations can be modeled by game theory. Partial knowledge is very hard to model. A major part of the practical side of fair division is the devising and study of procedures that work well despite such partial knowledge or small mistakes.

A fair division procedure lists actions to be performed by the players in terms of the visible data and their valuations. A valid procedure is one that guarantees a fair division for every player who acts rationally according to their valuation. Where an action depends on a player's valuation, the procedure is describing the strategy a rational player will follow. A player may act as if a piece had a different value but must be consistent. For instance, if a procedure says the first player cuts the cake into two equal parts and then the second player chooses a piece, the first player cannot claim that the second player got more.

What the players do is:

- Agree on their criteria for a fair division
- Select a valid procedure and follow its rules

It is assumed that the aim of each player is to maximize the minimum amount they might get, or in other words, to achieve the maximin.

Procedures can be divided into finite and continuous procedures. A finite procedure would, for instance, only involve one person at a time cutting or marking a cake. Continuous procedures involve things like one player moving a knife and the other saying "stop". Another type of continuous procedure involves a person assigning a value to every part of the cake.

Criteria for a fair division


There are a number of widely used criteria for a fair division. Some of these conflict with each other, but often they can be combined. The criteria described here are only for when each player is entitled to the same amount.

- A proportional or simple fair division guarantees each player gets their fair share. For instance, if three people divide up a cake, each gets at least a third by their own valuation.
- An envy-free division guarantees no one will want somebody else's share more than their own.
- An exact division is one where every player thinks everyone received exactly their fair share, no more and no less.
- An efficient or Pareto optimal division ensures that no other allocation would make someone better off without making someone else worse off. The term efficiency comes from the economic idea of the efficient market. A division where one player gets everything is optimal by this definition, so on its own this does not guarantee even a fair share.
- An equitable division is one where the proportion of the cake a player receives by their own valuation is the same for every player. This is a difficult aim, as players need not be truthful if asked their valuation.
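As a rough illustration of these criteria, the sketch below (function names, the valuation matrix, and the "piece i goes to player i" allocation are all our own illustrative assumptions, not part of any standard library) checks a proposed allocation against each player's valuations:

```python
# Entry valuations[i][j] is player i's value for piece j; each row sums to 1,
# i.e. every player values the whole cake at 1. The allocation assumed here
# is simply that piece i goes to player i.

def is_proportional(valuations):
    """Each player values their own piece at >= 1/n by their own measure."""
    n = len(valuations)
    return all(valuations[i][i] >= 1 / n for i in range(n))

def is_envy_free(valuations):
    """No player values another player's piece above their own."""
    n = len(valuations)
    return all(valuations[i][i] >= valuations[i][j]
               for i in range(n) for j in range(n))

def is_equitable(valuations):
    """Every player receives the same proportion by their own valuation."""
    n = len(valuations)
    shares = [valuations[i][i] for i in range(n)]
    return max(shares) - min(shares) < 1e-9

# Three players, three pieces (hypothetical numbers).
v = [[0.40, 0.30, 0.30],
     [0.25, 0.40, 0.35],
     [0.30, 0.35, 0.35]]
print(is_proportional(v))  # True: every diagonal entry is >= 1/3
print(is_envy_free(v))     # True: each diagonal entry is a row maximum
print(is_equitable(v))     # False: own shares are 0.40, 0.40, 0.35
```

Note that the checks are independent, matching the text: an allocation can be proportional and envy-free without being equitable.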


Two players
For two people there is a simple solution that is commonly employed: the so-called divide and choose method. One person divides the resource into what they believe are equal halves, and the other person chooses the "half" they prefer. Thus, the person making the division has an incentive to divide as fairly as possible: if they do not, they will likely receive an undesirable portion. This solution gives a proportional and envy-free division. The article on divide and choose describes why the procedure is not equitable. More complex procedures like the adjusted winner procedure are designed to cope with indivisible goods and to be more equitable in a practical context.

Austin's moving-knife procedure[2] gives an exact division for two players. The first player places two knives over the cake such that one knife is at the left side of the cake and one is further right, with half of the cake, by his valuation, lying between the knives. He then moves the knives right, always keeping half the cake by his valuation between them. If he reaches the right side of the cake, the leftmost knife must be where the rightmost knife started off. The second player stops the knives when he thinks there is half the cake between them; by the intermediate value theorem, there will always be a point at which this happens.

The surplus procedure (SP) achieves a form of equitability called proportional equitability. This procedure is strategy-proof and can be generalized to more than two people.[3]
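Divide and choose can be sketched with a discretized cake; this is an illustrative toy model (the function name, the row-of-pieces discretization, and the example numbers are our assumptions, not a standard treatment):

```python
# The cake is modeled as a row of small pieces; each list gives one player's
# value for each piece. The divider cuts where the left part first reaches
# half of their own total value; the chooser then takes the side they prefer.

def divide_and_choose(divider_vals, chooser_vals):
    total = sum(divider_vals)
    running, cut = 0, 0
    for i, value in enumerate(divider_vals):
        running += value
        if running >= total / 2:
            cut = i + 1   # cut after piece i
            break
    left = sum(chooser_vals[:cut])
    right = sum(chooser_vals[cut:])
    chooser_piece = "left" if left >= right else "right"
    return cut, chooser_piece

# The divider values the cake uniformly; the chooser loves the right-hand
# end (say, where the cherries are).
cut, pick = divide_and_choose([1, 1, 1, 1], [0, 1, 1, 4])
print(cut, pick)  # 2 right: the divider cuts in the middle, the chooser
                  # takes the right side, worth 5 of their total 6
```

Both players end up with at least half the cake by their own valuation, which is exactly the proportionality guarantee described above.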

Many players
Fair division with three or more players is considerably more complex than the two-player case. Proportional division is the easiest, and the article describes some procedures which can be applied with any number of players. Finding the minimum number of cuts needed is an interesting mathematical problem.

Envy-free division was first solved for the 3-player case in 1960, independently by John Selfridge of Northern Illinois University and John Horton Conway at Cambridge University. The best algorithm uses at most 5 cuts. The Brams-Taylor procedure was the first cake-cutting procedure for four or more players that produced an envy-free division of cake for any number of persons; it was published by Steven Brams and Alan Taylor in 1995.[4] The number of cuts that might be required by this procedure is unbounded. A bounded moving-knife procedure for 4 players was found in 1997.

There are no discrete algorithms for an exact division even for two players; a moving-knife procedure is the best that can be done. There are no exact division algorithms for 3 or more players, but there are 'near exact' algorithms which are also envy-free and can achieve any desired degree of accuracy.

A generalization of the surplus procedure called the equitable procedure (EP) achieves a form of equitability. Equitability and envy-freeness can be incompatible for 3 or more players.[3]

Variants
Some cake-cutting procedures are discrete, whereby players make cuts with a knife (usually in a sequence of steps). Moving-knife procedures, on the other hand, allow continuous movement and can let players call "stop" at any point.

A variant of the fair division problem is chore division: this is the "dual" to the cake-cutting problem, in which an undesirable object is to be distributed amongst the players. The canonical example is a set of chores that the players between them must do. Note that "I cut, you choose" works for chore division.

A basic theorem for many-person problems is the Rental Harmony Theorem of Francis Su.[5] An interesting application of the Rental Harmony Theorem can be found in international trade theory.[6]

Sperner's Lemma can be used to get as close an approximation as desired to an envy-free solution for many players. The algorithm gives a fast and practical way of solving some fair division problems.[7][8][9]

The division of property, as happens for example in divorce or inheritance, normally contains indivisible items which must be fairly distributed between players, possibly with cash adjustments (such indivisible pieces are referred to as atoms). A common requirement for the division of land is that the pieces be connected, i.e. only whole pieces and not fragments are allowed. For example, the division of Berlin after World War II resulted in four connected parts.[10] A consensus halving is where a number of people agree that a resource has been evenly split in two; this is described in exact division.

History
According to Sol Garfunkel, the cake-cutting problem had been one of the most important open problems in 20th century mathematics[11] when the most important variant of the problem was finally solved with the Brams-Taylor procedure by Steven Brams and Alan Taylor in 1995.

Divide and choose's origins are undocumented. The related activities of bargaining and barter are also ancient. Negotiations involving more than two people are also quite common; the Potsdam Conference is a notable example.

The theory of fair division dates back only to the end of the Second World War. It was devised by a group of Polish mathematicians, Hugo Steinhaus, Bronisław Knaster and Stefan Banach, who used to meet in the Scottish Café in Lwów (then in Poland). A proportional (fair) division for any number of players, called 'last-diminisher', was devised in 1944. This was attributed to Banach and Knaster by Steinhaus when he made the problem public for the first time at a meeting of the Econometric Society in Washington, D.C. on 17 September 1947. At that meeting he also proposed the problem of finding the smallest number of cuts necessary for such divisions.

Envy-free division was first solved for the 3-player case in 1960, independently by John Selfridge of Northern Illinois University and John Horton Conway at Cambridge University; the algorithm was first published in the 'Mathematical Games' column by Martin Gardner in Scientific American. Envy-free division for 4 or more players was a difficult open problem of the twentieth century. The first cake-cutting procedure that produced an envy-free division of cake for any number of persons was published by Steven Brams and Alan Taylor in 1995. A major advance on equitable division was made in 2006 by Steven J. Brams, Michael A. Jones, and Christian Klamler.[3]

In popular culture
In the Numb3rs season 3 episode "One Hour", Charlie talks about the cake-cutting problem as applied to the amount of money a kidnapper was demanding. Hugo Steinhaus wrote about a number of variants of fair division in his book Mathematical Snapshots. In the book he says a special three-person version of fair division was devised by G. Krochmalny in Berdechów in 1944 and another by Mrs L. Kott.[12] Martin Gardner and Ian Stewart have both published books with sections about the problem.[13][14] Martin Gardner introduced the chore division form of the problem. Ian Stewart has popularized the fair division problem with his articles in Scientific American and New Scientist. A Dinosaur Comics strip is based on the cake-cutting problem.[15]


References
[1] Robert J. Aumann and Michael Maschler. "Game Theoretic Analysis of a Bankruptcy Problem from the Talmud" (http://www.elsevier.com/framework_aboutus/Nobel/Nobel2005/nobel2005pdfs/aum16.pdf). Journal of Economic Theory 36, 195-213 (1985).
[2] A. K. Austin. "Sharing a Cake". Mathematical Gazette 66, 1982.
[3] Brams, Steven J.; Michael A. Jones; Christian Klamler (December 2006). "Better Ways to Cut a Cake" (http://www.ams.org/notices/200611/fea-brams.pdf). Notices of the American Mathematical Society 53 (11): pp. 1314-1321. Retrieved 2008-01-16.
[4] Brams, Steven J.; Alan D. Taylor (January 1995). "An Envy-Free Cake Division Protocol". The American Mathematical Monthly (Mathematical Association of America) 102 (1): 9-18. doi:10.2307/2974850. JSTOR 2974850.
[5] Francis Edward Su (1999). "Rental Harmony: Sperner's Lemma in Fair Division" (http://www.math.hmc.edu/~su/papers.dir/rent.pdf). Amer. Math. Monthly 106 (10): 930-942. doi:10.2307/2589747.
[6] Shiozawa, Y. (2007). "A New Construction of a Ricardian Trade Theory". Evolutionary and Institutional Economics Review 3 (2): 141-187.
[7] Francis Edward Su, cited above (based on work by Forest Simmons, 1980).
[8] "The Fair Division Calculator" (http://www.math.hmc.edu/~su/fairdivision/calc/).
[9] Ivars Peterson (March 13, 2000). "A Fair Deal for Housemates" (http://www.maa.org/mathland/mathtrek_3_13_00.html). MathTrek.
[10] Steven J. Brams; Alan D. Taylor (1996). Fair Division: From Cake-Cutting to Dispute Resolution. Cambridge University Press. p. 38. ISBN 978-0-521-55644-6.
[11] Sol Garfunkel. More Equal than Others: Weighted Voting. For All Practical Purposes. COMAP. 1988.
[12] H. Steinhaus. Mathematical Snapshots. 1950, 1969. ISBN 0-19-503267-5.
[13] Martin Gardner. aha! Insight. 1978. ISBN 978-0-7167-1017-2.
[14] Ian Stewart. How to Cut a Cake and Other Mathematical Conundrums. 2006. ISBN 978-0-19-920590-5.
[15] http://www.qwantz.com/archive/001345.html

Further reading
- Steven J. Brams and Alan D. Taylor (1996). Fair Division: From Cake-Cutting to Dispute Resolution. Cambridge University Press. ISBN 0-521-55390-3.
- T. P. Hill (2000). "Mathematical devices for getting a fair share". American Scientist, Vol. 88, 325-331.
- Jack Robertson and William Webb (1998). Cake-Cutting Algorithms: Be Fair If You Can. AK Peters Ltd. ISBN 1-56881-076-8.

External links
- Short essay about the cake-cutting problem (http://3quarksdaily.blogs.com/3quarksdaily/2005/04/3qd_monday_musi.html) by S. Abbas Raza of 3 Quarks Daily.
- Fair Division (http://www.colorado.edu/education/DMP/fair_division.html) from the Discrete Mathematics Project at the University of Colorado at Boulder.
- The Fair Division Calculator (http://www.math.hmc.edu/~su/fairdivision/calc/) (Java applet) at Harvey Mudd College.
- Fair Division: Method of Lone Divider (http://www.cut-the-knot.org/Curriculum/SocialScience/LoneDivider.shtml)
- Fair Division: Method of Markers (http://www.cut-the-knot.org/Curriculum/SocialScience/Markers.shtml)
- Fair Division: Method of Sealed Bids (http://www.cut-the-knot.org/Curriculum/SocialScience/SealedBids.shtml)
- Vincent P. Crawford (1987). "Fair division". The New Palgrave: A Dictionary of Economics, v. 2, pp. 274-75.
- Hal Varian (1987). "Fairness". The New Palgrave: A Dictionary of Economics, v. 2, pp. 275-76.
- Bryan Skyrms (1996). The Evolution of the Social Contract. Cambridge University Press. ISBN 978-0-521-55583-8.

Prisoner's dilemma
The prisoner's dilemma is a canonical example of a game analyzed in game theory that shows why two individuals might not cooperate, even if it appears that it is in their best interests to do so. It was originally framed by Merrill Flood and Melvin Dresher working at RAND in 1950. Albert W. Tucker formalized the game with prison sentence payoffs and gave it the name "prisoner's dilemma" (Poundstone, 1992). A classic example of the game is presented as follows:

Two men are arrested, but the police do not have enough information for a conviction. The police separate the two men and offer both the same deal: if one testifies against his partner (defects/betrays) and the other remains silent (cooperates with/assists his partner), the betrayer goes free and the one who remains silent gets a one-year sentence. If both remain silent, both are sentenced to only one month in jail on a minor charge. If each 'rats out' the other, each receives a three-month sentence. Each prisoner must choose either to betray or remain silent; the decision of each is kept secret from his partner. What should they do?

If it is assumed that each player is only concerned with lessening his own time in jail, the game becomes a non-zero-sum game where the two players may either assist or betray the other. The sole concern of each prisoner seems to be increasing his own reward. The interesting symmetry of this problem is that the optimal decision for each is to betray the other, even though they would be better off if they both cooperated.

In the classic version of the game, collaboration is dominated by betrayal (i.e. betrayal always produces a better outcome), and so the only possible outcome is for both prisoners to betray the other. Regardless of what the other prisoner chooses, one will always gain a greater payoff by betraying the other. Because betrayal is always more beneficial than cooperation, all purely rational prisoners would seemingly betray the other.
However, in reality humans display a systematic bias towards cooperative behavior in this and similar games, much more so than predicted by a theory based only on rational self-interested action.[1][2][3][4][5]

There is also an extended "iterated" version of the game, where the classic game is played over and over and, consequently, both prisoners continuously have an opportunity to penalize the other for previous decisions. If the number of times the game will be played is known, the finite aspect of the game means that (by backward induction) the two prisoners will betray each other repeatedly.

In casual usage, the label "prisoner's dilemma" may be applied to situations not strictly matching the formal criteria of the classic or iterated games: for instance, those in which two entities could gain important benefits from cooperating or suffer from the failure to do so, but find it merely difficult or expensive, not necessarily impossible, to coordinate their activities to achieve cooperation.

Strategy for the classic prisoners' dilemma


The normal game is shown below:


                                       Prisoner B stays silent (cooperates)         Prisoner B betrays (defects)
Prisoner A stays silent (cooperates)   Each serves 1 month                          Prisoner A: 1 year; Prisoner B: goes free
Prisoner A betrays (defects)           Prisoner A: goes free; Prisoner B: 1 year    Each serves 3 months

Here, regardless of what the other decides, each prisoner gets a higher pay-off by betraying the other. For example, Prisoner A can see that, according to the payoffs above, no matter what prisoner B chooses, prisoner A is better off 'ratting him out' (defecting) than staying silent (cooperating). As a result, prisoner A should logically betray him. The game is symmetric, so Prisoner B should act the same way. Since both rationally decide to defect, each receives a lower reward than if both were to stay quiet. Traditional game theory thus results in both players being worse off than if each had chosen to lessen the sentence of his accomplice at the cost of spending more time in jail himself.
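The dominance argument can be made concrete with a small sketch (the dictionary encoding is our own; sentences are in months of jail, so lower is better):

```python
# payoff[(a, b)] = (A's sentence, B's sentence), from the table above.
payoff = {
    ("silent", "silent"): (1, 1),    # minor charge: one month each
    ("silent", "betray"): (12, 0),   # A serves a year, B goes free
    ("betray", "silent"): (0, 12),
    ("betray", "betray"): (3, 3),    # three months each
}

# Whatever B does, A's sentence is shorter if A betrays:
for b_choice in ("silent", "betray"):
    a_if_silent = payoff[("silent", b_choice)][0]
    a_if_betray = payoff[("betray", b_choice)][0]
    print(b_choice, a_if_betray < a_if_silent)  # True in both cases
```

Betrayal therefore strictly dominates silence for A, and by symmetry for B, even though mutual silence (1 month each) beats mutual betrayal (3 months each).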

Generalized form
The structure of the traditional Prisoner's Dilemma can be analyzed by removing its original prisoner setting. Suppose that the two players are represented by colors, red and blue, and that each player chooses either to "Cooperate" or "Defect". If both players play "Cooperate", they both get the payoff A. If Blue plays "Defect" while Red plays "Cooperate", then Blue gets B while Red gets C. Symmetrically, if Blue plays "Cooperate" while Red plays "Defect", then Blue gets payoff C while Red gets payoff B. If both players play "Defect", they both get the payoff D. In terms of general point values:

Canonical PD payoff matrix


            Cooperate   Defect
Cooperate   A, A        C, B
Defect      B, C        D, D

To be a prisoner's dilemma, the following must be true: B > A > D > C. The fact that A > D implies that the "both cooperate" outcome is better than the "both defect" outcome, while B > A and D > C imply that "Defect" is the dominant strategy for both agents. It is not necessary for a Prisoner's Dilemma to be strictly symmetric as in the above example; it is merely required that the choices which are individually optimal (and strongly dominant) result in an equilibrium which is socially inferior.
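These inequalities are easy to check numerically; the point values below are hypothetical, chosen only to satisfy the conditions:

```python
# Hypothetical payoffs satisfying B > A > D > C. The extra condition
# 2A > B + C is the one the iterated game (next section) also requires.
B, A, D, C = 5, 3, 1, 0

is_pd = B > A > D > C
iterated_ok = 2 * A > B + C
print(is_pd, iterated_ok)  # True True

# Defect strictly dominates: B > A against a cooperator and D > C against
# a defector, yet mutual cooperation (A each) beats mutual defection (D each).
print(B > A and D > C, A > D)  # True True
```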


The iterated prisoners' dilemma


If two players play the prisoners' dilemma more than once in succession, remember previous actions of their opponent and change their strategy accordingly, the game is called the iterated prisoners' dilemma. In addition to the general form above, the iterated version also requires that 2A > B + C, to prevent alternating cooperation and defection giving a greater reward than mutual cooperation.

The iterated prisoners' dilemma game is fundamental to certain theories of human cooperation and trust. On the assumption that the game can model transactions between two people requiring trust, cooperative behaviour in populations may be modeled by a multi-player, iterated version of the game. It has, consequently, fascinated many scholars over the years. In 1975, Grofman and Pool estimated the count of scholarly articles devoted to it at over 2,000. The iterated prisoners' dilemma has also been referred to as the "Peace-War game".[6]

If the game is played exactly N times and both players know this, then it is always game-theoretically optimal to defect in all rounds. The only possible Nash equilibrium is to always defect. The proof is inductive: one might as well defect on the last turn, since the opponent will not have a chance to punish the player. Therefore, both will defect on the last turn. Thus, the player might as well defect on the second-to-last turn, since the opponent will defect on the last no matter what is done, and so on. The same applies if the game length is unknown but has a known upper limit.

Unlike the standard prisoners' dilemma, in the iterated prisoners' dilemma the defection strategy is counter-intuitive and fails badly to predict the behavior of human players. Within standard economic theory, though, it is the only correct answer.
The superrational strategy in the iterated prisoners' dilemma with fixed N is to cooperate against a superrational opponent, and in the limit of large N, experimental results on strategies agree with the superrational version, not the game-theoretically rational one. For cooperation to emerge between game-theoretically rational players, the total number of rounds N must be random, or at least unknown to the players. In this case "always defect" may no longer be a strictly dominant strategy, only a Nash equilibrium. Among the results shown by Robert Aumann in a 1959 paper is that rational players repeatedly interacting for indefinitely long games can sustain the cooperative outcome.

Strategy for the iterated prisoners' dilemma


Interest in the iterated prisoners' dilemma (IPD) was kindled by Robert Axelrod in his book The Evolution of Cooperation (1984). In it he reports on a tournament he organized of the N-step prisoners' dilemma (with N fixed), in which participants have to choose their mutual strategy again and again, and have memory of their previous encounters. Axelrod invited academic colleagues all over the world to devise computer strategies to compete in an IPD tournament. The programs that were entered varied widely in algorithmic complexity, initial hostility, capacity for forgiveness, and so forth.

Axelrod discovered that when these encounters were repeated over a long period of time with many players, each with different strategies, greedy strategies tended to do very poorly in the long run while more altruistic strategies did better, as judged purely by self-interest. He used this to show a possible mechanism for the evolution of altruistic behaviour from mechanisms that are initially purely selfish, by natural selection.

The best deterministic strategy was found to be tit for tat, which Anatol Rapoport developed and entered into the tournament. It was the simplest of any program entered, containing only four lines of BASIC, and won the contest. The strategy is simply to cooperate on the first iteration of the game; after that, the player does what his or her opponent did on the previous move. Depending on the situation, a slightly better strategy can be "tit for tat with forgiveness": when the opponent defects, on the next move the player sometimes cooperates anyway, with a small probability (around 1-5%). This allows for occasional recovery from getting trapped in a cycle of defections. The exact probability depends on the line-up of opponents.

By analysing the top-scoring strategies, Axelrod stated several conditions necessary for a strategy to be successful.
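A minimal match in the spirit of Axelrod's tournament can be sketched as follows (the strategy implementations are ours, and the payoff values 5, 3, 1, 0 are illustrative choices consistent with B > A > D > C):

```python
# PAYOFF[(my_move, their_move)] = (my_points, their_points),
# with "C" = cooperate and "D" = defect.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def always_defect(own_history, opp_history):
    return "D"

def play(strategy1, strategy2, rounds=10):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1 = strategy1(h1, h2)
        m2 = strategy2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

print(play(tit_for_tat, tit_for_tat))    # (30, 30): mutual cooperation
print(play(tit_for_tat, always_defect))  # (9, 14): one exploited round,
                                         # then mutual defection
```

Tit for tat loses slightly to a pure defector in a single match, yet, as Axelrod found, it outscores defectors across a population because pairs of cooperators accumulate far more points together.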

Nice
The most important condition is that the strategy must be "nice", that is, it will not defect before its opponent does (this is sometimes referred to as an "optimistic" algorithm). Almost all of the top-scoring strategies were nice; therefore, a purely selfish strategy will not "cheat" on its opponent, for purely self-interested reasons first.

Retaliating
However, Axelrod contended, the successful strategy must not be a blind optimist. It must sometimes retaliate. An example of a non-retaliating strategy is Always Cooperate. This is a very bad choice, as "nasty" strategies will ruthlessly exploit such players.

Forgiving
Successful strategies must also be forgiving. Though players will retaliate, they will once again fall back to cooperating if the opponent does not continue to defect. This stops long runs of revenge and counter-revenge, maximizing points.

Non-envious
The last quality is being non-envious, that is, not striving to score more than the opponent (note that a "nice" strategy can never score more than the opponent).

The optimal (points-maximizing) strategy for the one-time PD game is simply defection; as explained above, this is true whatever the composition of opponents may be. However, in the iterated-PD game the optimal strategy depends upon the strategies of likely opponents, and how they will react to defections and cooperations. For example, consider a population where everyone defects every time, except for a single individual following the tit for tat strategy. That individual is at a slight disadvantage because of the loss on the first turn. In such a population, the optimal strategy for that individual is to defect every time. In a population with a certain percentage of always-defectors and the rest being tit for tat players, the optimal strategy for an individual depends on the percentage and on the length of the game.
A strategy called Pavlov (an example of win-stay, lose-switch) cooperates at the first iteration and whenever the player and co-player did the same thing at the previous iteration; Pavlov defects when the player and co-player did different things at the previous iteration. For a certain range of parameters, Pavlov beats all other strategies by giving preferential treatment to co-players which resemble Pavlov.

Deriving the optimal strategy is generally done in two ways:

1. Bayesian Nash equilibrium: if the statistical distribution of opposing strategies can be determined (e.g. 50% tit for tat, 50% always cooperate), an optimal counter-strategy can be derived analytically.[7]
2. Monte Carlo simulations of populations have been made, where individuals with low scores die off and those with high scores reproduce (a genetic algorithm for finding an optimal strategy). The mix of algorithms in the final population generally depends on the mix in the initial population. The introduction of mutation (random variation during reproduction) lessens the dependency on the initial population; empirical experiments with such systems tend to produce tit for tat players (see for instance Chess 1988), but there is no analytic proof that this will always occur.

Although tit for tat is considered to be the most robust basic strategy, a team from Southampton University in England (led by Professor Nicholas Jennings [8] and consisting of Rajdeep Dash, Sarvapali Ramchurn, Alex Rogers and Perukrishnen Vytelingum) introduced a new strategy at the 20th-anniversary iterated prisoners' dilemma competition, which proved to be more successful than tit for tat. This strategy relied on cooperation between programs to achieve the highest number of points for a single program. The university submitted 60 programs to the competition, which were designed to recognize each other through a series of five to ten moves at the start.
Once this recognition was made, one program would always cooperate and the other would always defect, assuring the maximum number of points for the defector. If the program realized that it was playing a non-Southampton player, it would continuously defect in an attempt to minimize the score of the competing program. As a result,[9] this strategy ended up taking the top three positions in the competition, as well as a number of positions towards the bottom.

This strategy takes advantage of the fact that multiple entries were allowed in this particular competition and that the performance of a team was measured by that of the highest-scoring player (meaning that the use of self-sacrificing players was a form of minmaxing). In a competition where one has control of only a single player, tit for tat is certainly a better strategy. Because of this new rule, this competition also has little theoretical significance when analysing single-agent strategies as compared to Axelrod's seminal tournament. However, it provided the framework for analysing how to achieve cooperative strategies in multi-agent frameworks, especially in the presence of noise.

In fact, long before this new-rules tournament was played, Richard Dawkins in his book The Selfish Gene pointed out the possibility of such strategies winning if multiple entries were allowed, but he remarked that most probably Axelrod would not have allowed them if they had been submitted. The strategy also relies on circumventing the rule of the prisoners' dilemma that no communication is allowed between the two players. When the Southampton programs engage in an opening "ten move dance" to recognize one another, this only reinforces just how valuable communication can be in shifting the balance of the game.

10
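The Pavlov rule described above can be sketched in a few lines. The payoff values used here (T = 5, R = 3, P = 1, S = 0) are the conventional textbook numbers, assumed for illustration since the section does not fix them:

```python
# Iterated prisoner's dilemma sketch: Pavlov (win-stay, lose-shift)
# against tit for tat. Payoffs T=5, R=3, P=1, S=0 are assumed values.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def pavlov(my_last, opp_last):
    # Cooperate on the first move and whenever both players did the
    # same thing last round; defect when the previous moves differed.
    if my_last is None:
        return 'C'
    return 'C' if my_last == opp_last else 'D'

def tit_for_tat(my_last, opp_last):
    # Cooperate first, then copy the opponent's previous move.
    return 'C' if opp_last is None else opp_last

def play(strat_a, strat_b, rounds=200):
    a_last = b_last = None
    score_a = score_b = 0
    for _ in range(rounds):
        a = strat_a(a_last, b_last)
        b = strat_b(b_last, a_last)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        a_last, b_last = a, b
    return score_a, score_b

print(play(pavlov, tit_for_tat))   # (600, 600): two nice strategies lock into mutual cooperation
```

Because both strategies open with cooperation and never defect first, they earn the mutual-cooperation payoff in every round.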

Continuous iterated prisoners' dilemma


Most work on the iterated prisoners' dilemma has focused on the discrete case, in which players either cooperate or defect, because this model is relatively simple to analyze. However, some researchers have looked at models of the continuous iterated prisoners' dilemma, in which players are able to make a variable contribution to the other player. Le and Boyd[10] found that in such situations, cooperation is much harder to evolve than in the discrete iterated prisoners' dilemma. The basic intuition for this result is straightforward: in a continuous prisoners' dilemma, if a population starts off in a non-cooperative equilibrium, players who are only marginally more cooperative than non-cooperators get little benefit from assorting with one another. By contrast, in a discrete prisoners' dilemma, tit for tat cooperators get a big payoff boost from assorting with one another in a non-cooperative equilibrium, relative to non-cooperators.

Since nature arguably offers more opportunities for variable cooperation rather than a strict dichotomy of cooperation or defection, the continuous prisoners' dilemma may help explain why real-life examples of tit for tat-like cooperation are extremely rare in nature (e.g. Hammerstein[11]) even though tit for tat seems robust in theoretical models.
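This intuition can be made concrete with a minimal sketch. The linear payoff form and the numbers B and C below are illustrative assumptions (with B > C, so mutual contribution pays), not taken from Le and Boyd's model:

```python
# Continuous prisoners' dilemma sketch: each player chooses a contribution
# level x in [0, 1]. The linear payoff form and the constants B, C are
# illustrative assumptions (B > C so that mutual contribution pays).
B, C = 3.0, 1.0   # benefit from the partner's contribution, cost of one's own

def payoff(x_self, x_other):
    return B * x_other - C * x_self

# Marginal cooperators assorting with one another gain almost nothing
# over full defectors, which is why cooperation is hard to get started:
print(payoff(0.00, 0.00))   # full defection baseline: 0.0
print(payoff(0.01, 0.01))   # ~0.02: only a tiny edge from assorting
print(payoff(1.00, 1.00))   # 2.0: full (discrete-style) cooperators gain far more
```

The jump from near-zero to a large payoff only at high contribution levels mirrors the discrete case's "big payoff boost" for tit for tat cooperators.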

Real-life examples
These particular examples, involving prisoners and bag switching and so forth, may seem contrived, but there are in fact many examples in human interaction as well as interactions in nature that have the same payoff matrix. The prisoner's dilemma is therefore of interest to the social sciences such as economics, politics and sociology, as well as to the biological sciences such as ethology and evolutionary biology. Many natural processes have been abstracted into models in which living beings are engaged in endless games of prisoner's dilemma. This wide applicability of the PD gives the game its substantial importance.


In environmental studies
In environmental studies, the PD is evident in crises such as global climate change. It is argued that all countries will benefit from a stable climate, but any single country is often hesitant to curb CO2 emissions. The immediate benefit to an individual country of maintaining current behavior is perceived to be greater than the purported eventual benefit to all countries if behavior were changed, which explains the current impasse concerning climate change.[12]

In psychology
In addiction research and behavioral economics, George Ainslie points out[13] that addiction can be cast as an intertemporal PD problem between the present and future selves of the addict. In this case, defecting means relapsing, and it is easy to see that not defecting both today and in the future is by far the best outcome, and that defecting both today and in the future is the worst outcome. The case where one abstains today but relapses in the future is clearly a bad outcome: in some sense the discipline and self-sacrifice involved in abstaining today have been "wasted" because the future relapse means that the addict is right back where he started and will have to start over (which is quite demoralizing and makes starting over more difficult). The final case, where one engages in the addictive behavior today while abstaining "tomorrow", will be familiar to anyone who has struggled with an addiction. The problem here is that (as in other PDs) there is an obvious benefit to defecting "today", but tomorrow one will face the same PD, and the same obvious benefit will be present then, ultimately leading to an endless string of defections.

John Gottman, in his research described in The Science of Trust, defines good relationships as those where partners know not to enter the (D,D) cell, or at least not to get dynamically stuck there in a loop.

In economics
Advertising is sometimes cited as a real-life example of the prisoner's dilemma. When cigarette advertising was legal in the United States, competing cigarette manufacturers had to decide how much money to spend on advertising. The effectiveness of Firm A's advertising was partially determined by the advertising conducted by Firm B. Likewise, the profit derived from advertising for Firm B is affected by the advertising conducted by Firm A. If both Firm A and Firm B chose to advertise during a given period, the advertising cancels out, receipts remain constant, and expenses increase due to the cost of advertising. Both firms would benefit from a reduction in advertising. However, should Firm B choose not to advertise, Firm A could benefit greatly by advertising. Nevertheless, the optimal amount of advertising by one firm depends on how much advertising the other undertakes. As the best strategy is dependent on what the other firm chooses, there is no dominant strategy, which makes this slightly different from a prisoner's dilemma. The outcome is similar, though, in that both firms would be better off were they to advertise less than in the equilibrium.

Sometimes cooperative behaviors do emerge in business situations. For instance, cigarette manufacturers endorsed the creation of laws banning cigarette advertising, understanding that this would reduce costs and increase profits across the industry.[14] This analysis is likely to be pertinent in many other business situations involving advertising.

Another example of the prisoner's dilemma in economics is competition-oriented objectives.[15] When firms are aware of the activities of their competitors, they tend to pursue policies that are designed to oust their competitors rather than to maximize the performance of the firm. This approach impedes the firm from functioning at its maximum capacity because it limits the scope of the strategies it employs.
Without enforceable agreements, members of a cartel are also involved in a (multi-player) prisoners' dilemma.[16] 'Cooperating' typically means keeping prices at a pre-agreed minimum level. 'Defecting' means selling under this minimum level, instantly taking business (and profits) from other cartel members. Anti-trust authorities want potential cartel members to mutually defect, ensuring the lowest possible prices for consumers.
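The advertising interaction above can be sketched numerically. The market-sharing payoff function and all numbers below are illustrative assumptions, not from the source; they merely reproduce the qualitative points that the best spend depends on the rival's spend (no dominant strategy) and that both firms gain by jointly advertising less than in equilibrium:

```python
# Advertising game sketch (hypothetical payoffs): two firms split a market
# of size m in proportion to ad spend, so firm A's profit is m*a/(a+b) - a.
m = 100.0

def profit_A(a, b):
    return m * a / (a + b) - a

def best_response(b, grid=None):
    # A's profit-maximizing spend against B's fixed spend b (grid search).
    grid = grid or [x / 10 for x in range(1, 1000)]
    return max(grid, key=lambda a: profit_A(a, b))

# The best spend shifts with the rival's spend -- no dominant strategy:
print(best_response(10.0))   # ~21.6 (analytically sqrt(m*b) - b)
print(best_response(25.0))   # 25.0: the symmetric equilibrium is a = b = m/4

# Both firms profit more if both cut spending below the equilibrium level:
print(profit_A(25.0, 25.0))  # 25.0 at the equilibrium
print(profit_A(12.5, 12.5))  # 37.5 if both advertise half as much
```

As in the text, the equilibrium is not the joint optimum: mutual restraint (here, both spending m/8 instead of m/4) raises both profits, which is what cooperative bans on advertising exploit.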


Multiplayer dilemmas
Many real-life dilemmas involve multiple players. Although metaphorical, Hardin's tragedy of the commons may be viewed as an example of a multi-player generalization of the PD: each villager makes a choice for personal gain or restraint. The collective reward for unanimous (or even frequent) defection is very low payoffs (representing the destruction of the "commons").

The commons are not always exploited: William Poundstone, in a book about the prisoner's dilemma (see References below), describes a situation in New Zealand where newspaper boxes are left unlocked. It is possible for people to take a paper without paying (defecting), but very few do, feeling that if they do not pay then neither will others, destroying the system.

Subsequent research by Elinor Ostrom, winner of the 2009 Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel, hypothesized that the tragedy of the commons is oversimplified, with the negative outcome driven by outside pressures. Absent such complicating pressures, groups communicate and manage the commons among themselves for their mutual benefit, enforcing social norms to preserve the resource and achieve the maximum good for the group, an example of effecting the best-case outcome for the PD.[17][18]

The Cold War


The Cold War can be modelled as a prisoner's dilemma situation. During the Cold War the opposing alliances of NATO and the Warsaw Pact both had the choice to arm or disarm. From each side's point of view:

Disarming whilst your opponent continued to arm would have led to military inferiority and possible annihilation. If both sides chose to arm, neither could afford to attack the other, but at the high cost of maintaining and developing a nuclear arsenal. If both sides chose to disarm, war would be avoided and there would be no costs. If your opponent disarmed while you continued to arm, then you would achieve superiority.

Although the 'best' overall outcome is for both sides to disarm, the rational course for both sides is to arm. This is indeed what happened, and both sides poured enormous resources into military research and armament for the next thirty years, until the dissolution of the Soviet Union broke the deadlock.

Related games
Closed-bag exchange
Hofstadter[19] once suggested that people often find problems such as the PD problem easier to understand when it is illustrated in the form of a simple game, or trade-off. One of several examples he used was "closed bag exchange": Two people meet and exchange closed bags, with the understanding that one of them contains money, and the other contains a purchase. Either player can choose to honor the deal by putting into his or her bag what he or she agreed, or he or she can defect by handing over an empty bag. In this game, defection is always the best course, implying that rational agents will never play. However, in this case both players cooperating and both players defecting actually give the same result, assuming there are no gains from trade, so chances of mutual cooperation, even in repeated games, are few.

Friend or Foe?
Friend or Foe? is a game show that aired from 2002 to 2005 on the Game Show Network in the United States. It is an example of the prisoner's dilemma game tested on real people, but in an artificial setting. On the game show, three pairs of people compete. When a pair is eliminated, they play a game similar to the prisoner's dilemma to determine how the winnings are split. If they both cooperate (Friend), they share the winnings 50–50. If one cooperates and the other defects (Foe), the defector gets all the winnings and the cooperator gets nothing. If both defect, both leave with nothing.

Notice that the payoff matrix is slightly different from the standard one given above, as the payouts for the "both defect" and the "cooperate while the opponent defects" cases are identical. This makes the "both defect" case a weak equilibrium, compared with being a strict equilibrium in the standard prisoner's dilemma. If you know your opponent is going to vote Foe, then your choice does not affect your winnings. In a certain sense, Friend or Foe has a payoff model between the prisoner's dilemma and the game of Chicken. The payoff matrix is

             Cooperate   Defect
Cooperate      1, 1       0, 2
Defect         2, 0       0, 0

This payoff matrix has also been used on the British television programmes Trust Me, Shafted, The Bank Job and Golden Balls, and on the American show Bachelor Pad. Game data from the Golden Balls series has been analyzed by a team of economists, who found that cooperation was "surprisingly high" for amounts of money that would seem consequential in the real world, but were comparatively low in the context of the game.[20]

Iterated Snowdrift
A modified version of the PD changes the payoff matrix to reduce the cost of cooperating when the partner defects. This may better reflect real-world scenarios: "For example, two scientists collaborating on a report would benefit if the other worked harder. But when your collaborator doesn't do any work, it's probably better for you to do all the work yourself. You'll still end up with a completed project."[21]

Example Snowdrift Payouts (A, B)

               B cooperates   B defects
A cooperates     200, 200     100, 300
A defects        300, 100       0, 0

Example PD Payouts (A, B)

               B cooperates   B defects
A cooperates     200, 200    -100, 300
A defects        300, -100      0, 0

Notes
[1] Fehr E, Fischbacher U. 2003. The nature of human altruism. Nature 425:785–791.
[2] Tversky A. 2004. Preference, Belief, and Similarity: Selected Writings. Cambridge: MIT Press.
[3] Ahn TK, Ostrom E, Walker J. 2003. Incorporating motivational heterogeneity into game theoretic models of collective action. Public Choice 117:295–314.
[4] Oosterbeek H, Sloof R, van de Kuilen G. 2004. Differences in ultimatum game experiments: evidence from a meta-analysis. Exp Econ 7:171–188.
[5] Camerer C. 2003. Behavioral Game Theory. Princeton: Princeton University Press.
[6] Shy, O., 1996, Industrial Organization: Theory and Applications, Cambridge, Mass.: The MIT Press.
[7] For example, see the 2003 study Bayesian Nash equilibrium; a statistical test of the hypothesis (http://econ.hevra.haifa.ac.il/~mbengad/seminars/whole1.pdf) for discussion of the concept and whether it can apply in real economic or strategic situations (from Tel Aviv University).
[8] http://www.ecs.soton.ac.uk/~nrj
[9] The 2004 Prisoners' Dilemma Tournament Results (http://www.prisoners-dilemma.com/results/cec04/ipd_cec04_full_run.html) show the University of Southampton's strategies in the first three places, despite having fewer wins and many more losses than the GRIM strategy. (Note that in a PD tournament, the aim of the game is not to win matches; that can easily be achieved by frequent defection.) It should also be pointed out that even without implicit collusion between software strategies (exploited by the Southampton team), tit for tat is not always the absolute winner of any given tournament; it would be more precise to say that its long-run results over a series of tournaments outperform its rivals. (In any one event a given strategy can be slightly better adjusted to the competition than tit for tat, but tit for tat is more robust.) The same applies for the tit for tat with forgiveness variant, and other optimal strategies: on any given day they might not 'win' against a specific mix of counter-strategies. An alternative way of putting it is using the Darwinian ESS simulation. In such a simulation, tit for tat will almost always come to dominate, though nasty strategies will drift in and out of the population, because a tit for tat population is penetrable by non-retaliating nice strategies, which in turn are easy prey for the nasty strategies. Richard Dawkins showed that here, no static mix of strategies forms a stable equilibrium and the system will always oscillate between bounds.
[10] Le, S. and R. Boyd (2007). "Evolutionary Dynamics of the Continuous Iterated Prisoners' Dilemma". Journal of Theoretical Biology, Volume 245, 258–267.
[11] Hammerstein, P. (2003). Why is reciprocity so rare in social animals? A protestant appeal. In: P. Hammerstein, Editor, Genetic and Cultural Evolution of Cooperation, MIT Press. pp. 83–94.
[12] "Markets & Data" (http://www.economist.com/finance/displaystory.cfm?story_id=9867020). The Economist. 2007-09-27.
[13] George Ainslie (2001). Breakdown of Will. ISBN 0-521-59694-7.
[14] This argument for the development of cooperation through trust is given in The Wisdom of Crowds, where it is argued that long-distance capitalism was able to form around a nucleus of Quakers, who always dealt honourably with their business partners (rather than defecting and reneging on promises, a phenomenon that had discouraged earlier long-term unenforceable overseas contracts). It is argued that dealings with reliable merchants allowed the meme for cooperation to spread to other traders, who spread it further until a high degree of cooperation became a profitable strategy in general commerce.
[15] J. Scott Armstrong and Kesten C. Greene (2007). "Competitor-oriented Objectives: The Myth of Market Share" (http://marketing.wharton.upenn.edu/documents/research/CompOrientPDF11-27(2).pdf). International Journal of Business 12 (1): 116–134.
[16] Nicholson, Walter (2000). Intermediate Microeconomics (8th ed.). Harcourt.
[17] "Tragedy of the commons" (http://en.wikipedia.org/wiki/Tragedy_of_the_commons). En.wikipedia.org. Retrieved 2011-12-17.
[18] "The Volokh Conspiracy: Elinor Ostrom and the Tragedy of the Commons" (http://volokh.com/2009/10/12/elinor-ostrom-and-the-tragedy-of-the-commons/). Volokh.com. 2009-10-12. Retrieved 2011-12-17.
[19] Hofstadter, Douglas R. (1985). Metamagical Themas: Questing for the Essence of Mind and Pattern. Bantam Dell Pub Group. ISBN 0-465-04566-9. See Ch. 29, "The Prisoner's Dilemma Computer Tournaments and the Evolution of Cooperation".
[20] Van den Assem, Martijn J. (January 2012). "Split or Steal? Cooperative Behavior When the Stakes Are Large" (http://ssrn.com/abstract=1592456). Management Science 58 (1): 2–20.
[21] Kümmerli, Rolf. "'Snowdrift' game tops 'Prisoner's Dilemma' in explaining cooperation" (http://phys.org/news111145481.html). Retrieved 11 April 2012.


References
Robert Aumann. Acceptable points in general cooperative n-person games, in R. D. Luce and A. W. Tucker (eds.), Contributions to the Theory of Games IV, Annals of Mathematics Study 40, 287–324, Princeton University Press, Princeton NJ.
Axelrod, R. (1984). The Evolution of Cooperation. ISBN 0-465-02121-2
Bicchieri, Cristina (1993). Rationality and Coordination. Cambridge University Press.
Kenneth Binmore, Fun and Games.
David M. Chess (1988). Simulating the evolution of behavior: the iterated prisoners' dilemma problem. Complex Systems, 2:663–670.
Dresher, M. (1961). The Mathematics of Games of Strategy: Theory and Applications. Prentice-Hall, Englewood Cliffs, NJ.
Flood, M.M. (1952). Some experimental games. Research memorandum RM-789. RAND Corporation, Santa Monica, CA.
Kaminski, Marek M. (2004). Games Prisoners Play (http://webfiles.uci.edu/mkaminsk/www/book.html). Princeton University Press. ISBN 0-691-11721-7
Poundstone, W. (1992). Prisoner's Dilemma. Doubleday, NY.
Greif, A. (2006). Institutions and the Path to the Modern Economy: Lessons from Medieval Trade. Cambridge University Press, Cambridge, UK.
Rapoport, Anatol and Albert M. Chammah (1965). Prisoner's Dilemma. University of Michigan Press.
S. Le and R. Boyd (2007). "Evolutionary Dynamics of the Continuous Iterated Prisoner's Dilemma". Journal of Theoretical Biology, Volume 245, 258–267. Full text (http://letuhuy.bol.ucla.edu/academic/cont_ipd_Le_Boyd_JTB.pdf)
A. Rogers, R. K. Dash, S. D. Ramchurn, P. Vytelingum and N. R. Jennings (2007). Coordinating team players within a noisy iterated Prisoner's Dilemma tournament (http://users.ecs.soton.ac.uk/nrj/download-files/tcs07.pdf). Theoretical Computer Science 377 (1–3): 243–259.
M.J. van den Assem, D. van Dolder and R.H. Thaler (2010). "Split or Steal? Cooperative Behavior When the Stakes are Large" (http://ssrn.com/abstract=1592456)


Further reading
Bicchieri, Cristina and Mitchell Green (1997) "Symmetry Arguments for Cooperation in the Prisoner's Dilemma", in G. Holmstrom-Hintikka and R. Tuomela (eds.), Contemporary Action Theory: The Philosophy and Logic of Social Action, Kluwer. Iterated Prisoner's Dilemma Bibliography web links (http://aleph0.clarku.edu/~djoyce/Moth/webrefs.html), July, 2005. Plous, S. (1993). Prisoner's Dilemma or Perceptual Dilemma? Journal of Peace Research, Vol. 30, No. 2, 163179.

External links
Iterated Prisoner's Dilemma contains strategies that dominate any evolutionary opponent (http://www.pnas.org/content/early/2012/05/16/1206569109.full.pdf+html)
Prisoner's Dilemma (Stanford Encyclopedia of Philosophy) (http://plato.stanford.edu/entries/prisoner-dilemma/)
Effects of Tryptophan Depletion on the Performance of an Iterated Prisoner's Dilemma Game in Healthy Adults (http://www.nature.com/npp/journal/v31/n5/full/1300932a.html). Nature Neuropsychopharmacology.
Is there a "dilemma" in Prisoner's Dilemma? (http://www.egwald.ca/operationsresearch/prisonersdilemma.php) by Elmer G. Wiens
"Games Prisoners Play" (http://webfiles.uci.edu/mkaminsk/www/book.html): game-theoretic analysis of interactions among actual prisoners, including PD.
Iterated prisoner's dilemma game (http://www.iterated-prisoners-dilemma.net/)
Another version of the iterated prisoner's dilemma game (http://kane.me.uk/ipd/)
Another version of the iterated prisoner's dilemma game (http://www.gametheory.net/Web/PDilemma/)
Iterated prisoner's dilemma game (http://www.paulspages.co.uk/hmd/) applied to the Big Brother TV show situation.
The Bowerbird's Dilemma (http://www.msri.org/ext/larryg/pages/15.htm): The Prisoner's Dilemma in ornithology, a mathematical cartoon by Larry Gonick.
Multiplayer game based on the prisoner's dilemma (http://www.gohfgl.com/): play prisoner's dilemma over IRC, by Axiologic Research.
Prisoner's Dilemma Party Game (http://fortwain.com/pddg.html): a party game based on the prisoner's dilemma.
The Edge cites Robert Axelrod's book and discusses the success of U2 following the principles of IPD (http://www.rte.ie/tv/theview/archive/20080331.html).
Classical and Quantum Contents of Solvable Game Theory on Hilbert Space (http://arxiv.org/abs/quant-ph/0503233v2)
"Radiolab: The Good Show" (http://www.radiolab.org/2010/dec/14/). Season 9, episode 1. December 14, 2010. WNYC.


Shapley value
In game theory, the Shapley value, named in honour of Lloyd Shapley, who introduced it in 1953, is a solution concept in cooperative game theory.[1][2] To each cooperative game it assigns a unique distribution (among the players) of the total surplus generated by the coalition of all players. The Shapley value is characterized by a collection of desirable properties, or axioms, described below. Hart (1989) provides a survey of the subject.[3][4]

The setup is as follows: a coalition of players cooperates, and obtains a certain overall gain from that cooperation. Since some players may contribute more to the coalition than others or may possess different bargaining power (for example, threatening to destroy the whole surplus), what final distribution of generated surplus among the players should we expect to arise in any particular game? Or, phrased differently: how important is each player to the overall cooperation, and what payoff can he or she reasonably expect? The Shapley value provides one possible answer to this question.

Formal definition
To formalize this situation, we use the notion of a coalitional game: we start out with a set N (of n players) and a function $v : 2^N \to \mathbb{R}$ with $v(\emptyset) = 0$, where $\emptyset$ denotes the empty set. The function $v$, which maps subsets of players to reals, is called a characteristic function. It has the following meaning: if S is a coalition of players, then v(S), called the worth of coalition S, describes the total expected sum of payoffs the members of S can obtain by cooperation.

The Shapley value is one way to distribute the total gains to the players, assuming that they all collaborate. It is a "fair" distribution in the sense that it is the only distribution with certain desirable properties listed below. According to the Shapley value, the amount that player i gets in a coalitional game (N, v) is

$$\varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!} \bigl( v(S \cup \{i\}) - v(S) \bigr),$$

where n is the total number of players and the sum extends over all subsets S of N not containing player i. The formula can be interpreted as follows: imagine the coalition being formed one actor at a time, with each actor demanding their contribution v(S∪{i}) − v(S) as a fair compensation, and then average over the possible different permutations in which the coalition can be formed.

An alternative equivalent formula for the Shapley value is

$$\varphi_i(v) = \frac{1}{n!} \sum_R \bigl( v(P_i^R \cup \{i\}) - v(P_i^R) \bigr),$$

where the sum ranges over all n! orders R of the players and $P_i^R$ is the set of players in N which precede i in the order R.
Example
Consider a simplified description of a business. We have an owner o, who does not work but provides the crucial capital, meaning that without him no gains can be obtained. Then we have k workers w1,...,wk, each of whom contributes an amount p to the total profit. So N = {o, w1,...,wk} and v(S) = 0 if o is not a member of S, and v(S) = mp if S contains the owner and m workers. Computing the Shapley value for this coalition game leads to a value of kp/2 for the owner and p/2 for each worker.
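This example can be checked by brute force, averaging marginal contributions over all orderings exactly as in the permutation formula above. The values k = 3 and p = 10 are illustrative choices:

```python
from itertools import permutations
from fractions import Fraction

def shapley(players, v):
    # Average each player's marginal contribution v(S ∪ {i}) − v(S)
    # over all n! orders in which the coalition can be formed.
    values = {p: Fraction(0) for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = set()
        for p in order:
            values[p] += Fraction(v(coalition | {p}) - v(coalition))
            coalition.add(p)
    return {p: val / len(perms) for p, val in values.items()}

# Owner-and-workers game with k = 3 workers and profit p = 10 per worker:
k, p = 3, 10
players = ['o'] + [f'w{i}' for i in range(1, k + 1)]

def v(S):
    # Worth mp when the owner plus m workers are present, else 0.
    return p * sum(1 for x in S if x != 'o') if 'o' in S else 0

phi = shapley(players, v)
print(phi['o'])    # kp/2 = 15
print(phi['w1'])   # p/2 = 5
```

The brute-force values match the closed form: the owner receives kp/2 = 15, each worker p/2 = 5, and the values sum to v(N) = kp = 30.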

Shapley value

17

Glove game
The glove game is a coalitional game where the players have left and right hand gloves and the goal is to form pairs.

$$N = \{1, 2, 3\},$$

where players 1 and 2 have right-hand gloves and player 3 has a left-hand glove. The value function for this coalitional game is

$$v(S) = \begin{cases} 1 & \text{if } S \in \bigl\{\{1,3\}, \{2,3\}, \{1,2,3\}\bigr\} \\ 0 & \text{otherwise.} \end{cases}$$

The formula for calculating the Shapley value is

$$\varphi_i(v) = \frac{1}{n!} \sum_R \bigl( v(P_i^R \cup \{i\}) - v(P_i^R) \bigr),$$

where R is an ordering of the players and $P_i^R$ is the set of players in N which precede i in the order R.

The following table displays the marginal contributions of Player 1:

Order       Marginal contribution of Player 1
1, 2, 3     v({1}) − v(∅) = 0
1, 3, 2     v({1}) − v(∅) = 0
2, 1, 3     v({1,2}) − v({2}) = 0
2, 3, 1     v({1,2,3}) − v({2,3}) = 0
3, 1, 2     v({1,3}) − v({3}) = 1
3, 2, 1     v({1,2,3}) − v({2,3}) = 0

Averaging over the six orders gives $\varphi_1(v) = \frac{1}{6}$. By a symmetry argument it can be shown that $\varphi_2(v) = \varphi_1(v) = \frac{1}{6}$. Due to the efficiency axiom we know that the sum of all the Shapley values is equal to v(N) = 1, which means that $\varphi_3(v) = \frac{4}{6} = \frac{2}{3}$.
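The glove-game values can likewise be checked by brute force over all 3! = 6 orderings:

```python
from itertools import permutations
from fractions import Fraction

# Glove game: players 1 and 2 hold right-hand gloves, player 3 a
# left-hand glove; a coalition is worth 1 exactly when it contains
# player 3 together with player 1 or player 2 (it can form a pair).
def v(S):
    return 1 if 3 in S and (1 in S or 2 in S) else 0

players = (1, 2, 3)
phi = {p: Fraction(0) for p in players}
orders = list(permutations(players))
for order in orders:
    before = set()
    for p in order:
        # Marginal contribution of p to the players already present.
        phi[p] += Fraction(v(before | {p}) - v(before))
        before.add(p)
phi = {p: val / len(orders) for p, val in phi.items()}

print(phi[1], phi[2], phi[3])   # 1/6 1/6 2/3
```

The scarce left-hand glove gives player 3 the bulk of the surplus, matching the values derived above.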

Properties
The Shapley value has the following desirable properties:

1. Efficiency: The total gain is distributed:

$$\sum_{i \in N} \varphi_i(v) = v(N).$$

2. Symmetry: If i and j are two actors who are equivalent in the sense that

$$v(S \cup \{i\}) = v(S \cup \{j\})$$

for every subset S of N which contains neither i nor j, then $\varphi_i(v) = \varphi_j(v)$.

3. Additivity: If we combine two coalition games described by gain functions v and w, then the distributed gains should correspond to the gains derived from v and the gains derived from w:

$$\varphi_i(v + w) = \varphi_i(v) + \varphi_i(w)$$

for every i in N.
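Efficiency and additivity can be verified numerically with the same brute-force computation; the second game w below is an arbitrary illustrative choice, not from the source:

```python
from itertools import permutations
from fractions import Fraction

def shapley(players, v):
    # Brute-force Shapley value: average each player's marginal
    # contribution over all orderings of the players.
    phi = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        S = set()
        for p in order:
            phi[p] += Fraction(v(S | {p}) - v(S))
            S.add(p)
    return {p: val / len(orders) for p, val in phi.items()}

players = (1, 2, 3)

def v(S):
    # Glove game from the example above.
    return 1 if 3 in S and (1 in S or 2 in S) else 0

def w(S):
    # An arbitrary second game, used only to illustrate additivity.
    return len(S) ** 2

phi_v = shapley(players, v)
phi_w = shapley(players, w)
phi_vw = shapley(players, lambda S: v(S) + w(S))

# Efficiency: the values sum to the grand coalition's worth v(N).
print(sum(phi_v.values()) == v(set(players)))                    # True
# Additivity: the value of v + w is the sum of the separate values.
print(all(phi_vw[p] == phi_v[p] + phi_w[p] for p in players))    # True
```

Symmetry can be checked the same way: players 1 and 2 are interchangeable in the glove game and indeed receive equal values.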
