
Game Theory in Cognitive Science

Bonnie's Deal
Here is the deal that we are offering both you
and Clyde:
If you confess, and Clyde doesn't, you'll go free
and he gets 20 years. On the other hand, if you
don't confess, and he does, then he goes free, and
you get 20 years.
If you both confess, you both get 10 years in
prison.
If you both clam up, we'll convict you both on a
lesser charge (endangering corn), and you both get
1 year.

Sentences for each pair of choices (Bonnie's sentence, Clyde's sentence):

                          Clyde: Clam up        Clyde: Stool Pigeon
Bonnie: Clam up           1 year, 1 year        20 years, 0 years
Bonnie: Stool Pigeon      0 years, 20 years     10 years, 10 years

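Prisoner's Dilemma: The Formal Conditions

(XY denotes the payoff to a player who plays X while the other player plays Y; C = cooperate, D = defect.)

DC > CC > DD > CD

CC > (DC + CD)/2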
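The two conditions can be checked mechanically. The sketch below is only an illustration: the function name `is_prisoners_dilemma` is made up for this example, and the Bonnie and Clyde sentences are entered as negative numbers on the assumption that a year in prison counts as a payoff of -1.

```python
def is_prisoners_dilemma(dc, cc, dd, cd):
    """Return True when the four payoffs satisfy both formal conditions."""
    ordering = dc > cc > dd > cd                # DC > CC > DD > CD
    no_alternating_gain = cc > (dc + cd) / 2    # CC > (DC + CD)/2
    return ordering and no_alternating_gain


# Blue/Red payoffs: 5, 3, 1, 0.
print(is_prisoners_dilemma(dc=5, cc=3, dd=1, cd=0))        # True
# Bonnie and Clyde, years in prison as negative payoffs (an assumption).
print(is_prisoners_dilemma(dc=0, cc=-1, dd=-10, cd=-20))   # True
# Raising the temptation to 7 breaks the second condition: 3 is not > 3.5.
print(is_prisoners_dilemma(dc=7, cc=3, dd=1, cd=0))        # False
```

The second condition rules out games in which two players would do better by taking turns exploiting each other than by cooperating steadily.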

Payoffs for each pair of choices (Red's payoff, Blue's payoff):

                    Blue: Cooperate     Blue: Defect
Red: Cooperate      3, 3                0, 5
Red: Defect         5, 0                1, 1

Game Theory: Basic Taxonomy


Zero- vs. non-zero sum
Two- vs. N-person games
Finite vs. infinite number of choices
Iterated vs. non-iterated games
Games of perfect information

Two-Person Zero Sum Games


The notion of a dominant choice
Tricks for reducing game matrices
The solution of a zero-sum game
The value of a game
The minimax theorem

Pure vs. Mixed strategies

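[Figure: payoff matrix for Blue against Red, with a "Worst I can do" margin. The matrix did not survive extraction intact; the values 1, 8, 10 and 11 remain.]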

[Figure: payoff matrix for Scissors, Paper, Stone, Blue against Red, with a "Worst I can do" margin. Only entries of -1 survive extraction.]

The Minimax Theorem in Game Theory
Applies to zero-sum, two-person finite games.
The minimax theorem says that in such a game, there is a
value V for the game (the same value V for both players).
Given an optimal strategy (possibly a mixed strategy),
one player can be assured (on average) of winning at
least V, and the other of losing at most V, regardless of
what the opponent does.
What this means, essentially, is that both players can
examine such a matrix and determine beforehand (and
regardless of the other player's plan) what they need to do
to ensure an average outcome of V for the game.
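As a small illustration of the theorem, the sketch below computes the "worst I can do" for each pure strategy in Scissors, Paper, Stone and for the uniform mixed strategy. The payoff matrix is an assumption (win = 1, tie = 0, loss = -1), since the matrix itself is not legible in this document, and `worst_case` is a name chosen for this example.

```python
from fractions import Fraction

# Assumed payoffs to the row player; rows and columns are Scissors, Paper, Stone.
PAYOFF = [
    [0, 1, -1],   # Scissors: ties Scissors, beats Paper, loses to Stone
    [-1, 0, 1],   # Paper
    [1, -1, 0],   # Stone
]


def worst_case(mix, payoff):
    """Lowest expected payoff of a mixed strategy over the opponent's pure replies."""
    columns = range(len(payoff[0]))
    return min(sum(p * payoff[r][c] for r, p in enumerate(mix)) for c in columns)


# Pure strategies are mixes that put probability 1 on a single row.
pure = [[Fraction(int(r == k)) for r in range(3)] for k in range(3)]
pure_worst = [worst_case(m, PAYOFF) for m in pure]
uniform = [Fraction(1, 3)] * 3

print(pure_worst)                   # every pure strategy can lose 1
print(worst_case(uniform, PAYOFF))  # the uniform mix guarantees 0 on average
```

Checking only the opponent's pure replies is enough here, because the expected payoff against any mixed reply is a weighted average of the payoffs against pure replies. Under the assumed matrix, the value of the game is V = 0.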

A Non-Zero-Sum Game
Blue vs. Red (each cell lists two payoffs, in source order):

    8, 4        5, 7
    6, 2        4, 8

Blue vs. Red (each cell lists two payoffs, in source order):

    5, 6        7, 2
    3, 7        9, 8

Sentences for each pair of choices (Bonnie's sentence, Clyde's sentence):

                          Clyde: Clam up        Clyde: Stool Pigeon
Bonnie: Clam up           1 year, 1 year        20 years, 0 years
Bonnie: Stool Pigeon      0 years, 20 years     10 years, 10 years

Payoffs for each pair of choices (Red's payoff, Blue's payoff):

                    Blue: Cooperate     Blue: Defect
Red: Cooperate      3, 3                0, 5
Red: Defect         5, 0                1, 1

Blue vs. Red (each cell lists two payoffs, in source order):

    6, 6        4, 7
    7, 4        -3, -3

Blue vs. Red (each cell lists two payoffs, in source order):

    3, 3        1, 3
    3, 1        0, 0

Blue vs. Red (each cell lists two payoffs, in source order):

    4, 4        1, 3
    3, 1        0, 0

The Ultimatum Game: adding intangibles to a utility function
Two players, Red and Blue. Red is given
$1000 and is told that he can choose an
amount to be taken out of this total to offer
Blue. Blue then chooses whether to accept
this amount or not.
If Blue accepts, Blue receives the offered amount and Red keeps the rest.
If Blue rejects Red's offer, both get nothing.
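One way to picture "adding intangibles to a utility function" is to subtract a penalty from Blue's monetary gain whenever Red walks away with more. The sketch below is an assumption made for illustration, not a model taken from the slides: the `envy` weight and the linear form of the penalty are hypothetical.

```python
TOTAL = 1000  # dollars given to Red, as in the text


def blue_utility(offer, envy=0.0):
    """Blue's utility for accepting: money minus a penalty for getting less than Red.

    `envy` is a hypothetical weight on the intangible; envy=0 means money only.
    """
    red_keeps = TOTAL - offer
    return offer - envy * max(0, red_keeps - offer)


def blue_accepts(offer, envy=0.0):
    """Accept when accepting is better than rejecting, which is worth 0 to Blue."""
    return blue_utility(offer, envy) > 0


def smallest_accepted_offer(envy):
    """Lowest whole-dollar offer Blue accepts, found by direct search."""
    return next(o for o in range(TOTAL + 1) if blue_accepts(o, envy))


print(smallest_accepted_offer(0.0))  # 1: a money-only Blue takes any positive offer
print(smallest_accepted_offer(0.5))  # 251: with envy, low offers are turned down
```

With `envy=0` the purely monetary analysis says Red should offer almost nothing. Once the intangible is included, rejecting a low offer becomes the better choice for Blue, which changes what Red should offer.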

Some Simple Axelrod-Tournament-like Strategies

All-Defect simply defects on every round
Poor-Trusting-Fool simply cooperates on every round
Random is a test strategy that simply cooperates or defects randomly
Unforgiving cooperates initially until the first time that the other player defects; after that, Unforgiving defects forever
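The four strategies above can be sketched as small functions. The interface is an assumption made for this example: each function receives the list of the other player's past moves ("C" or "D") and returns its own next move.

```python
import random


# Each strategy sees the other player's past moves and returns "C" or "D".
def all_defect(their_moves):
    return "D"


def poor_trusting_fool(their_moves):
    return "C"


def random_strategy(their_moves, rng=random):
    # Cooperate or defect with equal probability, ignoring the history.
    return rng.choice("CD")


def unforgiving(their_moves):
    # Cooperate until the other player has defected at least once, then defect forever.
    return "D" if "D" in their_moves else "C"


# Unforgiving's move on each round as the other player's history grows.
history = ["C", "C", "D", "C"]
print([unforgiving(history[:n]) for n in range(len(history) + 1)])
# ['C', 'C', 'C', 'D', 'D']
```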

The Tit-for-Tat Strategy


Tit-for-Tat cooperates on the first round.
Thereafter, on every subsequent round, it
simply imitates what the other player did on
the previous round. (That is, Tit-for-Tat
does at round N what the other player did at
round N-1.)
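A minimal sketch of Tit-for-Tat in an iterated game, using the 3/0/5/1 payoffs from the Blue and Red matrix earlier in the document. The `play` helper and the convention that a strategy receives the other player's past moves are assumptions made for this example.

```python
# (my move, their move) -> my score, using the 3/0/5/1 payoffs.
SCORE = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(their_moves):
    # Cooperate on the first round, then copy the other player's previous move.
    return their_moves[-1] if their_moves else "C"


def all_defect(their_moves):
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play an iterated game and return (score_a, score_b)."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Both players choose simultaneously, seeing only earlier rounds.
        a, b = strategy_a(moves_b), strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        score_a += SCORE[(a, b)]
        score_b += SCORE[(b, a)]
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat, 10))  # (30, 30)
print(play(tit_for_tat, all_defect, 10))   # (9, 14)
```

Against All-Defect, Tit-for-Tat is exploited only on the first round; it never outscores its opponent in a single match, but it loses by little and does well with other cooperators.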

A British staff officer on a tour of the trenches remarked that he was
"astonished to observe German soldiers walking
about within rifle range behind their own line. Our
men appeared to take no notice. I privately made up
my mind to do away with that sort of thing when we
took over; such things should not be allowed. These
people evidently did not know there was a war on.
Both sides apparently believed in the policy of 'live
and let live.'"
Dugdale 1932, quoted in Axelrod, p. 74

The high commands of the two sides did not share the view of the common soldier who said:
"The real reason for the quietness of some sections
of the line was that neither side had any intention of
advancing in that particular district... If the British
shelled the Germans, the Germans replied, and the
damage was equal: if the Germans bombed an
advanced piece of trench and killed five
Englishmen, an answering fusillade killed five
Germans."
Belton Cobb 1916, quoted in Axelrod, p. 76

The ethics that developed are illustrated in this incident, related by a British officer
recalling his experience while facing a Saxon
unit of the German Army.
"I was having tea with A Company when we heard a
lot of shouting and went out to investigate. We found
our men and the Germans standing on their respective
parapets. Suddenly a salvo arrived but did no damage.
Naturally both sides got down and our men started
swearing at the Germans, when all at once a brave
German got on to his parapet and shouted out 'We are
very sorry about that; we hope no one was hurt. It is
not our fault, it is that damned Prussian artillery.'"
Rutter 1934, quoted in Axelrod, p. 85

Simulating Evolution in a PD World
Consider a world of animals, each one playing in a PD tournament
against representative rules from the original tournament.
Each animal is completely determined by its actions in the PD game.
Each animal has a memory of the past three rounds only.
Since each round has four possible outcomes (DC, CC, CD, DD), there are
4 x 4 x 4 = 64 possible three-round histories, and each animal must hold
a response to every one of them. Thus, each
animal is essentially determined by a 64-entry table (one entry for each
of the possible preceding three-round sets).
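The count of sixty-four can be confirmed by enumerating the histories directly. In this sketch the ordering of the four outcomes and the base-4 indexing scheme are assumptions chosen for illustration; the source does not say how table entries are ordered.

```python
from itertools import product

OUTCOMES = ["CC", "CD", "DC", "DD"]  # own move first, then the other player's move

# Every possible three-round history, oldest round first.
HISTORIES = list(product(OUTCOMES, repeat=3))


def history_index(history):
    """Map a three-round history to its position (0-63) in the response table."""
    index = 0
    for outcome in history:
        # Treat each outcome as one base-4 digit.
        index = index * 4 + OUTCOMES.index(outcome)
    return index


print(len(HISTORIES))                                  # 64
print(history_index(("CC", "CC", "CC")))               # 0
print(history_index(("DD", "DD", "DD")))               # 63
```

An animal's behavior is then a list of 64 moves, and its response to the last three rounds is the entry at `history_index(...)`.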

PD World, Continued
It seems, then, that we need sixty-four bits to specify an
animal. To be a bit more precise, since each animal must
begin on the first round by making some choice, we include
an additional six bits to specify which of the sixty-four table
choices we will begin with. Thus, each animal requires
seventy bits total.
In each generation, a population of our animals plays against
the representative rules. The best-scoring of these animals are
preferentially allowed to survive to the next generation.
We also include new animals created by mating between
the successful animals. New animals are derived via
crossover (between two animals) and occasional mutations.

Rules that Evolve in the PD system
Don't rock the boat: continue to cooperate after three
mutual cooperations (cooperate after CC, CC, CC)
Be provocable: defect when the other player defects out of
the blue (defect after CC, CC, CD)
Accept an apology: continue to cooperate after cooperation
has been restored (cooperate after DC, CD, CC)
Forget: cooperate when mutual cooperation has been
restored after an exploitation (cooperate after CD, CC, CC)
Accept a rut: defect after three mutual defections (defect
after DD, DD, DD)
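The five rules above are statements about particular entries of an animal's 64-entry table, so they can be checked directly. This sketch assumes a base-4 indexing of histories (own move first in each outcome, oldest round first); the indexing and the helper names are illustrative, not taken from the source.

```python
OUTCOMES = ["CC", "CD", "DC", "DD"]  # own move first, then the other player's move

# The five evolved rules from the text: three-round history -> expected response.
EVOLVED_RULES = {
    ("CC", "CC", "CC"): "C",  # don't rock the boat
    ("CC", "CC", "CD"): "D",  # be provocable
    ("DC", "CD", "CC"): "C",  # accept an apology
    ("CD", "CC", "CC"): "C",  # forget
    ("DD", "DD", "DD"): "D",  # accept a rut
}


def history_index(history):
    """Map a three-round history to its position (0-63) in the response table."""
    index = 0
    for outcome in history:
        index = index * 4 + OUTCOMES.index(outcome)
    return index


def follows_evolved_rules(table):
    """True when a 64-entry table of "C"/"D" responses obeys all five rules."""
    return all(table[history_index(h)] == move for h, move in EVOLVED_RULES.items())


# Tit-for-Tat as a table: repeat the other player's move from the latest round.
# i % 4 is the latest outcome; 0 (CC) and 2 (DC) mean the other player cooperated.
tit_for_tat_table = ["C" if i % 4 in (0, 2) else "D" for i in range(64)]
print(follows_evolved_rules(tit_for_tat_table))  # True
print(follows_evolved_rules(["C"] * 64))         # False: never provocable
```

Tit-for-Tat satisfies all five rules, which fits the observation that the evolved animals behave much like it.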

