
Adversarial Search

[This slide was created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

Game Playing State-of-the-Art

Checkers: 1950: First computer player. 1994: First computer champion: Chinook ended the 40-year reign of human champion Marion Tinsley using a complete 8-piece endgame database. 2007: Checkers solved!

Chess: 1997: Deep Blue defeats human champion Garry Kasparov in a six-game match. Deep Blue examined 200M positions per second, used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply. Current programs are even better, if less historic.

Go: Human champions are now starting to be challenged by machines, though the best humans still beat the best machines. In go, b > 300! Classic programs use pattern knowledge bases, but big recent advances use Monte Carlo (randomized) expansion methods.

Types of Games

Many different kinds of games!

Axes:

Deterministic or stochastic?
One, two, or more players?
Zero sum?
Perfect information (can you see the state)?

Want algorithms for calculating a strategy (policy) which recommends a move from each state

Deterministic Games

Many possible formalizations, one is:

  States: S (start at s0)
  Players: P = {1...N} (usually take turns)
  Actions: A (may depend on player / state)
  Transition function: S x A -> S
  Terminal test: S -> {t, f}
  Terminal utilities: S x P -> R

Solution for a player is a policy: S -> A
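
As a concrete illustration of this formalization (not from the slides), the interface might be sketched in Python roughly as follows; the class and method names are assumptions chosen for readability.

from typing import Any, List

class DeterministicGame:
    """Sketch of the formalization above; all names are illustrative."""
    def start_state(self) -> Any:                # s0 in S
        raise NotImplementedError
    def players(self) -> List[int]:              # P = {1, ..., N}
        raise NotImplementedError
    def actions(self, state) -> List[Any]:       # A (may depend on player / state)
        raise NotImplementedError
    def result(self, state, action) -> Any:      # transition function: S x A -> S
        raise NotImplementedError
    def is_terminal(self, state) -> bool:        # terminal test: S -> {t, f}
        raise NotImplementedError
    def utility(self, state, player) -> float:   # terminal utilities: S x P -> R
        raise NotImplementedError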

Zero-Sum Games

Zero-Sum Games

Agents have opposite utilities (values on outcomes)
Lets us think of a single value that one maximizes and the other minimizes
Adversarial, pure competition

General Games

Agents have independent utilities (values on outcomes)
Cooperation, indifference, competition, and more are all possible

Example Tic-Tac-Toe Game

Adversarial Search (Minimax)

Deterministic, zero-sum games:
  Tic-tac-toe, chess, checkers
  One player maximizes result
  The other minimizes result

Minimax search:
  A state-space search tree
  Players alternate turns
  Compute each node's minimax value: the best achievable utility against a rational (optimal) adversary

[Figure: minimax tree. Minimax values are computed recursively; terminal values are part of the game.]
Minimax

Perfect play for deterministic games

Idea: choose move to position with highest minimax value = best achievable payoff against best play

E.g., 2-player game:

Minimax Implementation
def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, min-value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, max-value(successor))
    return v

Minimax Implementation
def value(state):
    if the state is a terminal state: return the state's utility
    if the next agent is MAX: return max-value(state)
    if the next agent is MIN: return min-value(state)

def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor))
    return v
Minimax algorithm

Properties of minimax

Complete? Yes (if tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m)
Space complexity? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for "reasonable" games -> exact solution completely infeasible
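
A quick back-of-the-envelope check of that claim, using only the numbers from the slide:

b, m = 35, 100   # approximate branching factor and game length for chess
print(b ** m)    # on the order of 10**154 leaf nodes -- far beyond any computer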

Minimax Properties
[Figure: minimax tree with a max node over min nodes; leaf values include 10, 10, 100]

Optimal against a perfect player. Otherwise?
[Demo: min vs exp (L6D2, …)]

Resource Limits

Problem: In realistic games, cannot search to leaves!

Solution: Depth-limited search
  Instead, search only to a limited depth in the tree
  Replace terminal utilities with an evaluation function for non-terminal positions

Example:
  Suppose we have 100 seconds, can explore 10K nodes / sec
  So we can check 1M nodes per move
  α-β reaches about depth 8: decent chess program

Guarantee of optimal play is gone

[Figure: depth-limited search tree with max and min levels; values shown include -1, -2, 9, ?]
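
A minimal Python sketch of depth-limited minimax, assuming the same hypothetical game helpers as before plus an evaluation function eval_fn; none of these names come from the slides.

def depth_limited_value(game, state, depth, eval_fn):
    if game.is_terminal(state):
        return game.utility(state)        # true terminal utility
    if depth == 0:
        return eval_fn(state)             # evaluation replaces terminal utility at the cutoff
    values = [depth_limited_value(game, s, depth - 1, eval_fn)
              for s in game.successors(state)]
    return max(values) if game.to_move(state) == 'MAX' else min(values)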

Evaluation Functions

Evaluation functions score non-terminals in depth-limited search

Ideal function: returns the actual minimax value of the position
In practice: typically a weighted linear sum of features:
  Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
  e.g. f1(s) = (num white queens − num black queens), etc.
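
A sketch of such a weighted linear evaluation in Python; the example feature and weight below are purely illustrative, not from the slides.

def linear_eval(state, features, weights):
    # Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)
    return sum(w * f(state) for w, f in zip(weights, features))

# e.g. features = [lambda s: s.num_white_queens - s.num_black_queens]
#      weights  = [9.0]   (hypothetical material weighting)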

Game Tree Pruning

α-β pruning example

The α-β algorithm

Alpha-Beta
Implementation
α: MAX's best option on path to root
β: MIN's best option on path to root

def max-value(state, α, β):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor, α, β))
        if v ≥ β: return v
        α = max(α, v)
    return v

def min-value(state, α, β):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor, α, β))
        if v ≤ α: return v
        β = min(β, v)
    return v
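
A runnable Python version of the alpha-beta pseudocode above, again assuming the hypothetical game helpers (is_terminal, utility, to_move, successors) used earlier.

def ab_value(game, state, alpha, beta):
    if game.is_terminal(state):
        return game.utility(state)
    if game.to_move(state) == 'MAX':
        return ab_max_value(game, state, alpha, beta)
    return ab_min_value(game, state, alpha, beta)

def ab_max_value(game, state, alpha, beta):
    v = float('-inf')
    for successor in game.successors(state):
        v = max(v, ab_value(game, successor, alpha, beta))
        if v >= beta:             # MIN above would never let play reach here
            return v
        alpha = max(alpha, v)
    return v

def ab_min_value(game, state, alpha, beta):
    v = float('inf')
    for successor in game.successors(state):
        v = min(v, ab_value(game, successor, alpha, beta))
        if v <= alpha:            # MAX above would never let play reach here
            return v
        beta = min(beta, v)
    return v

# Typical top-level call:
# ab_value(game, start_state, float('-inf'), float('inf'))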

Alpha-Beta Pruning
Properties

This pruning has no effect on the minimax value computed for the root!

Values of intermediate nodes might be wrong
  Important: children of the root may have the wrong value
  So the most naïve version won't let you do action selection

Good child ordering improves effectiveness of pruning

With perfect ordering:
  Time complexity drops to O(b^(m/2))
  Doubles solvable depth!
  Full search of, e.g., chess is still hopeless

[Figure: pruned minimax tree with a max node over min nodes; leaf values include 10, 10]
Alpha-Beta Example

