Game Playing
Overview
Game Playing
Game problem vs State space problem
Status labeling procedure in Game tree
Nim game problem
Bounded Lood-Ahead Strategy and Use of Evaluation Functions
Using evaluation functions
MINIMAX procedure
Alpha- Beta Pruning
Typical simple case for a game
What is a Game?
-A sequence of choices where each choice is made
from a number of discrete alternatives.
-Each sequence ends in a certain outcome and every
outcome has a definite value for the opening player.
Typical simple case for a game
Types of Games:
Perfect information games: both the players have access to the
same information about the game in progress.
Ex: Chess, Tic-Tac-Toe
Imperfect information games: players do not have access to
complete information about the game
Ex: cards
Typical simple case for a game
Two player, turn-taking
Players alternate moves
Zero-sum games
one players win is the others loss
Discrete:
Finite number of states or configurations
Perfect information:
Fully observable
No information is hidden from either player.
Deterministic:
The outcome of actions is known
Examples: Tic-Tac-Toe, Chess, Nim, etc.
How to play a game
A way to play such a game is to:
Consider all the legal moves you can make
Compute new position resulting from each move
Evaluate each to determine which is best
Make that move
Wait for your opponent to move and repeat
Game problem vs State space problem
Start state, legal moves, and winning positions (goals)
many games can be formulated as search problems
A game can be represented as a tree
A game tree is an explicit representation of all possible plays of
the game.
MIN nodes:
Select the minimum cost successor
MAX nodes:
Select the maximum cost successor
The leaves represent the terminal positions
At the leaf position, when the game is finished we can assign
utility to the player (WIN, LOSS, DRAW)
Game problem vs State space problem
In game playing involving computers, one player is assumed to
be the computer, while other is a human.
Computer- MAX player
Aim: to make the computer win the game by always making
the best possible move at its turn.
Lookahead at all possible moves by generating the complete
game tree and then decide which move is the best for MAX.
Search Problem Formulation
initial state
board, positions of pieces
whose turn is it
successor function (operators)
list of (move, state)
defines the legal moves, and the resulting states
terminal test
also called goal test
determines when the game is over
calculate the result
usually win, lose, draw; sometimes a score (see below)
utility or payoff function
numeric value for the outcome of a game
Two-Person Games
games with two opposing players
often called MIN and MAX
usually MAX moves first, then they take turns
in game terminology, a move comprises two steps (plies)
one by MAX and one by MIN
MAX wants a strategy to find a winning state
no matter what MIN does
MIN does the same
or at least tries to prevent MAX from winning
full information
both players know the full state of the environment
partial information
one player only knows part of the environment
some aspects may be hidden from the opponent, or from both players
Status labeling procedure in Game tree
The leaf nodes are labelled as WIN, LOSS, or DRAW
each non-terminal node in the game tree can be labelled as WIN, LOSS, or
DRAW by using the bottom-up process.
Status labelling procedure:
If j is a non-terminal MAX node, then
STATUS (j) = WIN, if any of j's successor is a WIN
= LOSS, if all j's successor are LOSS
= DRAW, if any of j's successor is a DRAW and none is WIN.
If j is a non-terminal MIN node, then
STATUS (j) = WIN, if all j's successor are WIN
= LOSS, if any of j's successor is a LOSS
=DRAW, if any of j's successor is a DRAM and none is LOSS.
Game Tree: Example
Nim Game Problem
The Game: There is a single pile of five stones and two players. Moves are
made by the players alternately. In a move, each player removes either one or
two stones from the pile. Player who removes the last stone loses the game.
Nim Game Problem
Bounded Look- Ahead Strategy and use of Evaluation
Functions
Status labelling procedure requires the generation of the
complete game tree.
Producing a complete game tree is only possible for simple
games
When you go for games like chess, then you cannot apply
status labelling procedure because state space is just too big.
we will expand the game tree up to a certain depth, and we
will use some heuristics functions to evaluate the position of
the game after that many look- aheads.
Evaluation function
Evaluation function or static evaluator is used to
evaluate the goodness of a game position
Zero-sum assumption lets us use a single evaluation
function to describe goodness of a board wrt both players
f(n) >> 0: position n good for me and bad for you
f(n) << 0: position n bad for me and good for you
f(n) near 0: position n is a neutral position
f(n) = +infinity: win for me
f(n) = -infinity: win for you
MINIMAX Procedure
The procedure through which the scoring
information travels up the game tree is
called MINIMAX procedure.
MINIMAX Procedure
Minimax procedure
Create start node as a MAX node with current board
configuration
Expand nodes down to some depth (a.k.a. ply) of lookahead
in the game
Apply the evaluation function at each of the leaf nodes
Back up values for each of the non-leaf nodes until a value
is computed for the root node
At MIN nodes, the backed-up value is the minimum of the
values associated with its children.
At MAX nodes, the backed-up value is the maximum of the
values associated with its children.
Pick the operator associated with the child node whose
backed-up value determined the value at the root
Minimax Algorithm
Generate the game tree
Apply the utility function to each terminal state to get its value
Use these values to determine the utility of the nodes one level
higher up in the search tree
From bottom to top
For a max level, select the maximum value of its successors
For a min level, select the minimum value of its successors
From root node select the move which leads to highest value
Minimax Algorithm
2
1
2
2
Static evaluator
value
This is the move
selected by minimax
MAX
MIN
Tic-Tac-Toe
X
e(p) = 6 - 5 = 1
Initial State: Board position of 3x3 matrix with 0 and X.
Operators: Putting 0s or Xs in vacant positions alternatively
Terminal test: Which determines game is over
Utility function:
e(p) = (No. of complete rows, columns or diagonals are
still open for player ) (No. of complete rows, columns or
diagonals are still open for opponent )
Partial Game Tree for Tic-Tac-Toe
f(n) = +1 if the position is a
win for X.
f(n) = -1 if the position is a
win for O.
f(n) = 0 if the position is a
draw.
Alpha-Beta Strategy
Maintain two bounds:
Alpha (): a lower bound on best that the
player to move can achieve
Beta (): an upper bound on what the
opponent can achieve
Search, maintaining and
Whenever higher, or higher further search at this node is
irrelevant
How to Prune the Unnecessary
Path
If beta value of any MIN node below a MAX node is less than or
equal to its alpha value, then prune the path below the MIN
node.
If alpha value of any MAX node below a MIN node exceeds the
beta value of the MIN node, then prune the nodes below the
MAX node.
Alpha-beta pruning
We can improve on the performance of the
minimax algorithm through alpha-beta pruning
Basic idea: If you have an idea that is surely bad,
don't take the time to see how truly awful it is. -Pat Winston
MAX
MIN
>=2
=2
We dont need to compute
the value at this node.
<=1
MAX
2
No matter what it is, it cant
affect the value of the root
node.
Alpha-beta pruning
Traverse the search tree in depth-first order
At each MAX node n, alpha(n) = maximum value found
so far
At each MIN node n, beta(n) = minimum value found so
far
The alpha values start at - and only increase, while beta
values start at + and only decrease
Beta cutoff: Given MAX node n, cut off search below n
(i.e., dont generate/examine any more of ns children) if
alpha(n) >= beta(i) for some MIN node ancestor i of n.
Alpha cutoff: stop searching below MIN node n if
beta(n) <= alpha(i) for some MAX node ancestor i of n.
Alpha-beta general example
3
MAX
MIN
12
14 1 - prune
2 - prune
14
Alpha-beta algorithm
function MAX-VALUE (state, , )
;; = best MAX so far; = best MIN
if TERMINAL-TEST (state) then return UTILITY(state)
v := -
for each s in SUCCESSORS (state) do
v := MAX (v, MIN-VALUE (s, , ))
if v >= then return v
:= MAX (, v)
end
return v
function MIN-VALUE (state, , )
if TERMINAL-TEST (state) then return UTILITY(state)
v :=
for each s in SUCCESSORS (state) do
v := MIN (v, MAX-VALUE (s, , ))
if v <= then return v
:= MIN (, v)
end
return v
Effectiveness of alpha-beta
Alpha-beta is guaranteed to compute the same value for the root
node as computed by minimax, with less or equal computation
Worst case: no pruning, examining bd leaf nodes, where each
node has b children and a d-ply search is performed
Best case: examine only (2b)d/2 leaf nodes.
Result is you can search twice as deep as minimax!
Best case is when each players best move is the first alternative
generated
In Deep Blue, they found empirically that alpha-beta pruning
meant that the average branching factor at each node was about 6
instead of about 35!