03 Adversarial Search

The document discusses optimal decision-making in competitive games, focusing on adversarial search problems and the minimax algorithm for determining optimal moves. It introduces alpha-beta pruning to improve efficiency in minimax searches and explores imperfect real-time decisions using heuristic evaluation functions. Additionally, it addresses stochastic games that incorporate chance nodes, leading to the concept of expecti-minimax values for decision-making under uncertainty.

*************************

- GAMES -
*************************
[~] Covers competitive environments, in which the agents' goals are in conflict,
giving rise to adversarial search problems.
[~] Begins with a definition of the optimal move and an algorithm for finding it,
then looks at techniques for choosing a good move when time is limited.
[-] Pruning allows us to ignore portions of the search tree that make no
difference to the final choice.
[-] Heuristic evaluation functions allow us to approximate the true utility of
a state without doing a complete search.
[~] Consider 2-player games between MAX and MIN. A game can be defined by:
[-] S(0): initial state
[-] Player(S): defines which player has the move in state S
[-] Action(S): set of legal moves in a state
[-] Result(S, a): transition model, defines the result of a move
[-] Terminal-Test(S): true when game is over.
[-] Utility(S, p): utility function, defines the final numeric value for a game
that ends in terminal state S for player p (e.g., outcome: 1 = win, 0 = loss, 1/2 = draw)
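The six components above can be sketched as plain Python functions for a toy game (a simple Nim variant, assumed here purely for illustration; none of the names below come from the notes):

```python
# Toy instantiation of the game formalism: simple Nim.
# A state is (stones_remaining, player_to_move); MAX moves first.
# Each move removes 1 or 2 stones; taking the last stone wins.

def initial_state():
    return (5, "MAX")                              # S(0)

def player(s):
    return s[1]                                    # Player(S)

def actions(s):
    return [n for n in (1, 2) if n <= s[0]]        # Action(S): legal moves

def result(s, a):
    # Result(S, a): transition model, hands the move to the other player.
    return (s[0] - a, "MIN" if s[1] == "MAX" else "MAX")

def terminal_test(s):
    return s[0] == 0                               # Terminal-Test(S)

def utility(s, p):
    # Utility(S, p): the player who just took the last stone wins.
    mover = "MIN" if s[1] == "MAX" else "MAX"
    return 1 if mover == p else 0
```

Any game fitting this interface (chess, tic-tac-toe, ...) can be plugged into the search algorithms that follow.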

**********************************************
- OPTIMAL DECISIONS IN GAMES -
**********************************************
[~] In game parlance, a tree that is one move deep consists of two half-moves,
one from each player; each half-move is called a ply.
[~] The optimal strategy can be determined from the minimax value of each node,
MINIMAX(n): the utility of being in the corresponding state, assuming both
players play optimally.
[-] Minimax of a terminal state is its utility.
[-] MINIMAX(S) =
        UTILITY(S)                          if TERMINAL-TEST(S)
        max over a of MINIMAX(RESULT(S, a)) if PLAYER(S) = MAX
        min over a of MINIMAX(RESULT(S, a)) if PLAYER(S) = MIN
[~] The minimax algorithm: computes the minimax decision from the current state.
[-] uses a simple recursive computation of the minimax values of each successor
state.
[-] minimax values are backed up through the tree as the recursion unwinds.
[-] performs a complete depth-first exploration of the game tree => time
complexity: O(b^m), where b is the number of legal moves at each point and m is the maximum depth
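The recursion above can be sketched in a few lines. The game here is a toy Nim (take 1 or 2 stones, taking the last stone wins), assumed only so the example is self-contained and runnable:

```python
# Toy game for illustration: simple Nim, states are (stones, player_to_move).
class Nim:
    def initial_state(self): return (5, "MAX")
    def player(self, s): return s[1]
    def actions(self, s): return [n for n in (1, 2) if n <= s[0]]
    def result(self, s, a): return (s[0] - a, "MIN" if s[1] == "MAX" else "MAX")
    def terminal_test(self, s): return s[0] == 0
    def utility(self, s, p):
        # The player who just took the last stone wins.
        mover = "MIN" if s[1] == "MAX" else "MAX"
        return 1 if mover == p else 0

def minimax_value(s, game):
    # Backed-up minimax value of state s, from MAX's point of view.
    if game.terminal_test(s):
        return game.utility(s, "MAX")
    vals = [minimax_value(game.result(s, a), game) for a in game.actions(s)]
    return max(vals) if game.player(s) == "MAX" else min(vals)

def minimax_decision(s, game):
    # MAX chooses the action leading to the successor of highest value.
    return max(game.actions(s),
               key=lambda a: minimax_value(game.result(s, a), game))
```

The values are computed bottom-up as the recursion unwinds, exactly as the notes describe; every node is visited, giving the O(b^m) cost.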
[~] For multiplayer games, replace the single value at each node with a vector of values.
[-] The backed-up value of a node n is always the utility vector of the
successor state with the highest value for the player choosing at n.
[-] Multiplayer games usually involve alliances: collaboration emerges from purely
selfish behavior.

**************************************
- ALPHA-BETA PRUNING -
**************************************
[~] Problem with minimax search: the number of game states is exponential in the
depth of the tree.
[-] Cannot eliminate the exponent, but can effectively cut it in half.
[-] Possible to compute correct minimax decision without looking at all nodes.
=> Alpha-beta pruning.
[~] Alpha-beta pruning can be applied to trees of any depth, usually possible to
prune entire subtrees.
[~] General principle: consider a node n such that Player has a choice of moving
to it.
[-] If Player has a better choice m, either at the parent node of n or at any
choice point further up, then n will never be reached in actual play.
[~] alpha: value of the best choice found so far at any choice point along the path
for MAX
beta: value of the best choice found so far at any choice point along the path
for MIN
[-] Alpha-beta search updates the values of alpha and beta as it goes along and
prunes the remaining branches at a node as soon as the value of the current node
is known to be worse than the current alpha or beta value for MAX or MIN,
respectively.
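A minimal alpha-beta sketch, again over an assumed toy Nim game (take 1 or 2 stones, taking the last stone wins) so the example is self-contained:

```python
import math

# Toy game for illustration: simple Nim, states are (stones, player_to_move).
def actions(s):       return [n for n in (1, 2) if n <= s[0]]
def result(s, a):     return (s[0] - a, "MIN" if s[1] == "MAX" else "MAX")
def terminal_test(s): return s[0] == 0
def utility(s):       # value for MAX: 1 if MAX took the last stone
    return 1 if s[1] == "MIN" else 0

def alphabeta(s, alpha=-math.inf, beta=math.inf):
    # alpha: best value found so far for MAX along the path;
    # beta:  best value found so far for MIN along the path.
    if terminal_test(s):
        return utility(s)
    if s[1] == "MAX":
        v = -math.inf
        for a in actions(s):
            v = max(v, alphabeta(result(s, a), alpha, beta))
            alpha = max(alpha, v)
            if alpha >= beta:   # MIN above already has a better option:
                break           # prune the remaining successors
        return v
    v = math.inf
    for a in actions(s):
        v = min(v, alphabeta(result(s, a), alpha, beta))
        beta = min(beta, v)
        if alpha >= beta:       # MAX above already has a better option
            break
    return v
```

The pruned branches are exactly those the notes describe: nodes that can never influence the final minimax decision.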
[~] Move ordering: the effectiveness of alpha-beta pruning is highly dependent on
the order in which states are examined.
[-] Might be worthwhile to examine first the successors that are likely to be
the best.
[-] If this can be done, time complexity reduces to O(b^(m/2)); the effective
branching factor becomes sqrt(b) instead of b.
[+] alpha-beta can solve a tree twice as deep as minimax in same amount of
time.
[-] Can add dynamic move-ordering schemes - trying first the moves that were
found to be best in the past
[+] Can apply iterative deepening search: first 1 ply, then 1 ply
deeper,...
[-] Repeated states may occur frequently due to transpositions - different
permutations of the move sequence that yield the same position.
[+] Worthwhile to store the evaluation of the resulting position in a hash
table the first time it is encountered => transposition table.
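A transposition table is just a hash map from positions to their backed-up values; a minimal sketch, memoizing plain minimax over the same assumed toy Nim game:

```python
# Toy game for illustration: simple Nim, states are (stones, player_to_move).
def actions(s):       return [n for n in (1, 2) if n <= s[0]]
def result(s, a):     return (s[0] - a, "MIN" if s[1] == "MAX" else "MAX")
def terminal_test(s): return s[0] == 0
def utility(s):       return 1 if s[1] == "MIN" else 0  # from MAX's viewpoint

transposition_table = {}   # position -> backed-up value

def minimax_tt(s):
    if s in transposition_table:        # position reached before via a
        return transposition_table[s]   # different move order: reuse it
    if terminal_test(s):
        v = utility(s)
    else:
        vals = [minimax_tt(result(s, a)) for a in actions(s)]
        v = max(vals) if s[1] == "MAX" else min(vals)
    transposition_table[s] = v          # store for future transpositions
    return v
```

Real engines key the table by a hash of the position (e.g., Zobrist hashing) rather than the raw state, but the caching idea is the same.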

*************************************************
- IMPERFECT REAL-TIME DECISIONS -
*************************************************
[~] Should cut off the search earlier and apply a heuristic evaluation function to
states
[-] Replace the utility function with a heuristic evaluation function EVAL,
which estimates the position's utility.
[-] Replace the terminal test with a cutoff test, which decides when to apply EVAL.
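The two replacements can be sketched directly: the cutoff test here is a simple depth limit, and the heuristic is an assumption chosen for the toy Nim game (it exploits the known fact that positions with a multiple of 3 stones lose for the mover):

```python
# Toy game for illustration: simple Nim, states are (stones, player_to_move).
def actions(s):       return [n for n in (1, 2) if n <= s[0]]
def result(s, a):     return (s[0] - a, "MIN" if s[1] == "MAX" else "MAX")
def terminal_test(s): return s[0] == 0

def eval_fn(s):
    # Hypothetical heuristic from MAX's viewpoint; exact at terminal states.
    mover_wins = s[0] % 3 != 0
    if s[1] == "MAX":
        return 1 if mover_wins else 0
    return 0 if mover_wins else 1

def h_minimax(s, depth, limit):
    if terminal_test(s) or depth == limit:   # CUTOFF-TEST replaces TERMINAL-TEST
        return eval_fn(s)                    # EVAL replaces UTILITY
    vals = [h_minimax(result(s, a), depth + 1, limit) for a in actions(s)]
    return max(vals) if s[1] == "MAX" else min(vals)
```

With a good heuristic, a shallow cutoff already agrees with the full search; with a poor one, the errors the notes warn about appear.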
[~] Evaluation function: estimates the expected utility of the game from a given position.
[-] First, the evaluation function should order terminal states the same way the
true utility function does: win states must be evaluated better than draws.
[-] Second, the computation must not take too long!
[-] Finally, for nonterminal states, the evaluation function should be strongly
correlated with the actual chances of winning.
[-] Most evaluation functions compute separate numerical contributions from each
feature and then combine them to find the total value.
[+] Mathematically, this is called a weighted linear function:
EVAL(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s), where fi(s) is a feature of the state and wi is the weight of the corresponding feature.
[+] Can lead to errors due to the approximate nature of the evaluation function.
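The weighted linear function is a one-liner; the features and weights below are hypothetical (a chess-like material count, not taken from the notes):

```python
def eval_linear(features, weights):
    # EVAL(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)
    return sum(w * f for w, f in zip(weights, features))

# Hypothetical example: fi(s) = material difference per piece type
# (queens, rooks, pawns), wi = conventional piece values.
features = [1, -1, 2]        # f_i(s) for some imagined position
weights  = [9.0, 5.0, 1.0]   # w_i
score = eval_linear(features, weights)   # 9.0 - 5.0 + 2.0 = 6.0
```

The linear form assumes the features contribute independently; this independence assumption is one source of the errors mentioned above.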
[~] Need a more sophisticated cutoff test:
[-] The EVAL function should only be applied to quiescent positions - those
unlikely to exhibit wild swings in value in the near future.
[-] Horizon effect - arises when facing an opponent's move that causes serious
damage and is ultimately unavoidable.
[+] Can be temporarily avoided by delaying tactics.
[+] Can mitigate the horizon effect with singular extensions - searching
deeper on a move that is "clearly better" than the others.
[-] Forward pruning: prune some moves at a node without considering them
[+] Can use beam search
[+] Dangerous => use ProbCut (prunes based on statistics gained from prior experience).
[-] Lookup tables: use table lookup rather than search for openings and endgames.

************************************
- STOCHASTIC GAMES -
************************************
[~] Unpredictable external events => put us into unforeseen situations.
=> stochastic games.
[~] Must include chance nodes in addition to MAX and MIN nodes.
[-] Branches leading from each chance node denote possible dice rolls, for example.
[~] Need to make correct decisions, but positions don't have definite minimax
values.
[-] But we can calculate expected value.
[-] This leads us to generalize the minimax value for deterministic games to an
expecti-minimax value for games with chance nodes.
[+] Terminal, MAX, and MIN nodes (once the dice roll is known) work exactly
the same way as before.
[+] For chance nodes, we calculate the expected value - the sum of the values
over all outcomes, weighted by their probabilities.
EXPECTI-MINIMAX(S) =
    UTILITY(S)                                         if TERMINAL-TEST(S)
    max over a of EXPECTI-MINIMAX(RESULT(S, a))        if PLAYER(S) = MAX
    min over a of EXPECTI-MINIMAX(RESULT(S, a))        if PLAYER(S) = MIN
    sum over r of P(r) * EXPECTI-MINIMAX(RESULT(S, r)) if PLAYER(S) = CHANCE
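The four cases map directly onto a recursion over a game tree; the tiny hand-built tree below, with its outcomes and probabilities, is assumed purely for illustration:

```python
# A node is a tuple:
#   ("TERM", utility)                      - terminal state
#   ("MAX", [child, ...])                  - MAX to move
#   ("MIN", [child, ...])                  - MIN to move
#   ("CHANCE", [(prob, child), ...])       - chance node (e.g., a dice roll)

def expectiminimax(node):
    kind = node[0]
    if kind == "TERM":
        return node[1]
    if kind == "MAX":
        return max(expectiminimax(c) for c in node[1])
    if kind == "MIN":
        return min(expectiminimax(c) for c in node[1])
    # CHANCE: probability-weighted sum over all outcomes
    return sum(p * expectiminimax(c) for p, c in node[1])

# Hypothetical tree: MAX chooses between a fair coin flip and a sure payoff.
tree = ("MAX", [
    ("CHANCE", [(0.5, ("TERM", 2)), (0.5, ("TERM", -1))]),  # expected value 0.5
    ("TERM", 0.25),                                         # sure payoff
])
```

Here MAX prefers the gamble, since its expected value (0.5) exceeds the sure payoff (0.25), illustrating decision-making by expected value.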
[~] The presence of chance nodes makes the evaluation function more sensitive:
[-] The program behaves totally differently if the scale of some evaluation
values is changed!
[-] To avoid this sensitivity, EVAL must be a positive linear transformation of
the probability of winning from a position.
