AI Unit 2 Adversarial Search

This document discusses multi-agent environments and adversarial search in AI, focusing on competitive scenarios where agents' goals conflict, leading to games. It covers concepts such as perfect and imperfect information games, zero-sum games, and the minimax algorithm for determining optimal strategies. The document also details the formalization of games, game trees, and the complexities involved in adversarial search problems.

UNIT 2

🞆 Multi-agent environments are environments in which each agent needs to
consider the actions of other agents and how they affect its own welfare.

🞆 The unpredictability of these other agents can introduce contingencies
into the agent's problem-solving process.

🞆 This unit covers competitive environments, in which the agents' goals
are in conflict, giving rise to adversarial search problems, often known
as games.
UNIT-2

🞆 ADVERSARIAL SEARCH

🞆 In which we examine the problems that arise when we try to plan ahead
in a world where other agents are planning against us.
ADVERSARIAL SEARCH
🞆 Mathematical game theory, a branch of economics, views any multi-agent
environment as a game, provided that the impact of each agent on the others
is "significant," regardless of whether the agents are cooperative or
competitive.

• In previous topics, we studied search strategies involving only a single
agent that aims to find a solution, often expressed as a sequence of
actions.

• An environment with more than one agent is termed a multi-agent
environment. There, each agent is an opponent of the others and plays
against them, considering their actions and the effect of those actions on
its own performance.

• So, searches in which two or more players with conflicting goals explore
the same search space for a solution are called adversarial searches, often
known as games.

• Games are modeled as a search problem together with a heuristic
evaluation function; these are the two main factors that help to model and
solve games in AI.
TYPES OF GAMES IN AI:

                        Deterministic                  Chance moves (non-deterministic/stochastic)
Perfect information     Chess, Checkers                Backgammon, Monopoly
Imperfect information   Battleship, blind tic-tac-toe  Bridge, poker, Scrabble, nuclear war
•Perfect information: A game with perfect information is one in which the
agents can see the complete board, so each can observe the other's moves.
Examples: Chess, Checkers, Go.

•Imperfect information: A game in which agents do not have all the
information about the game state and are not fully aware of what is going
on. Examples: Battleship, blind tic-tac-toe.

•Deterministic games: Games that follow a strict pattern and set of rules,
with no randomness associated with them. Examples: Chess, Checkers, Go,
tic-tac-toe.

•Non-deterministic (stochastic) games: Games that involve unpredictable
events and a factor of chance introduced by dice or cards. Examples:
Backgammon, Monopoly, Poker.
ZERO-SUM GAME
• Zero-sum games are adversarial search problems involving pure
competition: each agent's gain or loss of utility is exactly balanced by
the losses or gains of utility of the other agent.
• One player tries to maximize a single value, while the other player
tries to minimize it.
• Each move by one player in the game is called a ply.
• Chess and tic-tac-toe are examples of zero-sum games.
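Formally, for players A and B with utility functions U_A and U_B, the
zero-sum property can be written as a constraint on every terminal state s
(a small formal restatement, not from the slides):

\[
U_A(s) + U_B(s) = 0
\]

More generally the sum may be any fixed constant (a constant-sum game); the
chess payoffs +1, 0, and ½ used later sum to 1 for the two players.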
Zero-sum game: Embedded thinking
The zero-sum game involves embedded thinking, in which one agent or player
is trying to figure out:
 What to do.
 How to decide on a move.
 What the opponent will do, since the opponent is thinking the same way.
🞆 Each player is trying to predict the opponent's response to their own
actions. This requires embedded thinking, or backward reasoning, to solve
game problems in AI.
FORMALIZATION OF THE PROBLEM
🞆 A game can be defined as a type of search problem in AI, formalized with
the following elements (a minimal code sketch follows the list):

• Initial state: Specifies how the game is set up at the start.
• Player(s): Specifies which player has the move in a state.
• Actions(s): Returns the set of legal moves in a state.
• Result(s, a): The transition model, which specifies the result of a move
in the state space.
• Terminal-Test(s): True if the game is over, false otherwise. States
where the game ends are called terminal states.
• Utility(s, p): A utility function gives the final numeric value for a
game that ends in terminal state s for player p. It is also called a
payoff function.
E.g., for chess the outcomes are a win, loss, or draw, with payoff values
+1, 0, and ½.
For tic-tac-toe, the utility values are +1, -1, and 0.
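These elements map directly onto a small programming interface. Below is a
minimal Python sketch of that mapping; the class and method names are
illustrative choices, not code from any textbook or library:

```python
class Game:
    """Formal definition of a game: initial state, player to move,
    legal actions, transition model, terminal test, and utility."""

    def initial_state(self):
        raise NotImplementedError      # how the game is set up at the start

    def player(self, s):
        raise NotImplementedError      # PLAYER(s): whose move it is in s

    def actions(self, s):
        raise NotImplementedError      # ACTIONS(s): legal moves in s

    def result(self, s, a):
        raise NotImplementedError      # RESULT(s, a): transition model

    def terminal_test(self, s):
        raise NotImplementedError      # TERMINAL-TEST(s): is the game over?

    def utility(self, s, p):
        raise NotImplementedError      # UTILITY(s, p): payoff for player p
```

A concrete game (tic-tac-toe, for instance) would subclass this and fill in
each method; the search algorithms later in the unit only need these six
operations.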
GAME TREE:
🞆 A game tree is a tree whose nodes are game states and whose edges are
the moves made by the players.
🞆 A game tree involves the initial state, the actions function, and the
result function.
🞆 Example: Tic-tac-toe game tree:
🞆 The following figure shows part of the game tree for tic-tac-toe.
Following are some key points of the game:
• There are two players, MAX and MIN.
• The players take alternate turns, starting with MAX.
• MAX maximizes the result of the game tree.
• MIN minimizes the result.
🞆 Games, like the real world, therefore require the ability to make some
decision even when calculating the optimal decision is infeasible.
🞆 Game-playing research has therefore spawned a number of interesting
ideas on how to make the best possible use of time.
MIN-MAX GAME
🞆 We first consider games with two players,
whom we call MAX and MIN for reasons that
will soon become obvious.
🞆 MAX moves first, and then they take turns
moving until the game is over.
🞆 At the end of the game, points are awarded
to the winning player and penalties are given
to the loser.
🞆 A game can be formally defined as a kind of
search problem with the following elements:
🞆 S0: The initial state, which specifies how the
game is set up at the start.
🞆 PLAYER(s): Defines which player has the move in
a state.
🞆 ACTIONS(s): Returns the set of legal moves in a
state.
🞆 RESULT(s, a): The transition model, which
defines the result of a move.
🞆 TERMINAL-TEST(s): A terminal test, which is true
when the game is over and false otherwise.
States where the game has ended are called
terminal states.
🞆 UTILITY(s, p): A utility function, which defines
the final numeric value for a game that ends in
terminal state s for player p.
🞆 TIC-TAC-TOE Explanation:

• From the initial state, MAX has 9 possible moves, since MAX goes first.
MAX places X and MIN places O, and the players move alternately until we
reach a leaf node where one player has three in a row or all squares are
filled.
• For each node, both players compute the minimax value: the best
achievable utility against an optimal adversary.
• Suppose both players know tic-tac-toe well and play their best game.
Each player does their best to prevent the other from winning; MIN acts
against MAX.
• So in the game tree we have a layer of MAX and a layer of MIN, and each
layer is called a ply. MAX places X, then MIN places O to prevent MAX from
winning, and the game continues until a terminal node.
• In the end, either MIN wins, MAX wins, or it is a draw. This game tree
is the whole space of possibilities when MIN and MAX play tic-tac-toe,
taking turns alternately.
 Hence adversarial search with the minimax procedure works as follows:

• It aims to find the optimal strategy for MAX to win the game.
• In the game tree, the optimal leaf node could appear at any depth, so
minimax follows the approach of depth-first search.
• Minimax values are propagated up the tree from the terminal nodes as
the search unwinds.
🞆 In a given game tree, the optimal strategy can be determined from the
minimax value of each node, written MINIMAX(n). MAX prefers to move to a
state of maximum value and MIN prefers to move to a state of minimum
value; hence:

\[
\mathrm{MINIMAX}(s) =
\begin{cases}
\mathrm{UTILITY}(s) & \text{if } \mathrm{TERMINAL\text{-}TEST}(s) \\
\max_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MAX} \\
\min_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MIN}
\end{cases}
\]
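This recursive definition translates almost line for line into code. Here
is a minimal Python sketch, assuming a toy tree representation in which a
node is either a number (the utility of a terminal state) or a list of
child subtrees, with MAX and MIN alternating by level; it illustrates the
definition and is not the textbook's pseudocode:

```python
def minimax(node, is_max):
    """Minimax value of a node. A number is a terminal utility;
    a list holds the subtrees reachable by each legal move."""
    if isinstance(node, (int, float)):          # TERMINAL-TEST(s)
        return node                             # UTILITY(s)
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)
```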


🞆 For tic-tac-toe the game tree is relatively small: fewer than
9! = 362,880 terminal nodes. But for chess there are over 10^40 nodes, so
the game tree is best thought of as a theoretical construct that we cannot
realize in the physical world.

🞆 But regardless of the size of the game tree, it is MAX's job to search
for a good move.

🞆 We use the term search tree for a tree that is superimposed on the full
game tree and examines enough nodes to allow a player to determine what
move to make.
OPTIMAL DECISIONS IN GAMES
🞆 In a normal search problem, the optimal solution would be a sequence of
actions leading to a goal state: a terminal state that is a win.
🞆 In adversarial search, MIN has something to say about it.
🞆 MAX therefore must find a contingent strategy, which specifies MAX's
move in the initial state, then MAX's moves in the states resulting from
every possible response by MIN to those moves, and so on.
2-PLY GAME
🞆 Even a simple game like tic-tac-toe is too complex for us to draw the
entire game tree on one page, so we will switch to the trivial game in
Figure 5.2.
🞆 The possible moves for MAX at the root node are labeled a1, a2, and a3.
🞆 The possible replies to a1 for MIN are b1, b2, b3, and so on.
🞆 This particular game ends after one move each by MAX and MIN. (In game
parlance, we say that this tree is one move deep, consisting of two
half-moves, each of which is called a ply.)
🞆 The utilities of the terminal states in this game range from 2 to 14.
🞆 Given a game tree, the optimal strategy can be determined
from the minimax value of each node, which we write as
MINIMAX(n).
🞆 The minimax value of a node is the utility (for MAX) of being
in the corresponding state, assuming that both players play
optimally from there to the end of the game.
🞆 Obviously, the minimax value of a terminal state is just its
utility.
🞆 Furthermore, given a choice, MAX prefers to move to a state
of maximum value, whereas MIN prefers a state of minimum
value.
🞆 So we have the following:

\[
\mathrm{MINIMAX}(s) =
\begin{cases}
\mathrm{UTILITY}(s) & \text{if } \mathrm{TERMINAL\text{-}TEST}(s) \\
\max_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MAX} \\
\min_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MIN}
\end{cases}
\]
🞆 Let us apply these definitions to the game tree in Figure 5.2.
🞆 The terminal nodes on the bottom level get their utility values from the
game’s UTILITY function.
🞆 The first MIN node, labeled B, has three successor states with values 3,
12, and 8, so its minimax value is 3.
🞆 Similarly, the other two MIN nodes have minimax value 2.
🞆 The root node is a MAX node; its successor states have minimax values 3,
2, and 2; so it has a minimax value of 3.
🞆 We can also identify the minimax decision
at the root: action a1 is the optimal choice for
MAX because it leads to the state with the
highest minimax value.
🞆 This definition of optimal play for MAX
assumes that MIN also plays optimally—it
maximizes the worst-case outcome for MAX.
What if MIN does not play optimally? Then it
is easy to show that MAX will do even better.
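Using the minimax sketch from earlier, this calculation can be reproduced
directly. The leaf utilities below B are 3, 12, and 8 as stated above; the
leaves below the other two MIN nodes (2, 4, 6 and 14, 5, 2) are taken from
the standard version of Figure 5.2 and are consistent with the minimax
values quoted here, but are an assumption since the figure itself is not
reproduced:

```python
fig_5_2 = [[3, 12, 8],   # MIN node B -> min = 3
           [2, 4, 6],    # MIN node C -> min = 2
           [14, 5, 2]]   # MIN node D -> min = 2
print(minimax(fig_5_2, is_max=True))  # 3: MAX's best move is a1
```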
THE MINIMAX ALGORITHM
🞆 The minimax algorithm (Figure 5.3) computes the minimax decision from
the current state. It uses a simple recursive computation of the minimax
values of each successor state, directly implementing the defining
equations. The recursion proceeds all the way down to the leaves of the
tree, and then the minimax values are backed up through the tree as the
recursion unwinds. For example, in Figure 5.2, the algorithm first
recurses down to the three bottom-left nodes and uses the UTILITY function
on them to discover that their values are 3, 12, and 8, respectively. Then
it takes the minimum of these values, 3, and returns it as the backed-up
value of node B. A similar process gives the backed-up values of 2 for C
and 2 for D. Finally, we take the maximum of 3, 2, and 2 to get the
backed-up value of 3 for the root node. The minimax algorithm performs a
complete depth-first exploration of the game tree.
🞆 If the maximum depth of the tree is m and there are b legal moves at
each point, then the time complexity of the minimax algorithm is O(b^m).
The space complexity is O(bm) for an algorithm that generates all actions
at once, or O(m) for an algorithm that generates actions one at a time.
For real games, of course, the time cost is totally impractical, but this
algorithm serves as the basis for the mathematical analysis of games and
for more practical algorithms.
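Figure 5.3 itself is not reproduced in these notes. The sketch below is a
hedged Python rendering in the spirit of that pseudocode, written against
the illustrative Game interface given earlier (my naming, not the
textbook's exact code). It returns the action leading to the successor
with the highest backed-up minimax value:

```python
def minimax_decision(game, state):
    """Choose the action for the player to move that maximizes the
    backed-up minimax value, assuming the opponent plays optimally."""
    root_player = game.player(state)

    def value(s):
        if game.terminal_test(s):
            return game.utility(s, root_player)
        vals = [value(game.result(s, a)) for a in game.actions(s)]
        # Maximize at our own nodes, minimize at the opponent's.
        return max(vals) if game.player(s) == root_player else min(vals)

    return max(game.actions(state),
               key=lambda a: value(game.result(state, a)))
```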
EXAMPLE PROBLEM
🞆 Step 1: In the first step, the algorithm generates the entire game tree
and applies the utility function to get the utility values for the
terminal states. In the tree diagram below, let A be the initial state.
Suppose the maximizer takes the first turn, with a worst-case initial
value of -∞, and the minimizer takes the next turn, with a worst-case
initial value of +∞.
Step 2: Now we find the utility values for the maximizer. Its initial
value is -∞, so each terminal value is compared with the maximizer's
current value to determine the value of each higher node; it takes the
maximum among them all.
•For node D: max(-1, -∞) = -1, then max(-1, 4) = 4
•For node E: max(2, -∞) = 2, then max(2, 6) = 6
•For node F: max(-3, -∞) = -3, then max(-3, -5) = -3
•For node G: max(0, -∞) = 0, then max(0, 7) = 7
Step 3: In the next step it is the minimizer's turn, so it compares the
node values with +∞ and finds the third-layer node values.
•For node B: min(4, 6) = 4
•For node C: min(-3, 7) = -3
Step 4: Now it is the maximizer's turn again, and it chooses the maximum
of its children's values to find the value of the root node. In this game
tree there are only 4 layers, so we reach the root node immediately, but
in real games there will be many more layers.
•For node A: max(4, -3) = 4
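With the minimax sketch from earlier, this four-layer tree can be checked
directly; the nested-list encoding below is the same assumed toy
representation:

```python
example = [[[-1, 4], [2, 6]],     # B = min(D, E) = min(4, 6) = 4
           [[-3, -5], [0, 7]]]    # C = min(F, G) = min(-3, 7) = -3
print(minimax(example, is_max=True))  # 4, the value at root node A
```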
🞆 Properties of the minimax algorithm:
• Complete: The minimax algorithm is complete. It will definitely find a
solution (if one exists) in a finite search tree.
• Optimal: The minimax algorithm is optimal if both opponents play
optimally.
• Time complexity: As it performs DFS on the game tree, the time
complexity of the minimax algorithm is O(b^m), where b is the branching
factor of the game tree and m is the maximum depth of the tree.
• Space complexity: The space complexity of the minimax algorithm is also
similar to DFS, which is O(bm).
🞆 Limitation of the minimax algorithm:
🞆 The main drawback of the minimax algorithm is that it gets really slow
for complex games such as chess and Go. These games have a huge branching
factor, and the player has many choices to decide among. This limitation
of the minimax algorithm can be improved upon with alpha-beta pruning,
which is discussed in the next topic.
ALPHA-BETA PRUNING
• Alpha-beta pruning is a modified version of the minimax algorithm: an
optimization technique for minimax search.
• As we saw with the minimax algorithm, the number of game states it has
to examine is exponential in the depth of the tree. We cannot eliminate
the exponent, but we can effectively cut it in half.
• There is a technique by which we can compute the correct minimax
decision without checking each node of the game tree, and this technique
is called pruning.
• It involves two threshold parameters, alpha and beta, for future
expansion, so it is called alpha-beta pruning. It is also known as the
alpha-beta algorithm.
• Alpha-beta pruning can be applied at any depth of a tree, and sometimes
it prunes not only the tree leaves but entire subtrees.

• The two parameters are defined as:

• Alpha: The best (highest-value) choice we have found so far at any
point along the path for the maximizer. The initial value of alpha is -∞.

• Beta: The best (lowest-value) choice we have found so far at any point
along the path for the minimizer. The initial value of beta is +∞.
🞆 Alpha-beta pruning returns the same move as the standard minimax
algorithm does, but it removes all the nodes that do not really affect
the final decision and only make the algorithm slow. Pruning these nodes
makes the algorithm fast.

🞆 Alpha-beta pruning can be applied to trees of any depth, and it is
often possible to prune entire subtrees rather than just leaves.
CONDITIONS:
🞆 Condition for alpha-beta pruning: α >= β

🞆 Key points about alpha-beta pruning (see the sketch after this list):

• The MAX player only updates the value of alpha.
• The MIN player only updates the value of beta.
• While backtracking up the tree, the node values are passed to the upper
nodes, not the values of alpha and beta.
• We only pass the alpha and beta values down to the child nodes.
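A minimal Python sketch of these rules, using the same nested-list tree
representation as the earlier minimax sketch (an illustration of the
technique, not the textbook's Figure 5.7):

```python
def alphabeta(node, is_max, alpha=float("-inf"), beta=float("inf")):
    """Minimax with alpha-beta pruning. alpha is the best value found
    so far along the path for MAX, beta the best for MIN; the remaining
    children of a node are pruned as soon as alpha >= beta."""
    if isinstance(node, (int, float)):    # terminal: return its utility
        return node
    if is_max:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)     # MAX updates only alpha
            if alpha >= beta:             # MIN above will never allow this
                break                     # prune the remaining children
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)           # MIN updates only beta
        if alpha >= beta:
            break
    return value
```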
Step 1: At the first step, the MAX player makes the first move from node
A, where α = -∞ and β = +∞. These values of alpha and beta are passed
down to node B, where again α = -∞ and β = +∞, and node B passes the same
values to its child D.
Step 2: At node D it is MAX's turn, so the value of α is calculated. The
value of α is compared first with 2 and then with 3, so max(2, 3) = 3
becomes the value of α at node D, and the node value will also be 3.

Step 3: The algorithm now backtracks to node B, where the value of β will
change, as it is MIN's turn. Now β = +∞ is compared with the available
successor node value, i.e. min(∞, 3) = 3; hence at node B now α = -∞ and
β = 3.

In the next step, the algorithm traverses the next successor of node B,
which is node E, and the values α = -∞ and β = 3 are passed down as well.
Step 4: At node E, MAX takes its turn, and the value of alpha changes.
The current value of alpha is compared with 5, so max(-∞, 5) = 5; hence
at node E, α = 5 and β = 3, where α >= β. The right successor of E is
therefore pruned, the algorithm does not traverse it, and the value at
node E is 5.
Step 5: At the next step, the algorithm backtracks again, from node B to
node A. At node A the value of alpha changes to the maximum available
value, 3, since max(-∞, 3) = 3, with β = +∞. These two values are now
passed to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is again compared with the left child,
which is 0, giving max(3, 0) = 3, and then with the right child, which is
1, giving max(3, 1) = 3. α remains 3, but the node value of F becomes 1.
🞆 Step 7: Node F returns the node value 1 to node C. At C, α = 3 and
β = +∞; here the value of beta changes, compared with 1, so
min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the
condition α >= β, so the next child of C, which is G, is pruned, and the
algorithm does not compute the entire subtree under G.
🞆 Step 8: C now returns the value 1 to A, where the best value for A is
max(3, 1) = 3. The final game tree shows which nodes were computed and
which were never computed. The optimal value for the maximizer in this
example is 3.
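Running the alpha-beta sketch above on this example tree reproduces the
walkthrough. The leaves cut off in Steps 4 and 7 are filled with a
placeholder value, which is safe precisely because the pruned search never
examines them:

```python
PRUNED = 99  # placeholder utility; never examined by the search
walkthrough = [[[2, 3], [5, PRUNED]],        # B: D = 3, E cut off at 5
               [[0, 1], [PRUNED, PRUNED]]]   # C: F = 1, G pruned entirely
print(alphabeta(walkthrough, is_max=True))   # 3, the optimal value for MAX
```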
2-PLY WITH ALPHA BETA PRUNING
🞆 Consider again the two-ply game tree from
Figure 5.2. Let’s go through the calculation of
the optimal decision once more, this time
paying careful attention to what we know at
each point in the process. The steps are
explained in Figure 5.5. The outcome is that
we can identify the minimax decision without
ever evaluating two of the leaf nodes.
🞆 The general principle is this: consider a node
n somewhere in the tree (see Figure 5.6),
such that Player has a choice of moving to
that node. If Player has a better choice m
either at the parent node of n or at any
choice point further up, then n will never be
reached in actual play. So once we have
found out enough about n (by examining
some of its descendants) to reach this
conclusion, we can prune it.
🞆 Alpha–beta search updates the values of α
and β as it goes along and prunes the
remaining branches at a node (i.e.,
terminates the recursive call) as soon as the
value of the current node is known to be
worse than the current α or β value for MAX
or MIN, respectively. The complete algorithm
is given in Figure 5.7.
SUMMARY
🞆 A game can be defined by the initial state, the legal actions in each
state, the result of each action, a terminal test (which says when the
game is over), and a utility function that applies to terminal states.
🞆 In two-player zero-sum games with perfect
information, the minimax algorithm can
select optimal moves by a depth-first
enumeration of the game tree.
🞆 The alpha–beta search algorithm computes
the same optimal move as minimax, but
achieves much greater efficiency by
eliminating subtrees that are provably
irrelevant.
OPTIMAL DECISIONS IN
MULTIPLAYER GAMES

🞆 Many popular games allow more than two players. Let us examine how to
extend the minimax idea to multiplayer games. This is straightforward
from the technical viewpoint, but it raises some interesting new
conceptual issues.
🞆 First, we need to replace the single value for each node with a vector
of values. For example, in a three-player game with players A, B, and C,
a vector (vA, vB, vC) is associated with each node. For terminal states,
this vector gives the utility of the state from each player's viewpoint.
(In two-player, zero-sum games, the two-element vector can be reduced to
a single value because the values are always opposite.) The simplest way
to implement this is to have the UTILITY function return a vector of
utilities.
🞆 Now we have to consider nonterminal states. Consider the node marked X
in the game tree shown in Figure 5.4. In that state, player C chooses
what to do. The two choices lead to terminal states with utility vectors
(vA = 1, vB = 2, vC = 6) and (vA = 4, vB = 2, vC = 3). Since 6 is bigger
than 3, C should choose the first move. This means that if state X is
reached, subsequent play will lead to a terminal state with utilities
(vA = 1, vB = 2, vC = 6). Hence, the backed-up value of X is this vector.
The backed-up value of a node n is always the vector of the successor
state with the highest value for the player choosing at n.
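A brief Python sketch of this vector backup rule; the representation below
(terminal nodes as (vA, vB, vC) tuples, interior nodes as
[player_index, children] pairs) is an assumption for illustration, in the
spirit of the max-n generalization of minimax rather than any textbook
code:

```python
def vector_value(node):
    """Backed-up utility vector of a node in a three-player game tree.
    Terminal nodes are tuples (vA, vB, vC); interior nodes are lists
    [player_index, children] naming who chooses at that node."""
    if isinstance(node, tuple):              # terminal state
        return node
    player, children = node
    # Back up the successor vector that is best for the player to move.
    return max((vector_value(c) for c in children),
               key=lambda v: v[player])

# The node marked X in Figure 5.4, where player C (index 2) chooses:
x = [2, [(1, 2, 6), (4, 2, 3)]]
print(vector_value(x))  # (1, 2, 6): C prefers vC = 6 over vC = 3
```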
🞆 Anyone who plays multiplayer games, such as Diplomacy, quickly becomes
aware that much more is going on than in two-player games. Multiplayer
games usually involve alliances, whether formal or informal, among the
players. Alliances are made and broken as the game proceeds. How are we
to understand such behavior? Are alliances a natural consequence of
optimal strategies for each player in a multiplayer game? It turns out
that they can be.
🞆 For example, suppose A and B are in weak positions and C is in a
stronger position. Then it is often optimal for both A and B to attack C
rather than each other, lest C destroy each of them individually. In this
way, collaboration emerges from purely selfish behavior. Of course, as
soon as C weakens under the joint onslaught, the alliance loses its
value, and either A or B could violate the agreement. In some cases,
explicit alliances merely make concrete what would have happened anyway.
In other cases, a social stigma attaches to breaking an alliance, so
players must balance the immediate advantage of breaking an alliance
against the long-term disadvantage of being perceived as untrustworthy.
🞆 If the game is not zero-sum, then collaboration can also occur with
just two players. Suppose, for example, that there is a terminal state
with utilities (vA = 1000, vB = 1000) and that 1000 is the highest
possible utility for each player. Then the optimal strategy is for both
players to do everything possible to reach this state; that is, the
players will automatically cooperate to achieve a mutually desirable
goal.
