AI - Unit 3
Adversarial Search Methods (Game Theory) - Mini max algorithm - Alpha beta pruning -
Constraint satisfaction problems – Constraints – Crypt Arithmetic Puzzles – Constraint
Domain – CSP as a search problem (Room colouring).
Adversarial search
• Adversarial search is a search in which we examine the problems that arise when we try
to plan ahead in a world where other agents are planning against us.
• The Adversarial Search involves more than one entity, each with competing aims and
purposes. These entities are put against one another in a game-like environment and each
player's strategy or game approach alters depending on the opponent's move.
• Used in game playing in which one can trace the movement of an enemy or opponent.
Adversarial search
• Blind and heuristic search strategies are associated with only a single agent that aims to
find a solution, which is often expressed as a sequence of actions.
• But there might be some situations where more than one agent is searching for the solution
in the same search space, and this situation usually occurs in game playing.
• An environment with more than one agent is termed a multi-agent environment. In a game,
each agent is an opponent of the others and plays against them; each agent needs to
consider the actions of the other agents and the effect of those actions on its own performance.
• Searches in which two or more players with conflicting goals are trying to explore the same
search space for the solution, are called adversarial searches, often known as Games.
• Modelling the game as a search problem and defining a heuristic evaluation function are the two
main factors which help to model and solve games in AI.
Types of Games in AI
Perfect information
• Agents can look into the complete board.
• Agents have all the information about the game, and they can see each other moves also.
• Examples are Chess, Checkers, Go, etc.
Imperfect information
• Agents do not have all the information about the game and are not aware of everything that is going on.
• Examples are Battleship, blind tic-tac-toe, Bridge, etc.
Deterministic games
• Games which follow a strict pattern and set of rules for the games
• There is no randomness associated with them.
• Examples are chess, Checkers, Go, tic-tac-toe, etc.
Non-deterministic games
• Games which have various unpredictable events and have a factor of chance or luck.
• This factor of chance or luck is introduced by either dice or cards.
• These are random, and each action response is not fixed.
• Such games are also called as stochastic games.
• Example: Backgammon, Monopoly, Poker, etc.
Zero-Sum Game
A zero-sum game is a mathematical representation, in game theory and economic theory, of a
situation that involves two sides, where the result is an advantage for one side and an equivalent
loss for the other.
In other words, player one's gain is equivalent to player two's loss, with the result that the net
improvement in benefit of the game is zero.
Solving such game problems in AI requires embedded thinking or backward reasoning.
Formalization of the problem
A game can be defined as a type of search in AI which can be formalized with the following
elements:
• Initial state: It specifies how the game is set up at the start.
• Player(s): It specifies which player has the move in a state.
• Action(s): It returns the set of legal moves in state space.
• Result(s, a): It is the transition model, which specifies the result of moves in the state
space.
• Terminal-Test(s): The terminal test is true if the game is over and false otherwise. States
where the game has ended are called terminal states.
• Utility(s, p): A utility function gives the final numeric value for a game that ends in
terminal states s for player p. It is also called payoff function.
For Chess, the outcomes are a win, loss, or draw and its payoff values are +1, 0, ½.
And for tic-tac-toe, utility values are +1, -1, and 0.
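To make this formalization concrete, here is a minimal Python sketch (not from the slides) of the six elements for a toy two-player game; the game, its rules, and all names are illustrative assumptions.

# Toy game: the two players alternately pick one number from a shared list;
# MAX wants the difference (MAX total - MIN total) to be as high as possible.
class PickNumberGame:
    def initial_state(self):
        # Initial state: remaining numbers, MAX's total, MIN's total, whose turn it is.
        return ([3, -1, 4, 2], 0, 0, "MAX")

    def player(self, state):
        return state[3]                        # Player(s): which player has the move

    def actions(self, state):
        return list(range(len(state[0])))      # Actions(s): indices of pickable numbers

    def result(self, state, action):           # Result(s, a): transition model
        remaining, max_total, min_total, turn = state
        picked = remaining[action]
        rest = remaining[:action] + remaining[action + 1:]
        if turn == "MAX":
            return (rest, max_total + picked, min_total, "MIN")
        return (rest, max_total, min_total + picked, "MAX")

    def terminal_test(self, state):
        return len(state[0]) == 0              # Terminal-Test(s): is the game over?

    def utility(self, state, player):
        # Utility(s, p): final payoff for player p in terminal state s.
        score = state[1] - state[2]
        return score if player == "MAX" else -score

game = PickNumberGame()
s = game.initial_state()
print(game.player(s), game.actions(s), game.terminal_test(s))   # MAX [0, 1, 2, 3] False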
Mini-Max Algorithm
• Minimax is a kind of backtracking algorithm that is used in decision making and game
theory to find the optimal move for a player, assuming that your opponent also plays
optimally.
• It is widely used in two player turn-based games such as Tic-Tac-Toe, Backgammon,
Mancala, Chess, etc.
• In Minimax the two players are called the maximizer and the minimizer.
• The maximizer (Player 1) tries to get the highest score possible, while
the minimizer (Player 2) tries to minimize Player 1's score.
Minimax Algorithm Steps
• Scoring-Based Games: the final difference between Player 1's and Player 2's scores is used;
Player 1 Score - Player 2 Score determines the value of each leaf node.
Mini-Max Algorithm
In the tree diagram below, find the utility values and the best strategy for Max.
Let A be the initial state of the tree.
Step 1: In the first step of the algorithm, the game tree is generated, as shown.
Step 2: The utility values/scores for the terminal states are given.
Step 3: Backtracking
• Suppose the maximizer takes the first turn, which has a worst-case initial value of -infinity,
and the minimizer takes the next turn, which has a worst-case initial value of +infinity.
• Now, we first find the utility values for the Maximizer. Its initial value is -∞, so we
compare each terminal-state value with the Maximizer's initial value and determine the
higher node values; it will find the maximum among them all.
• In the next step it is the minimizer's turn, so it will compare all node values with +∞ and
determine the third-layer node values.
• Now it is the Maximizer's turn: it will again choose the maximum of all node values and find
the maximum value for the root node. In this game tree there are only 4 layers, so we reach
the root node immediately, but in real games there will be many more layers.
For node A max(4, -3)= 4
That was the complete workflow of the minimax two player
game.
[Game-tree figure for the worked example: terminal values 11, 48, 53, 74, 23, 30, 50, 45; the backed-up value at the root is 48.]
Minimax(N):
    if N is a terminal node
        value ← eval(N)
    else if N is a max node
        value ← -infinity
        for each child C of N
            value ← max(value, Minimax(C))
    else
        value ← +infinity
        for each child C of N
            value ← min(value, Minimax(C))
    return value
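The pseudocode above translates almost line for line into runnable Python. In this sketch a node is assumed to be either a number (a terminal evaluation) or a list of child nodes; the nesting of the example tree is reconstructed from the leaf values in the figure and is an assumption.

def minimax(node, is_max):
    if isinstance(node, (int, float)):      # terminal node: value = eval(N)
        return node
    if is_max:                              # max node: best value over children
        value = float("-inf")
        for child in node:
            value = max(value, minimax(child, False))
        return value
    value = float("inf")                    # min node: worst value over children
    for child in node:
        value = min(value, minimax(child, True))
    return value

# Leaf values from the example figure: 11, 48, 53, 74, 23, 30, 50, 45.
tree = [[[11, 48], [53, 74]], [[23, 30], [50, 45]]]
print(minimax(tree, True))                  # 48, matching the root value in the figure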
Example 1 The figure shows a game tree with evaluations W (win), L (loss) and D (draw) from Max's
perspective. In this game tree the labels P, Q, R, S, T indicate strategies/moves at the root.
[Game-tree figure: the leaf evaluations at the horizon are W, L, and D, from Max's perspective.]
1. What is the outcome (W, D or L) of the game when both players play perfectly?
2. Which of the moves P, Q, R, S, T are the best moves for Max?
3. Which of the moves P, Q, R, S, T are the best moves for MIN?
Example 2: The figure shows a 4-ply game tree with evaluation function values at the horizon. The nodes in the
horizon are assigned reference numbers A,B,C,...,P.
[Figure: a worked minimax example on a number-picking game. Player 1 and Player 2 alternately pick a number from either end of the list [1, 5, 233, 7], and the scores are backed up the game tree level by level.]
Score calculation (back-propagation): the final difference between Player 1's and Player 2's scores is used;
Player 1 Score - Player 2 Score determines the value of each leaf node.
Winning strategy (from the figure): starting from [1, 5, 233, 7], Player 1's first move is Pick(7).
Time complexity: O(b^d), where b is the branching factor and d is the depth (number of plies) of the tree.
Space complexity: O(bd), where b is the branching factor and d is the maximum depth of the tree (similar to DFS).
Alpha-Beta Pruning
• Alpha-Beta Pruning is an optimization technique for the Minimax algorithm used in
decision-making and game theory.
• It helps reduce the number of nodes that need to be evaluated in a game tree, making
Minimax more efficient without affecting the final decision.
• In the minimax search algorithm, the number of game states that have to be examined is
exponential in the depth of the tree.
• Alpha-beta pruning is a modified version of the minimax algorithm.
• We cannot eliminate the exponent, but we can effectively cut it in half: there is a
technique by which we can compute the correct minimax decision without checking every node
of the game tree, and this technique is called pruning.
• Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes not only
the tree leaves but also entire sub-trees.
Alpha-Beta Pruning
• Alpha-beta pruning applied to a standard minimax algorithm returns the same move as the standard
algorithm does, but it removes all the nodes that do not really affect the final decision and only
slow the algorithm down. By pruning these nodes, it makes the algorithm fast.
This involves two threshold parameters, Alpha and Beta, for future expansion, so it is called
alpha-beta pruning. It is also called the Alpha-Beta Algorithm.
The two parameters can be defined as:
• Alpha: The best (highest-value) choice we have found so far at any point along the path of
Maximizer. The initial value of alpha is -∞.
• Beta: The best (lowest-value) choice we have found so far at any point along the path of
Minimizer. The initial value of beta is +∞.
• At the first step, the Max player will start the first move from node A, where α = -∞ and β = +∞.
These values of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and
node B passes the same values to its child D.
• At node D, the value of α will be calculated, as it is Max's turn. The value of α is compared
first with 2 and then with 3, and max(-∞, 2, 3) = 3 will be the value of α at node D; the node
value will also be 3.
• Now the algorithm backtracks to node B, where the value of β will change, as this is Min's turn.
Now β = +∞ is compared with the available successor node's value, i.e. min(+∞, 3) = 3; hence at
node B now α = -∞ and β = 3.
• In the next step, the algorithm traverses the next successor of node B, which is node E, and the
values α = -∞ and β = 3 are passed down.
• At node E, Max will take its turn, and the value of alpha will change. The current value of alpha
is compared with 5, so max(-∞, 5) = 5; hence at node E α = 5 and β = 3, where α >= β, so the
right successor of E is pruned and the algorithm does not traverse it. The value at node E will be 5.
• At the next step, the algorithm again backtracks the tree, from node B to node A. At node A, the
value of alpha is changed to the maximum available value 3, as max(-∞, 3) = 3, and β = +∞. These
two values are now passed to the right successor of A, which is node C.
• At node C, α = 3 and β = +∞, and the same values are passed on to node F.
• At node F, the value of α is again compared with the left child, which is 0, so max(3, 0) = 3,
and then with the right child, which is 1, so max(3, 1) = 3; α remains 3, but the node value of F
becomes 1.
• Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta is
changed: it is compared with 1, so min(+∞, 1) = 1. Now at C, α = 3 and β = 1, and again the
condition α >= β is satisfied, so the next child of C, which is G, is pruned and the algorithm
does not compute the entire sub-tree G.
• C now returns the value 1 to A, and the best value for A is max(3, 1) = 3. The final game tree
shows the nodes which were computed and the nodes which were never computed. Hence the optimal
value for the maximizer is 3 for this example.
Example 1: When AlphaBeta(root,-INF,+INF) is invoked, it passes the (alpha,beta) bounds to descendants,
where the bounds are updated with values received from horizon nodes.
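The following is a runnable Python sketch of the AlphaBeta procedure described above. A node is assumed to be either a numeric horizon value or a list of children; the tree mirrors the worked example (B's subtrees have leaves 2, 3 and 5, C's subtrees have leaves 0, 1 and a pruned branch), and the values inside the pruned branches are arbitrary placeholders.

def alphabeta(node, alpha, beta, is_max):
    if isinstance(node, (int, float)):       # horizon node: return its evaluation
        return node
    if is_max:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)        # best (highest) choice for MAX so far
            if alpha >= beta:                # cut-off: MIN will never allow this branch
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)              # best (lowest) choice for MIN so far
        if alpha >= beta:                    # cut-off: MAX will never allow this branch
            break
    return value

tree = [[[2, 3], [5, 9]], [[0, 1], [7, 6]]]  # 9, 7, 6 sit in branches that get pruned
print(alphabeta(tree, float("-inf"), float("inf"), True))   # 3, the optimal value for MAX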
• A wide variety of methods, including adversarial search and local search, are used to address
different kinds of problems.
• Every problem-solving method has a single purpose in mind: to find a solution that enables the
goal to be achieved.
• However, in adversarial search and local search there were no restrictions on how the agent
could reach its answers.
• This section examines constraint satisfaction, another kind of problem-solving method.
• As its name implies, constraint satisfaction means that a problem must be solved while adhering
to a set of restrictions or guidelines.
Constraint Satisfaction Problem (CSP) deals with solving problems by identifying constraints and
finding solutions that satisfy those constraints.
Significance of Constraint Satisfaction Problem in AI
Domain
• Domains describe the variety of possible values that a variable might have.
• A domain may be finite or limitless, depending on the problem.
• For example, in Sudoku, a variable that represents a puzzle cell can have as its domain a
range of values from 1 to 9.
• It is denoted by “D”. Domains can be finite, like {1, 2, 3}, or continuous, such as real
numbers between 0 and 1.
Key Elements of CSPs
Constraints
• Constraints are the rules that control how variables interact with one another.
• The ranges of acceptable values for variables are determined by constraints in a CSP.
• The different types of constraints include unary constraints, binary constraints, and
higher-order constraints, to mention a few.
• For example, in a sudoku puzzle, the limitations might be that only one of each
number from 1 to 9 can appear in each row, column, and 3*3 boxes
• Constraints can be expressed in various ways, such as equations, inequalities, or logical
expressions.
A constraint satisfaction problem is defined by three components:
X: a set of variables.
D: a set of domains, one for each variable; every variable has its own domain.
C: a set of constraints that the variables must satisfy.
Constraints
Unary Constraints
• Unary constraints limit the possible values of a single variable without considering the
values of other variables.
• It is the easiest constraint to find, as it has only one parameter. Example: The
expression X1 ≠ 7 says that the variable X1 cannot have the value 7.
Binary Constraints
• Binary constraints describe the relationship between two variables and consist of only
two variables.
• Example: X1< X2 indicates that X1 must be less than X2 in order to be true.
Constraints
Global Constraints
• In contrast to unary or binary constraints, global constraints involve multiple variables
and impose a more complex relationship or restriction between them.
• Global constraints are often used in CSP problems to capture higher-level patterns,
structures, or rules.
• These restrictions can apply to any number of variables at once and are not limited to
pairwise interactions.
Alldifferent Constraint
• The Alldifferent constraint (AllDiff) requires that each variable in a set of variables has a
unique value.
• You commonly apply alldifferent constraints, when you want to be sure that no two
variables in a set can take the same value.
• Example: The expression alldifferent(X1, X2, X3) ensures that the values of X1, X2, and
X3 must be unique.
Sum Constraint
• The Sum Constraint requires that the sum of the values assigned to a group of variables
meet a particular requirement.
• It is useful for expressing restrictions like “the sum of these variables should equal a
certain value.”
• Example: The expression Sum(X1, X2, X3) = 15 demands that the sum of the values for
X1, X2, and X3 be 15.
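As an illustration (not part of the slides), the constraint types above can be written as simple Python predicates over an assignment, i.e. a dictionary mapping variables to values.

def unary_not_7(a):                     # Unary:   X1 != 7
    return a["X1"] != 7

def binary_less(a):                     # Binary:  X1 < X2
    return a["X1"] < a["X2"]

def alldifferent(a, *names):            # Global:  alldifferent(X1, X2, X3)
    values = [a[n] for n in names]
    return len(values) == len(set(values))

def sum_equals(a, target, *names):      # Global:  Sum(X1, X2, X3) = 15
    return sum(a[n] for n in names) == target

assignment = {"X1": 4, "X2": 5, "X3": 6}
print(unary_not_7(assignment),
      binary_less(assignment),
      alldifferent(assignment, "X1", "X2", "X3"),
      sum_equals(assignment, 15, "X1", "X2", "X3"))    # True True True True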
Domain Categories in CSP
In Constraint Satisfaction Problems (CSPs), domain categories refer to the set of possible values that
can be assigned to each variable in the problem.
The specific categories or domains can vary depending on the nature of the CSP, but here are some
common domain categories:
Finite Domain: Variables in many CSPs have finite domains that are made up of discrete values.
Examples comprise:
• Binary Domains: Domains that only have two values (for binary CSPs, this would be 0 and 1).
• Integer Domains: Domains made up of a limited number of integer values, such as 1, 2, 3, and 4, are
known as integer domains.
• Enumeration Domains: Domains containing a limited number of distinct values, such as “red, green,
and blue” in an issue involving color assignment.
Continuous Domains: Some CSPs contain variables whose domains are continuous, i.e., they can accept any real
number falling within a given range.
Examples comprise:
• Real-valued Domains: Variables may accept any real number that falls within a given range (for example,
X ∈ [0, 1]).
• Interval Domains: Variables are limited to a specific range of real values (e.g., X ∈ [−π, π]).
Algorithms in CSP
Constraint Satisfaction Problems (CSPs) are typically solved using various algorithms designed to
find a consistent assignment of values to variables that satisfies all the constraints.
Some of the common algorithms used for solving CSPs include:
The Backtracking Algorithm
• The backtracking algorithm is a popular method for resolving CSPs.
• It looks for the search space by picking a variable, setting a value for it, and then recursively
scanning through the other variables.
• In the event of a conflict, it goes back and tries a different value for the preceding variable.
Forward Checking
• The backtracking technique has been improved using forward checking.
• It tracks the remaining accurate values of the unassigned variables after each assignment and
reduces the domains of variables whose values don’t match the assigned ones.
• As a result, the search space is smaller, and constraint propagation is more effectively
accomplished.
Constraint Propagation
• Constraint propagation techniques reduce the search space by removing values inconsistent with
current assignments through local consistency checks.
• To do this, techniques like generalized arc consistency and path consistency are applied.
Real-World Examples of CSPs
To illustrate CSPs, consider the following examples:
•Sudoku Puzzles: In Sudoku, the variables are the empty cells, the domains are numbers from 1 to 9,
and the constraints ensure that no number is repeated in a row, column, or 3x3 subgrid.
•Scheduling Problems: In university course scheduling, variables might represent classes, domains
represent time slots, and constraints ensure that classes with overlapping students or instructors cannot
be scheduled simultaneously.
•Map Coloring: In the map coloring problem, variables represent regions or countries, domains
represent available colors, and constraints ensure that adjacent regions must have different colors.
These examples demonstrate how CSPs provide a framework for modeling and solving problems that
require satisfying various conditions and limitations, making them a fundamental tool in AI and
operations research.
Example: Formulate the map coloring problem for the map of Australia, shown below
X: {WA, NT, SA, Q, NSW, V, T}, where each variable represents a state or territory of Australia.
D: {red, green, blue}, where each variable has the same domain of three colors.
C: {< (WA, NT), WA != NT >, < (WA, SA), WA != SA >, < (NT, SA), NT != SA >, < (NT, Q), NT != Q >,
< (SA, Q), SA != Q >, < (SA, NSW), SA != NSW >, < (SA, V), SA != V >, < (Q, NSW), Q != NSW >,
< (NSW, V), NSW != V >}. Each constraint is a binary constraint that states that two
adjacent regions must have different colors.
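One possible way to encode this formulation as plain Python data (the layout is an assumption for illustration):

variables = ["WA", "NT", "SA", "Q", "NSW", "V", "T"]
domains = {v: {"red", "green", "blue"} for v in variables}
constraints = [                      # each adjacent pair must get different colors
    ("WA", "NT"), ("WA", "SA"), ("NT", "SA"), ("NT", "Q"),
    ("SA", "Q"), ("SA", "NSW"), ("SA", "V"), ("Q", "NSW"), ("NSW", "V"),
]

def consistent(assignment):
    # True if no constraint is violated by the (possibly partial) assignment.
    return all(assignment[a] != assignment[b]
               for a, b in constraints
               if a in assignment and b in assignment)

print(consistent({"WA": "red", "NT": "green", "SA": "blue"}))   # True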
Backtracking Algorithm
A recursive depth-first search that tries to assign values to variables one by one and backtracks
if a conflict is found.
It is a systematic search algorithm that explores possible assignments for variables,
backtracking when it encounters constraints that cannot be satisfied.
Combined with forward checking, this method is more efficient than pure backtracking because it
prevents some conflicts before they happen, reducing unnecessary computations.
The algorithm starts with an empty assignment and selects the first variable to assign.
1) According to the MRV heuristic (using the degree heuristic as a tie-breaker, since all domains
are initially equal), the variable selected first is SA, as it has the most neighbors (five).
The algorithm then tries to assign a value to SA and, according to the LCV heuristic, it chooses
blue (every color is equally constraining at this point).
The algorithm then uses forward checking to prune the domains of the neighboring variables and
updates the domains as follows:
• WA: {green, red}, NT: {green, red}, Q: {green, red}, NSW: {green, red}, V: {green, red}, T: {red, green, blue}
2) The algorithm then recurses to the next level and selects the next variable to assign.
According to the MRV heuristic, the variable selected is NT (its domain has two values left and it has three
neighbors).
The algorithm then tries to assign a value to NT and, according to the LCV heuristic, it chooses
green, as it is the least constraining value for the neighboring variables.
The algorithm then uses forward checking to prune the domains of the neighboring variables and
updates the domains as follows:
WA: {red}, Q: {red}, NSW: {green, red}, V: {green, red}, T: {red, green, blue}
3) The algorithm then recurses to the next level and selects the next variable to assign.
According to the MRV heuristic, the variable selected is Q, as it also has three
neighbors and only one value left.
The algorithm then assigns red to Q, as it is the only remaining value for Q.
The algorithm then uses forward checking to prune the domains of the neighboring variables and
updates the domains as follows:
WA: {red}, NSW: {green}, V: {green, red}, T: {red, green, blue}
and so on….
Final coloring
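For completeness, here is a compact, runnable sketch of the search described above: backtracking with MRV (degree as tie-breaker), LCV, and forward checking on the Australia map. Heuristic details such as tie-breaking are simplifications, so this is illustrative rather than a definitive implementation.

NEIGHBORS = {
    "WA": {"NT", "SA"}, "NT": {"WA", "SA", "Q"},
    "SA": {"WA", "NT", "Q", "NSW", "V"},
    "Q": {"NT", "SA", "NSW"}, "NSW": {"SA", "Q", "V"}, "V": {"SA", "NSW"}, "T": set(),
}
COLORS = ["red", "green", "blue"]

def backtrack(assignment, domains):
    if len(assignment) == len(NEIGHBORS):
        return assignment
    # MRV: fewest remaining values; ties broken by degree (most unassigned neighbors).
    unassigned = [v for v in NEIGHBORS if v not in assignment]
    var = min(unassigned,
              key=lambda v: (len(domains[v]),
                             -len([n for n in NEIGHBORS[v] if n not in assignment])))
    # LCV: try the value that rules out the fewest choices for the neighbors.
    def conflicts(value):
        return sum(value in domains[n] for n in NEIGHBORS[var] if n not in assignment)
    for value in sorted(domains[var], key=conflicts):
        # Forward checking: remove this value from unassigned neighbors' domains.
        pruned = {n: domains[n] - {value}
                  for n in NEIGHBORS[var] if n not in assignment}
        if all(pruned.values()):                       # no neighbor domain wiped out
            result = backtrack({**assignment, var: value},
                               {**domains, **pruned, var: {value}})
            if result is not None:
                return result
    return None                                        # dead end: backtrack

print(backtrack({}, {v: set(COLORS) for v in NEIGHBORS}))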
Example for usefulness of forward checking
Forward checking is based on the idea that once variable X_i is assigned a value v, certain
future variable-value pairs (X_j, v') become impossible.
Starting with WA and proceeding without applying any heuristics, we can reach a point where SA has no color options left.
Constraint Propagation: Reducing the domain of variables based on constraint compliance is
known as constraint propagation.
• Extends forward checking by spreading the effect of constraints beyond immediate neighbors,
ensuring a more global reduction of the search space.
• Constraints are propagated between related variables.
• Inconsistent values are eliminated from variable domains by leveraging information gained from other
variables.
• These algorithms refine the search space by making inferences, removing values that would lead to
conflicts.
Steps in Constraint Propagation for Map Coloring
1. Initial Setup: Each region starts with a set of available colors.
2. Assign a Color: When a region is assigned a color, constraint propagation updates the domains of its
neighboring regions.
3. Eliminate Conflicting Colors: The chosen color is removed from the available colors of adjacent regions and
further propagate this restriction recursively.
4. Use techniques like Arc Consistency (AC-3) to ensure all remaining uncolored regions still have at least one
valid color.
5. Propagate Constraints: If a neighboring region now has only one available color, it must be assigned that
color, further restricting other regions.
6. Repeat Until No More Reductions: Continue eliminating invalid options until no more values can be pruned.
NT and SA cannot both be blue!
Constraint propagation repeatedly enforces
constraints locally
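The sketch below illustrates the AC-3 procedure mentioned in step 4, specialized to the not-equal constraints of map coloring (an illustrative assumption, not the slides' code). With WA = red and Q = green already propagated, it detects that NT and SA cannot both be blue.

from collections import deque

def ac3(domains, neighbors):
    # Prune domains until every arc (Xi, Xj) is consistent; return False on a wipe-out.
    queue = deque((xi, xj) for xi in neighbors for xj in neighbors[xi])
    while queue:
        xi, xj = queue.popleft()
        # Remove values of Xi that have no supporting value in Xj (constraint Xi != Xj).
        removed = {v for v in domains[xi] if not any(v != w for w in domains[xj])}
        if removed:
            domains[xi] -= removed
            if not domains[xi]:
                return False
            queue.extend((xk, xi) for xk in neighbors[xi] if xk != xj)
    return True

neighbors = {"WA": {"NT", "SA"}, "NT": {"WA", "SA", "Q"},
             "SA": {"WA", "NT", "Q"}, "Q": {"NT", "SA"}}
domains = {"WA": {"red"}, "Q": {"green"},
           "NT": {"red", "green", "blue"}, "SA": {"red", "green", "blue"}}
print(ac3(domains, neighbors))   # False: NT and SA would both be forced to blue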
Example: 4-Queens Problem
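The 4-Queens figure is not reproduced here. As a CSP it has one variable per column (the row of that column's queen), domain {1, 2, 3, 4}, and constraints that no two queens share a row or a diagonal; the short sketch below simply enumerates the solutions.

from itertools import permutations

def safe(rows):
    # rows[c] is the row of the queen in column c; permutations already give
    # distinct rows, so only the diagonal constraints remain to be checked.
    return all(abs(rows[i] - rows[j]) != j - i
               for i in range(len(rows)) for j in range(i + 1, len(rows)))

print([rows for rows in permutations(range(1, 5)) if safe(rows)])
# [(2, 4, 1, 3), (3, 1, 4, 2)]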
Cryptarithmetic puzzles
Solve the cryptarithmetic problem shown in the figure using the strategy of backtracking with forward
checking and the MRV and least-constraining-value heuristics.
A cryptarithmetic problem. Each letter stands for a distinct digit; the aim is to find a substitution of
digits for letters such that the resulting sum is arithmetically correct, with the added restriction that no
leading zeroes are allowed.
Alldiff (F,T,U,W,R,O)
C_1, C_2, and C_3 are auxiliary variables representing the digit carried over into the tens, hundreds, or
thousands column. The carries can take the values {0,1}
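The formulation above corresponds to TWO + TWO = FOUR. A systematic backtracking search with MRV/LCV and the carry variables C_1, C_2, C_3 would examine far fewer assignments, but the brute-force sketch below (an illustration, not the intended solution method) shows what a satisfying substitution looks like.

from itertools import permutations

for T, W, O, F, U, R in permutations(range(10), 6):
    if T == 0 or F == 0:                    # no leading zeroes allowed
        continue
    two = 100 * T + 10 * W + O
    four = 1000 * F + 100 * O + 10 * U + R
    if two + two == four:
        print(f"{two} + {two} = {four}")    # one valid substitution, e.g. 734 + 734 = 1468
        break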
Consistent or Legal Assignment: An assignment is called consistent or legal if it does not violate
any constraints.
Complete Assignment: An assignment in which every variable is given a value; if it is also
consistent, it is a solution of the CSP.
Partial Assignment: An assignment that gives values to only some of the variables; such assignments
are also called incomplete assignments.
INTELLIGENT AGENTS
• Adopt the view that intelligence is concerned mainly with rational action.
• Ideally, an intelligent agent takes the best possible action in a situation.
• We study the problem of building agents that are intelligent in this sense.
• The concept of rationality can be applied to a wide variety of agents operating in any
imaginable environment.
AGENTS AND ENVIRONMENTS
An agent is anything that can be
viewed as perceiving its environment
through sensors and acting upon that
environment through actuators.
• A human agent has eyes, ears, and other organs for sensors and hands, legs, vocal tract, and
so on for actuators.
• A robotic agent might have cameras and infrared range finders for sensors and various
motors for actuators.
• A software agent receives keystrokes, file contents, and network packets as sensory inputs
and acts on the environment by displaying on the screen, writing files, and sending
network packets.
Percept
• The term percept refers to the agent’s perceptual inputs at any given instant.
• An agent’s percept sequence is the complete history of everything the agent has ever
perceived.
• An agent’s choice of action at any given instant can depend on the entire percept
sequence observed to date, but not on anything it hasn’t perceived.
• The various vacuum-world agents can be defined simply by filling in the right-hand
column in various ways.
• The obvious question, then, is this:
• What is the right way to fill out the table?
• In other words, what makes an agent good or bad, intelligent or stupid?
Agent Function and Agent Program
• Mathematically, an agent’s behavior is described by the agent function that maps
any given percept sequence to an action: [f: P* → A]
• Tabulating the agent function that describes any given agent would, for most agents,
yield a very large table—infinite, in fact, unless we place a bound on the length of
percept sequences we want to consider.
• Given an agent to experiment with, we can construct this table by trying out all possible
percept sequences and recording which actions the agent does in response.
• The table is, of course, an external characterization of the agent.
• Internally, the agent function for an artificial agent will be implemented by an
agent program.
• The agent program runs on the physical architecture to produce f:
Agent = architecture + program
• It is important to keep these two ideas distinct: the agent function is an abstract
mathematical description; the agent program is a concrete implementation, running within
some physical system.
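A small sketch of this distinction (percept and action names are assumptions for illustration): the agent function is the unbounded table, while the agent program is the finite code that produces the same behavior.

percepts = []                 # the percept sequence observed so far

# A few rows of the tabulated agent function for a vacuum-world agent.
table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("B", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("A", "Dirty"), ("A", "Clean")): "Right",
    # ... the complete table is unbounded unless percept sequences are bounded.
}

def table_driven_agent(percept):
    # Agent program that implements the agent function by looking up the whole history.
    percepts.append(percept)
    return table.get(tuple(percepts))

print(table_driven_agent(("A", "Dirty")))   # Suck
print(table_driven_agent(("A", "Clean")))   # Right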
Performance Measure
Good Behavior: The Concept Of Rationality
• A rational agent is one that does the right thing—every entry in the table for
the agent function is filled out correctly.
• Obviously, doing the right thing is better than doing the wrong thing, but
what does it mean to do the right thing?
• By considering the consequences of the agent’s behavior.
Omniscience
• The state of knowing everything
• An omniscient agent knows the actual outcome of its actions and can act accordingly; but
omniscience is impossible in reality.
Learning
• Doing actions in order to modify future percepts—sometimes called information
gathering
• A rational agent should not only gather information but also learn as much as possible from
what it perceives.
• The agent’s initial configuration could reflect some prior knowledge of the environment,
but as the agent gains experience this may be modified and augmented.
• There are extreme cases in which the environment is completely known a priori.
• In such cases, the agent need not perceive or learn; it simply acts correctly.
• Such agents are fragile.
• Successful agents split the task of computing the agent function into three different
periods:
• When the agent is being designed, some of the computation is done by its designers;
when it is deliberating on its next action, the agent does more computation; and as it
learns from its experience, it does even more computation to decide how to modify its
behavior.
Autonomy
• If an agent relies on the prior knowledge of its designer rather than on its own percepts,
we say that the agent lacks autonomy.
• A rational agent should be autonomous—it should learn what it can to compensate for
partial or incorrect prior knowledge.
• An agent seldom requires complete autonomy from the start: when the agent has had little or
no experience, it would have to act randomly unless the designer gave some assistance.
• It would be reasonable to provide an artificial intelligent agent with some initial
knowledge as well as an ability to learn.
• After sufficient experience of its environment, the behavior of a rational agent can
become effectively independent of its prior knowledge.
• Hence, the incorporation of learning allows one to design a single rational agent that will
succeed in a vast variety of environments.
THE NATURE OF ENVIRONMENTS
• Task environments, which are essentially the “problems” to which rational agents are
the “solutions”.
• Specifying the task environment
• The performance measure, the environment, and the agent’s actuators and sensors are
grouped under the heading of the task environment. Acronymically PEAS (Performance,
Environment, Actuators, Sensors)
• In designing an agent, the first step must always be to specify the task environment as
fully as possible.
• List of agent types includes some programs that operate in the entirely artificial
environment defined by keyboard input and character output on a screen.
• In fact, what matters is not the distinction between “real” and “artificial” environments,
but the complexity of the relationship among the behavior of the agent, the percept
sequence generated by the environment, and the performance measure.
• Some “real” environments are actually quite simple.
• In contrast, some software agents (or software robots or softbots) exist in rich, unlimited
domains.
TYPES OF ENVIRONMENTS
Fully observable vs. partially observable:
• If an agent’s sensors give it access to the complete state of the environment at each
point in time, then we say that the task environment is fully observable.
• A task environment is effectively fully observable if the sensors detect all aspects that
are relevant to the choice of action; relevance, in turn, depends on the performance
measure.
• Fully observable environments are convenient because the agent need not maintain
any internal state to keep track of the world.
• An environment might be partially observable because of noisy and inaccurate sensors
or because parts of the state are simply missing from the sensor data.
• If the agent has no sensors at all then the environment is unobservable.
• One might think that in such cases the agent’s plight is hopeless, but, the agent’s goals
may still be achievable, sometimes with certainty.
Single agent vs. multi agent:
• The distinction between single-agent and multiagent environments may seem simple
enough.
• For example, an agent solving a crossword puzzle by itself is clearly in a single-agent
environment, whereas an agent playing chess is in a two-agent environment.
• Chess is a competitive multiagent environment.
• In the taxi-driving environment, avoiding collisions maximizes the performance
measure of all agents, so it is a partially cooperative multiagent environment.
• It is also partially competitive because, for example, only one car can occupy a parking
space.
• The agent-design problems in multiagent environments are often quite different from
those in single-agent environments; for example, communication often emerges as a
rational behavior in multiagent environments; in some competitive environments,
randomized behavior is rational because it avoids the pitfalls of predictability
• Episodic vs. sequential:
• In an episodic task environment, the agent’s experience is divided into atomic
episodes.
• In each episode the agent receives a percept and then performs a single action.
• Crucially, the next episode does not depend on the actions taken in previous
episodes. Many classification tasks are episodic. For example, an agent that has to
spot defective parts on an assembly line bases each decision on the current part,
regardless of previous decisions; moreover, the current decision doesn’t affect
whether the next part is defective.
• In sequential environments, on the other hand, the current decision could affect all
future decisions.
• Chess and taxi driving are sequential: in both cases, short-term actions can have
long-term consequences.
• Episodic environments are much simpler than sequential environments because the
agent does not need to think ahead.
Static vs. dynamic:
• If the environment can change while an agent is deliberating, then we say the
environment is dynamic for that agent; otherwise, it is static.
• Static environments are easy to deal with because the agent need not keep looking at
the world while it is deciding on an action, nor need it worry about the passage of time.
• Dynamic environments, on the other hand, are continuously asking the agent what it
wants to do; if it hasn’t decided yet, that counts as deciding to do nothing.
• If the environment itself does not change with the passage of time but the agent’s
performance score does, then we say the environment is semi-dynamic.
• Taxi driving is clearly dynamic: the other cars and the taxi itself keep moving while
the driving algorithm dithers about what to do next.
• Chess, when played with a clock, is semi-dynamic.
• Crossword puzzles are static.
Discrete vs. continuous:
• The discrete/continuous distinction applies to the state of the environment, to the way
time is handled, and to the percepts and actions of the agent.
• For example, the chess environment has a finite number of distinct states (excluding the
clock).
• Chess also has a discrete set of percepts and actions.
• Taxi driving is a continuous-state and continuous-time problem: the speed and location
of the taxi and of the other vehicles sweep through a range of continuous values and do
so smoothly over time.
• Taxi-driving actions are also continuous (steering angles, etc.). Input from digital
cameras is discrete, strictly speaking, but is typically treated as representing continuously
varying intensities and locations
Known vs. unknown
• Strictly speaking, this distinction refers not to the environment itself but to the agent’s
(or designer’s) state of knowledge about the “laws of physics” of the environment.
• In a known environment, the outcomes (or outcome probabilities if the environment is
stochastic) for all actions are given.
• Obviously, if the environment is unknown, the agent will have to learn how it works in
order to make good decisions.
• Note that the distinction between known and unknown environments is not the same as
the one between fully and partially observable environments.
• It is quite possible for a known environment to be partially observable—for example, in
solitaire card games, we know the rules but are still unable to see the cards that have not yet
been turned over.
• Conversely, an unknown environment can be fully observable—in a new video game,
the screen may show the entire game state, but we still don’t know what the buttons do until
we try them.
THE STRUCTURE OF AGENTS
• We can describe agents by their behavior—the action that is performed after any given sequence
of percepts.
• The job of AI is to design an agent program that implements the agent function— the
mapping from percepts to actions.
• We assume this program will run on some sort of computing device with physical
sensors and actuators—we call this the architecture
agent = architecture + program .
• Obviously, the program chosen has to be one that is appropriate for the architecture.
• If the program is going to recommend actions like Walk, the architecture had better
have legs. The architecture might be just an ordinary PC, or it might be a robotic car
with several onboard computers, cameras, and other sensors.
• In general, the architecture makes the percepts from the sensors available to the
program, runs the program, and feeds the program’s action choices to the actuators as
they are generated.
Agent programs
• Four basic kinds of agent programs that embody the principles underlying almost all
intelligent systems:
• Simple reflex agents
• Model-based reflex agents
• Goal-based agents
• Utility-based agents
• The agent programs take the current percept as input from the sensors and return an action
to the actuators.
• Note the difference from the agent function, which takes the entire percept history: the agent
program takes just the current percept as input because nothing more is available from the environment;
• if the agent’s actions need to depend on the entire percept sequence, the agent will have to
remember the percepts.
• Each kind of agent program combines particular components in particular ways to generate
actions
Simple reflex agents
• The simplest kind of agent is the simple reflex agent. These agents select actions on the
basis of the current percept, ignoring the rest of the percept history.
• For example, the vacuum agent is a simple reflex agent, because its decision is based only on the
current location and on whether that location contains dirt (a small sketch of such an agent
appears after this list).
• Notice that the vacuum agent program is very small indeed compared to the
corresponding table.
• The most obvious reduction comes from ignoring the percept history, which cuts down the
number of possibilities from 4^T to just 4.
• A further, small reduction comes from the fact that when the current square is dirty, the
action does not depend on the location.
• Simple reflex behaviors occur even in more complex environments.
• A simple reflex agent is the runt of the litter.
• It has very limited intelligence and operates on a direct condition-action rule.
• These rule-based agents aren’t suited for complex tasks. However, they’re perfectly adept at the
specific tasks they’re designed for.
• Simple reflex agents are suited for straightforward tasks in a predictable environment. This kind of
agent’s actions affect the world around it, but only in specific tasks.
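As referenced above, here is a minimal sketch of a simple reflex agent for the vacuum world (percept and action names are assumptions): it looks only at the current percept and applies condition-action rules, ignoring the percept history.

def reflex_vacuum_agent(percept):
    location, status = percept          # current percept only, no history
    if status == "Dirty":               # condition-action rules
        return "Suck"
    return "Right" if location == "A" else "Left"

print(reflex_vacuum_agent(("A", "Dirty")))   # Suck
print(reflex_vacuum_agent(("B", "Clean")))   # Left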
Thermostats
It’s 6pm in the winter? Crank that heat up. It’s noon in the summer? This simple reflex agent, with its
limited intelligence, will turn on the AC.
Automatic doors
While its perceived intelligence is low, automatic doors are often examples of simple reflex agents. This
AI agent senses a human in front of a door, and it opens. Beautifully simple.
Smoke detectors
This AI agent operates from your kitchen ceiling. Yep, it’s a simple reflex agent, too. It senses smoke,
and it sounds an alarm.
Basic spam filters
Some agents in artificial intelligence have been helping us daily for years. The email spam filter is one
of these. Basic versions don’t use natural language processing, but rather keywords or the sender’s
reputation.
Model-based reflex agents
• The most effective way to handle partial observability is for the agent to keep track of the
part of the world it can’t see now. That is, the agent should maintain some sort of internal
state that depends on the percept history and thereby reflects at least some of the
unobserved aspects of the current state.
• For the braking problem, the internal state is not too extensive: just the previous frame from
the camera, allowing the agent to detect when two red lights at the edge of the vehicle go
on or off simultaneously.
• For other driving tasks such as changing lanes, the agent needs to keep track of where the
other cars are if it can’t see them all at once. And for any driving to be possible at all, the
agent needs to keep track of where its keys are.
• Updating this internal state information as time goes by requires two kinds of knowledge
to be encoded in the agent program.
• First, we need some information about how the world evolves independently of the agent
—for example, that an overtaking car generally will be closer behind than it was a moment
ago.
• Second, we need some information about how the agent’s own actions affect the world—
for example, that when the agent turns the steering wheel clockwise, the car turns to the
right, or that after driving for five minutes northbound on the freeway, one is usually about
five miles north of where one was five minutes ago.
• This knowledge about “how the world works”—whether implemented in simple Boolean
circuits or in complete scientific theories—is called a model of the world. An agent that
uses such a model is called a model-based agent (a structural sketch appears at the end of this list).
• When you need to adapt to information that isn’t always visible or predictable, model-based
reflex agents are the tool to use.
• Unlike simple reflex agents that react solely based on current perceptions, model-based reflex
agents maintain an internal state that allows them to predict partially observable environments.
This is an internal model of the section of the world relevant to their duties.
• This model is constantly updated with incoming data from their environment, so that the AI
agent can make inferences about unseen parts of the environment and anticipate future
conditions.
• They assess the potential outcomes of their actions before making decisions, allowing them to
handle complications. This is especially useful when doing complex tasks, like driving a car in
a city, or managing an automated smart home system.
• Because of their ability to combine past knowledge and real-time data, model-based reflex
agents can optimize their performance, no matter the task. Like a human, they can make
context-aware decisions, even when the conditions are unpredictable.
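The structural sketch referenced earlier in this list is shown below. The "model" here is only a placeholder that remembers the latest percept; the class and rule names are illustrative assumptions.

class ModelBasedReflexAgent:
    def __init__(self):
        self.state = {}                  # internal state: best guess about the world

    def update_state(self, percept):
        # "How the world evolves" and "what my actions do" would be encoded here;
        # this placeholder simply merges the newest percept into the state.
        self.state.update(percept)

    def rule_match(self):
        # Condition-action rules applied to the inferred state, not the raw percept.
        if self.state.get("car_ahead_braking"):
            return "initiate_braking"
        return "keep_driving"

    def __call__(self, percept):
        self.update_state(percept)
        return self.rule_match()

agent = ModelBasedReflexAgent()
print(agent({"car_ahead_braking": True}))    # initiate_braking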
Autonomous Vehicles
Even though these cars span multiple types of intelligent agents, they’re a good example of model-
based reflex agents.
Complex systems like traffic and pedestrian movements are exactly the kind of challenge that model-
based reflex agents are designed for.
Their internal model is used to make real-time decisions on the road, like braking when another car
runs a red light, or slowing down rapidly when the car ahead does the same. Their internal system is
constantly updating based on their environmental inputs: other cars, activity at crosswalks, the
weather.
Modern irrigation systems
Model-based reflex agents are the powerhouse behind modern irrigation systems. Their ability to
respond to unexpected environmental feedback is perfectly suited for weather and soil moisture levels.
The AI agent’s internal model represents and predicts various environmental factors, like soil moisture
levels, weather conditions, and plant water requirements.
These agents continuously collect data from sensors in their fields, including real-time information on
humidity, temperature, and precipitation.
By analyzing this data, the model-based reflex agent can make informed decisions about when to
water, how much water to dispense, and which zones of a field require more attention. This predictive
capability allows the irrigation system to optimize water usage, ensuring that plants receive exactly
what they need to thrive (without wasting water).
Home automation systems
The internal model here is that of a home’s environment – these systems are continuously updated with
data from sensors, and use this information to inform their decisions.
A thermostat will detect changing temperatures and configure as needed. Or a lighting system might
detect darkness outdoors and adjust accordingly – since this darkness might come from nighttime, or
from an unexpected thunderstorm, it requires an intelligent agent to both anticipate and react to
differences.
Goal-based agents
• Knowing something about the current state of the environment is not always enough to
decide what to do.
• For example, at a road junction, the taxi can turn left, turn right, or go straight on.
• The correct decision depends on where the taxi is trying to get to.
• In other words, as well as a current state description, the agent needs some sort of goal
information that describes situations that are desirable—for example, being at the
passenger’s destination.
• The agent program can combine this with the model (the same information as was used in
the model based reflex agent) to choose actions that achieve the goal.
• Sometimes goal-based action selection is straightforward—
for example, when goal satisfaction results immediately
from a single action.
• Sometimes it will be more tricky—for example, when the
agent has to consider long sequences of twists and turns in
order to find a way to achieve the goal.
• Search and planning are the subfields of AI devoted to
finding action sequences that achieve the agent’s goals.
• Goal-based AI agents are designed to achieve specific goals with artificial intelligence.
• Instead of just responding to stimuli, these rational agents are capable of considering the future
consequences of their actions, so they can make strategic decisions to reach their goals.
• Unlike simple reflex agents, which respond directly to stimuli based on condition-action rules, goal-
based agents evaluate and plan actions to meet their goals.
• What makes them distinct from other types of intelligent agents is their ability to combine foresight
and strategic planning to navigate towards specific outcomes.
Roomba- Robotic vacuum cleaners are designed with a specific goal: clean all accessible floor space.
This goal-based agent has a simple goal, and it does it well.
All the decisions made by this goal-based agent (like when to rotate) are made in pursuit of this lofty
goal. The cats that sit on top of them are just a bonus.
Project Management Software
While it may also use a utility-based agent, project management software usually focuses on achieving
a specific project objective.
These AI agents will often schedule tasks and allocate resources so that a team is optimized to complete
a project on time. The agent evaluates the most likely course of success and actions it on behalf of a
team.
Video Game AI
In strategy and role-playing games, AI characters act as goal-based agents – their objectives might
range from defending a location to defeating an opponent.
These dolled-up AI agents consider a variety of strategies and resources – which attack to use, which
power-up to burn – so that they can achieve their goal.
Utility-based agents
• Goals alone are not enough to generate high-quality behavior in most environments.
• For example, many action sequences will get the taxi to its destination (thereby achieving
the goal) but some are quicker, safer, more reliable, or cheaper than others.
• Goals just provide a crude binary distinction between “happy” and “unhappy” states.
• A more general performance measure should allow a comparison of different world states
according to exactly how happy they would make the agent.
• Because “happy” does not sound very scientific, economists and computer scientists use
the term utility instead
• Performance measure assigns a score to any given
sequence of environment states, so it can easily distinguish
between more and less desirable ways of getting to the
taxi’s destination.
• An agent’s utility function is essentially an internalization
of the performance measure. If the internal utility function
and the external performance measure are in agreement,
then an agent that chooses actions to maximize its utility
will be rational according to the external performance
measure.
• Like goal-based agents, a utility-based agent has many advantages in terms of flexibility and learning.
• Furthermore, in two kinds of cases, goals are inadequate but a utility-based agent can still
make rational decisions.
• First, when there are conflicting goals, only some of which can be achieved (for
example, speed and safety), the utility function specifies the appropriate tradeoff.
• Second, when there are several goals that the agent can aim for, none of which can
be achieved with certainty, utility provides a way in which the likelihood of success can
be weighed against the importance of the goals.
• Partial observability and stochasticity are ubiquitous in the real world, and so, therefore, is
decision making under uncertainty.
• Technically speaking, a rational utility-based agent chooses the action that maximizes the
expected utility of the action outcomes—that is, the utility the agent expects to derive, on
average, given the probabilities and utilities of each outcome (a small numeric sketch appears
at the end of this list).
• Any rational agent must behave as if it possesses a utility function whose expected value
it tries to maximize.
• An agent that possesses an explicit utility function can make rational decisions with a
general-purpose algorithm that does not depend on the specific utility function being
maximized.
• In this way, the “global” definition of rationality—designating as rational those agent
functions that have the highest performance—is turned into a “local” constraint on
rational-agent designs that can be expressed in a simple program.
• Unlike simpler agents that might merely react to environmental stimuli, utility-based agents
evaluate their potential actions based on the expected utility. They’ll predict how useful or
beneficial each option is in regards to their set goal.
• Utility-based agents excel in complex decision-making environments with multiple potential
outcomes – like balancing different risks in order to make investment decisions, or weigh side
effects of treatment options.
• The utility function of these intelligent agents is a mathematical representation of its preferences.
The utility function maps to the world around it, deciding and ranking which option is the most
preferable. Then a utility agent can choose the optimal action.
• Since they can process large amounts of data, they’re useful in any field that involves high-stakes
decision-making.
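The numeric sketch referenced above shows the expected-utility rule on two invented taxi actions; the probabilities and utilities are made up purely for illustration.

actions = {
    # action: list of (probability, utility) pairs over its possible outcomes
    "fast_route": [(0.5, 10), (0.5, -8)],    # quicker, but with a risk of a jam
    "safe_route": [(1.0, 6)],                # slower, but certain
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

for action, outcomes in actions.items():
    print(action, expected_utility(outcomes))          # fast_route 1.0 / safe_route 6.0
print(max(actions, key=lambda a: expected_utility(actions[a])))   # safe_route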
Financial Trading
Utility-based agents are well-suited for stock and cryptocurrency markets – they’re able to buy or sell
based on algorithms that aim to maximize financial returns or minimize losses. This type of utility
function can take into account both historical data and real-time market data.
Dynamic Pricing Systems
Ever paid extra for an Uber or Lyft in the rain? That’s a utility-based agent at work – they can adjust
prices in real-time for flights, hotels, or ride-sharing, based on demand, competition, or time of
booking.
Smart Grid Controllers
These types of intelligent agents are the ‘smart’ in smart grids: it’s utility-based agents that control the
distribution and storage of electricity.
They optimize the use of resources based on demand forecasts and energy prices to improve
efficiency and reduce costs.
Personalized Content Recommendations
You finish watching a movie and Netflix recommends 3 more just like it.
Streaming services like Netflix and Spotify use utility-based agents to suggest similar content to
users. The optimized utility here is how likely you are to click on it.
Learning agents
• A learning agent can be divided into four conceptual components
• The most important distinction is between the learning element, which is responsible for
making improvements, and the performance element, which is responsible for selecting
external actions.
• The performance element is what we have previously considered to be the entire agent: it
takes in percepts and decides on actions.
• The learning element uses feedback from the critic on how the agent is doing and determines
how the performance element should be modified to do better in the future.
• The design of the learning element depends very much on the design of the performance
element.
• When trying to design an agent that learns a certain capability, the first question is not
“How am I going to get it to learn this?” but “What kind of performance element will my
agent need to do this once it has learned how?”
• Given an agent design, learning mechanisms can be constructed to improve every part of
the agent.
• The critic tells the learning element how well the agent is doing with respect to a fixed
performance standard. The critic is necessary because the percepts themselves provide no
indication of the agent’s success.
• For example, a chess program could receive a percept indicating that it has checkmated its
opponent, but it needs a performance standard to know that this is a good thing; the
percept itself does not say so.
• It is important that the performance standard be fixed. Conceptually, one should think of it
as being outside the agent altogether because the agent must not modify it to fit its own
behavior.
• The last component of the learning agent is the problem generator.
• It is responsible for suggesting actions that will lead to new and informative experiences.
• The point is that if the performance element had its way, it would keep doing the actions
that are best, given what it knows.
• But if the agent is willing to explore a little and do some perhaps suboptimal actions in the
short run, it might discover much better actions for the long run.
• The problem generator’s job is to suggest these exploratory actions. This is what scientists
do when they carry out experiments.
• Learning agents stand out due to their ability to adapt and improve over time based on their
experiences.
• Unlike more static AI agents that operate solely on pre-programmed rules or models, a learning
agent can evolve its behavior and strategies. Because of this learning element, they’re most often
used in changing environments.
Fraud Detection
Fraud detection systems operate by continuously collecting data and then adjusting to recognize
fraudulent patterns more effectively. Since scammers are always changing their tactics, fraud detection
agents need to keep adapting, too.
Content Recommendation
Platforms like Netflix and Amazon use a system equipped with a learning agent to improve their
recommendations for movies, shows, and products.
Even if your profile says you should like horror and thriller movies, if you suddenly switch to rom-
coms, your recommendations will adapt. Just like us, it’s always learning.
Speech Recognition Software
Applications like Google Assistant and Siri make use of a learning agent to better understand our
garbled attempts to speak to them.
It’s thanks to learning agents that these systems get better at understanding accents and slang – so we
can ask Siri things like, “Och, Siri, can ye find me the nearest chippy for some supper? I'm pure
peckish!"
Adaptive Thermostats
Even smart thermostats – like Nest – learn from user behavior, like when users tend to be home or
away, and their preferred temperatures.
This information might always be changing, so thermostats must be able to adapt over time – this
makes them another example of a learning agent.
Solving Problems by Searching
• A problem-solving agent is one kind of goal-based agent.
• Problem-solving agents use atomic representations—that is, states of the world are
considered as wholes, with no internal structure visible to the problem solving algorithms.
• Goal-based agents that use more advanced factored or structured representations are
usually called planning agents.
• Problem solving begins with precise definitions of problems and their solutions
• uninformed search algorithms—algorithms that are given no information about the
problem other than its definition. Although some of these algorithms can solve any
solvable problem, none of them can do so efficiently.
• Informed search algorithms, can do quite well given some guidance on where to look
for solutions.
PROBLEM-SOLVING AGENTS
Formulation
Goal formulation, based on the current situation and the agent’s performance measure, is
the first step in problem solving.
• Imagine an agent in the city of Arad, Romania, enjoying a touring holiday.
• The agent’s performance measure contains many factors: it wants to improve its suntan,
improve its Romanian, take in the sights, enjoy the nightlife (such as it is), avoid
hangovers, and so on.
• The decision problem is a complex one involving many tradeoffs and careful reading of
guidebooks.
• Now, suppose the agent has a nonrefundable ticket to fly out of Bucharest the following
day. In that case, it makes sense for the agent to adopt the goal of getting to Bucharest.
• Courses of action that don’t reach Bucharest on time can be rejected without further
consideration and the agent’s decision problem is greatly simplified.
• Goals help organize behavior by limiting the objectives that the agent is trying to
achieve and hence the actions it needs to consider.
Problem formulation is the process of deciding what actions and states to consider,given a
goal.
• Consider a goal to be a set of world states—exactly those states in which the goal is
satisfied.
• The agent’s task is to find out how to act, now and in the future, so that it reaches a goal
state.
• Before it can do this, it needs to decide (or we need to decide on its behalf) what sorts of
actions and states it should consider.
• If it were to consider actions at the level of “move the left foot forward an inch” or “turn
the steering wheel one degree left,” the agent would probably never find its way out of
the parking lot, let alone to Bucharest, because at that level of detail there is too much
uncertainty in the world and there would be too many steps in a solution.
• Let us assume that the agent will consider actions at the level of driving from one major
town to another. Each state therefore corresponds to being in a particular town