Informed Search Strategies: Artificial Intelligence
INFORMED SEARCH STRATEGIES
Informed (Heuristic) search strategies
What are heuristics?
• Additional knowledge of the problem is imparted to the
search algorithm using heuristics.
• A heuristic is any practical problem-solving approach that is
sufficient for reaching an immediate goal when finding an
optimal solution is impractical or impossible.
• Not guaranteed to be optimal, perfect, logical, or rational
• Speed up the process of finding a satisfactory solution
• Ease the cognitive load of making a decision
Heuristics: An example
• Availability heuristic
• What comes to mind quickly seems to be significant
Heuristics: An example
• Representativeness heuristic
• E.g., describe the portrait of an old woman who is warm and caring
with a great love of children
Informed search strategies
Best-first search
A*
RBFS
SMA*
Best-first search
• An instance of the general TREE-SEARCH or GRAPH-SEARCH algorithm
• A node is selected for expansion based on an evaluation
function, 𝒇(𝒏).
• Node with the lowest 𝑓(𝑛) is expanded first
• The choice of 𝑓 determines the search strategy.
Heuristic function
• Most best-first algorithms include a heuristic function 𝒉(𝒏)
as a component of 𝑓.
Cost function vs. Heuristic function
[Figure: a state space with start S and goal G. The cost function g(n) gives the path cost from S to n (g(S) = 0); UCS uses f(n) = g(n). The heuristic h(n) estimates the cost from n to G (h(G) = 0).]
Greedy Best-First Search
Greedy best-first search
• Expand the node that appears to be closest to goal using
𝒇(𝒏) = 𝒉(𝒏)
[Figure: the same state space with start S (g(S) = 0) and goal G (h(G) = 0) — greedy best-first search (GBS) uses f(n) = h(n), ignoring the path cost g(n).]
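The strategy above fits in a few lines of Python. This is a minimal sketch, not the lecture's own code; the road-map fragment and straight-line distances to Bucharest are the standard Romania example from AIMA:

```python
import heapq

# Fragment of the Romania road map (undirected edges with step costs)
# and straight-line distances to Bucharest (the h_SLD heuristic).
EDGES = [("Arad", "Sibiu", 140), ("Arad", "Timisoara", 118),
         ("Arad", "Zerind", 75), ("Sibiu", "Fagaras", 99),
         ("Sibiu", "Rimnicu Vilcea", 80), ("Sibiu", "Oradea", 151),
         ("Rimnicu Vilcea", "Pitesti", 97), ("Fagaras", "Bucharest", 211),
         ("Pitesti", "Bucharest", 101)]
H = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
     "Oradea": 380, "Fagaras": 176, "Rimnicu Vilcea": 193,
     "Pitesti": 100, "Bucharest": 0}

GRAPH = {}
for a, b, c in EDGES:
    GRAPH.setdefault(a, []).append((b, c))
    GRAPH.setdefault(b, []).append((a, c))

def greedy_best_first(start, goal):
    """Always expand the frontier node with the smallest h(n);
    the path cost g(n) is ignored entirely."""
    frontier = [(H[start], start, [start])]
    reached = {start}
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        for child, _cost in GRAPH[node]:
            if child not in reached:
                reached.add(child)
                heapq.heappush(frontier, (H[child], child, path + [child]))
    return None

print(greedy_best_first("Arad", "Bucharest"))
# ['Arad', 'Sibiu', 'Fagaras', 'Bucharest']
```

Greedy search follows h_SLD through Fagaras and returns a path of cost 450, even though the route through Rimnicu Vilcea and Pitesti costs only 418 — a concrete illustration of why greedy best-first search is not optimal.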
Straight-line distance heuristic h_SLD
Greedy best-first search: An example
[Figures: step-by-step expansion on the Romania map using h_SLD(n), starting from Arad — omitted]
Evaluation of Greedy best-first search
• Completeness
• NO – may get stuck forever
• E.g., Iasi → Neamt → Iasi → Neamt → …
• Time complexity
• O(b^m) → reduced substantially with a good heuristic
• Space complexity
• O(b^m) – keeps all nodes in memory
• Optimality
• NO
Quiz 01: Greedy best-first search
• Work out the order in which states are expanded, as well as the path
returned by graph search. Assume ties are resolved so that states earlier
in alphabetical order are expanded first.
[Figure: quiz search graph omitted]
A* Search
A* search
• The most widely known form of best-first search
• Ideas
• Use the heuristic to guide the search, but not exclusively
• Avoid expanding paths that are already expensive
• Ensure that a minimum-cost path is computed
[Figure: the same state space with start S (g(S) = 0) and goal G (h(G) = 0) — A* uses f(n) = g(n) + h(n), combining path cost and heuristic.]
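A* differs from the greedy sketch only in what it puts on the priority queue: f = g + h instead of h alone. A minimal self-contained sketch, again using the standard AIMA Romania data:

```python
import heapq

# Fragment of the Romania road map and straight-line distances to
# Bucharest (the h_SLD heuristic) — standard AIMA values.
EDGES = [("Arad", "Sibiu", 140), ("Arad", "Timisoara", 118),
         ("Arad", "Zerind", 75), ("Sibiu", "Fagaras", 99),
         ("Sibiu", "Rimnicu Vilcea", 80), ("Sibiu", "Oradea", 151),
         ("Rimnicu Vilcea", "Pitesti", 97), ("Fagaras", "Bucharest", 211),
         ("Pitesti", "Bucharest", 101)]
H = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
     "Oradea": 380, "Fagaras": 176, "Rimnicu Vilcea": 193,
     "Pitesti": 100, "Bucharest": 0}

GRAPH = {}
for a, b, c in EDGES:
    GRAPH.setdefault(a, []).append((b, c))
    GRAPH.setdefault(b, []).append((a, c))

def a_star(start, goal):
    """Expand the frontier node with the smallest f(n) = g(n) + h(n)."""
    frontier = [(H[start], 0, start, [start])]
    best_g = {start: 0}           # cheapest known path cost to each state
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for child, cost in GRAPH[node]:
            g2 = g + cost
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                heapq.heappush(frontier,
                               (g2 + H[child], g2, child, path + [child]))
    return None, float("inf")

print(a_star("Arad", "Bucharest"))
# (['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'], 418)
```

Unlike the greedy search, A* prefers the route through Rimnicu Vilcea and Pitesti (total cost 418) over the Fagaras route (450), because g(n) keeps the expensive Fagaras–Bucharest step from looking attractive.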
A* search example
f = g + h
Evaluation of A* search
• Completeness
• YES, if every step cost exceeds some fixed ε > 0 and 𝑏 is finite
• (review the condition for completeness of UCS)
• Optimality
• YES – with conditions on heuristic being used
• Time complexity
• Exponential
• Space complexity
• Exponential (keep all nodes in memory)
A* is not always optimal...
[Figure: S → A (cost 1), A → G (cost 1), S → G (cost 3); h(S) = 7, h(A) = 6, h(G) = 0. Since h(A) = 6 overestimates the true remaining cost of 1, A* expands G via the direct edge (f = 3) before A (f = 7) and returns the suboptimal path S → G of cost 3 instead of S → A → G of cost 2.]
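The failure on this tiny graph can be reproduced directly. A small sketch (the graph, costs, and h-values are the ones from the figure; h(A) = 6 is deliberately inadmissible):

```python
import heapq

# S -> A costs 1, A -> G costs 1, S -> G costs 3.
# h(A) = 6 overestimates the true remaining cost (1), so h is inadmissible.
GRAPH = {"S": [("A", 1), ("G", 3)], "A": [("G", 1)], "G": []}
H = {"S": 7, "A": 6, "G": 0}

def a_star(start, goal):
    """Plain A* on the tiny graph; expands by smallest f = g + h."""
    frontier = [(H[start], 0, start, [start])]
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for child, cost in GRAPH[node]:
            heapq.heappush(frontier,
                           (g + cost + H[child], g + cost, child, path + [child]))
    return None, None

print(a_star("S", "G"))
# (['S', 'G'], 3) -- the direct edge wins, although S-A-G costs only 2
```

With f(G) = 3 + 0 = 3 beating f(A) = 1 + 6 = 7 on the queue, the goal is popped before the cheaper route through A is ever explored.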
Conditions for optimality: Admissibility
• An admissible heuristic never overestimates the cost to reach the goal:
h(n) ≤ h*(n) for every node n, where h*(n) is the true cost from n to the nearest goal.
Admissible heuristics for 8-puzzle
• h(n) = number of misplaced tiles

  State n:       Goal state G:
  1 5 _          1 2 3
  2 6 3          4 5 6
  7 4 8          7 8 _

  h(n) = 6

[Figure: for an admissible heuristic, the estimated cost h(n) from a node n never exceeds the true cost of reaching the goal G.]
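The misplaced-tiles count is a one-liner. A small sketch checking the state shown above (boards are flat 9-tuples; 0 marks the blank, which is not counted as a tile):

```python
def misplaced_tiles(state, goal):
    """h1: number of tiles (not the blank) that are not in their goal cell."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

state = (1, 5, 0,
         2, 6, 3,
         7, 4, 8)
goal = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)
print(misplaced_tiles(state, goal))  # 6
```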
Optimality of A* with an admissible heuristic
• Suppose some suboptimal goal G2 has been generated and is in the frontier.
• Let n be an unexpanded node in the frontier such that n is on a shortest path to
an optimal goal G.
• f(G2) = g(G2) since h(G2) = 0
• g(G2) > g(G) since G2 is suboptimal
• f(G) = g(G) since h(G) = 0
→ f(G2) > f(G) (1)
• h(n) ≤ h*(n) since h is admissible
• g(n) + h(n) ≤ g(n) + h*(n) = g(G), since n lies on an optimal path to G
→ f(n) ≤ f(G) (2)
• From (1), (2): f(G2) > f(n) → A* will never select G2 for expansion
Conditions for optimality: Consistency
• h(n) is consistent if, for every node n and every successor n′ reached by
action a, h(n) ≤ c(n, a, n′) + h(n′)
• If h(n) is consistent, the values of f(n) along any path are non-decreasing.
• Suppose n′ is a successor of n → g(n′) = g(n) + c(n, a, n′)
• f(n′) = g(n′) + h(n′) = g(n) + c(n, a, n′) + h(n′) ≥ g(n) + h(n) = f(n)
• Whenever A* selects a node n for expansion, the optimal path to that
node has been found.
• Proof by contradiction: there would have to be another frontier node n′ on
the optimal path from the start node to n (by the graph separation property)
• f is nondecreasing along any path → f(n′) < f(n) → n′ would have been
selected first
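Consistency is easy to verify mechanically: check the triangle inequality h(n) ≤ c(n, n′) + h(n′) on every edge, in both directions. A small sketch using the standard AIMA Romania fragment and h_SLD values:

```python
# Edges of a Romania road-map fragment and straight-line distances to
# Bucharest (standard AIMA values).
EDGES = [("Arad", "Sibiu", 140), ("Arad", "Timisoara", 118),
         ("Arad", "Zerind", 75), ("Sibiu", "Fagaras", 99),
         ("Sibiu", "Rimnicu Vilcea", 80), ("Sibiu", "Oradea", 151),
         ("Rimnicu Vilcea", "Pitesti", 97), ("Fagaras", "Bucharest", 211),
         ("Pitesti", "Bucharest", 101)]
H = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
     "Oradea": 380, "Fagaras": 176, "Rimnicu Vilcea": 193,
     "Pitesti": 100, "Bucharest": 0}

def is_consistent(edges, h):
    """True iff h(n) <= c(n, n') + h(n') holds for every edge, both ways."""
    return all(h[a] <= c + h[b] and h[b] <= c + h[a] for a, b, c in edges)

print(is_consistent(EDGES, H))  # True
```

Straight-line distance can never shrink by more than the length of the road travelled, which is exactly why h_SLD passes this check.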
Contours of A* search
• A* expands nodes in order of increasing f-value
• Gradually adds "f-contours" of nodes: contour i contains all
nodes with f = f_i, where f_i < f_{i+1}
• A* will expand all nodes with cost f(n) < C*, the optimal solution cost
A* contours vs. UCS contours
• The bands of UCS will be “circular” around the start state.
• The bands of A*, with more accurate heuristics, will stretch toward the
goal state and become more narrowly focused around the optimal path.
Comments on A*: The good
• Never expands nodes with f(n) > C*
• All such nodes are pruned while still guaranteeing optimality
• Optimally efficient for any given consistent heuristic
• No other optimal algorithm is guaranteed to expand fewer nodes
Comments on A*: The bad
• A* expands all nodes with f(n) < C* (and possibly some
nodes with f(n) = C*) before selecting a goal node.
• This can still be exponentially large
• A* usually runs out of space before it runs out of time
• Exponential growth will occur unless error in ℎ(𝑛) grows no
faster than log(true path cost)
• In practice, error is usually proportional to true path cost (not log)
• So exponential growth is common
→ Not practical for many large-scale problems
Quiz 02: A*
• Work out the order in which states are expanded, as well as the path
returned by graph search. Assume ties are resolved so that states earlier
in alphabetical order are expanded first.
[Figure: quiz search graph omitted]
Memory-Bounded Heuristic Search
Memory-bounded heuristic search
• In practice, A* usually runs out of space long before it runs
out of time.
• Idea: proceed like DFS, but do not forget everything about
the branches that have been partially explored
Iterative-deepening A* (IDA*)
• The main difference from IDS
• The cutoff uses the 𝒇-value (𝒈 + 𝒉) rather than the depth
• At each iteration, the cutoff value is the smallest 𝑓-value of any node
that exceeded the cutoff on the previous iteration
• Avoid the substantial overhead associated with keeping a
sorted queue of nodes.
• Practical for many problems with unit step costs, yet difficult
with real-valued costs
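The bullets above translate into a short depth-first contour search. A sketch, not the lecture's own code, again on the standard AIMA Romania fragment:

```python
import math

# Romania road-map fragment and h_SLD values (standard AIMA data).
EDGES = [("Arad", "Sibiu", 140), ("Arad", "Timisoara", 118),
         ("Arad", "Zerind", 75), ("Sibiu", "Fagaras", 99),
         ("Sibiu", "Rimnicu Vilcea", 80), ("Sibiu", "Oradea", 151),
         ("Rimnicu Vilcea", "Pitesti", 97), ("Fagaras", "Bucharest", 211),
         ("Pitesti", "Bucharest", 101)]
H = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
     "Oradea": 380, "Fagaras": 176, "Rimnicu Vilcea": 193,
     "Pitesti": 100, "Bucharest": 0}

GRAPH = {}
for a, b, c in EDGES:
    GRAPH.setdefault(a, []).append((b, c))
    GRAPH.setdefault(b, []).append((a, c))

def ida_star(start, goal):
    """Iterative-deepening A*: DFS with a cutoff on f = g + h, raised
    each round to the smallest f-value that exceeded the old cutoff."""
    def dfs(path, g, cutoff):
        node = path[-1]
        f = g + H[node]
        if f > cutoff:
            return None, f           # exceeded: report f for the next round
        if node == goal:
            return path, g
        next_cutoff = math.inf
        for child, cost in GRAPH[node]:
            if child in path:        # avoid cycles along the current path
                continue
            found, value = dfs(path + [child], g + cost, cutoff)
            if found is not None:
                return found, value
            next_cutoff = min(next_cutoff, value)
        return None, next_cutoff

    cutoff = H[start]
    while True:
        found, value = dfs([start], 0, cutoff)
        if found is not None:
            return found, value
        if value == math.inf:
            return None, math.inf    # no solution
        cutoff = value               # smallest f that exceeded the cutoff

print(ida_star("Arad", "Bucharest"))
# (['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'], 418)
```

On this instance the cutoff rises through 366, 393, 413, 415, 417, 418 — each round re-does the previous DFS, but only the frontier stays in memory.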
Recursive best-first search (RBFS)
• Keep track of the 𝑓-value of the best alternative path
available from any ancestor of the current node
→ backtrack when the 𝑓-value of the current node exceeds 𝑓_𝑙𝑖𝑚𝑖𝑡
• As it backtracks, replace the 𝑓-value of each node along the
path with the best 𝑓(𝑛) value of its children
Recursive best-first search (RBFS)
function RECURSIVE-BEST-FIRST-SEARCH(problem) returns a solution, or failure
  return RBFS(problem, MAKE-NODE(problem.INITIAL-STATE), ∞)

function RBFS(problem, node, f_limit) returns a solution, or failure and a new 𝑓-cost limit
  if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
  successors ← [ ]
  for each action in problem.ACTIONS(node.STATE) do
    add CHILD-NODE(problem, node, action) into successors
  if successors is empty then return failure, ∞
  for each s in successors do /* update 𝑓 with value from previous search, if any */
    s.f ← max(s.g + s.h, node.f)
  loop do
    best ← the lowest 𝑓-value node in successors
    if best.f > f_limit then return failure, best.f
    alternative ← the second-lowest 𝑓-value among successors
    result, best.f ← RBFS(problem, best, min(f_limit, alternative))
    if result ≠ failure then return result
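The pseudocode above can be turned into a compact runnable sketch. Assumptions: the same Romania fragment and h_SLD values as the other examples, plain string states instead of AIMA's node objects, and successor records stored as mutable [f, g, state] lists so the backed-up f-value can be written back:

```python
import math

# Romania road-map fragment and h_SLD values (standard AIMA data).
EDGES = [("Arad", "Sibiu", 140), ("Arad", "Timisoara", 118),
         ("Arad", "Zerind", 75), ("Sibiu", "Fagaras", 99),
         ("Sibiu", "Rimnicu Vilcea", 80), ("Sibiu", "Oradea", 151),
         ("Rimnicu Vilcea", "Pitesti", 97), ("Fagaras", "Bucharest", 211),
         ("Pitesti", "Bucharest", 101)]
H = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
     "Oradea": 380, "Fagaras": 176, "Rimnicu Vilcea": 193,
     "Pitesti": 100, "Bucharest": 0}

GRAPH = {}
for a, b, c in EDGES:
    GRAPH.setdefault(a, []).append((b, c))
    GRAPH.setdefault(b, []).append((a, c))

def rbfs(node, goal, g, f, f_limit):
    """Returns (solution path or None, backed-up f-value)."""
    if node == goal:
        return [node], f
    successors = []
    for child, cost in GRAPH[node]:
        child_g = g + cost
        # inherit the parent's backed-up f-value from previous searches
        successors.append([max(child_g + H[child], f), child_g, child])
    if not successors:
        return None, math.inf
    while True:
        successors.sort()                       # lowest f first
        best_f, best_g, best = successors[0]
        if best_f > f_limit:
            return None, best_f                 # backtrack, back up best f
        alternative = successors[1][0] if len(successors) > 1 else math.inf
        result, successors[0][0] = rbfs(best, goal, best_g, best_f,
                                        min(f_limit, alternative))
        if result is not None:
            return [node] + result, successors[0][0]

path, _ = rbfs("Arad", "Bucharest", 0, H["Arad"], math.inf)
print(path)
# ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest']
```

Running it reproduces the trace discussed below: the Rimnicu Vilcea subtree is abandoned for Fagaras (backed-up f = 417), Fagaras is abandoned in turn (backed-up f = 450), and the search re-expands Rimnicu Vilcea to reach Bucharest at cost 418.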
RBFS: An example
[Figure: trace of RBFS on the Romania map, annotated with the 𝑓_𝑙𝑖𝑚𝑖𝑡 of every recursive call and the 𝑓(𝑛) value of each node — omitted]
Recursive best-first search (RBFS)
• Unwind the recursion and store the best 𝑓-value for the current best leaf,
Rimnicu Vilcea
• result, best.f ← RBFS(problem, best, min(f_limit, alternative))
• best is now Fagaras. Call RBFS on the new best
• The best value is now 450
• Unwind the recursion and store the best 𝑓-value for the current best leaf of the Fagaras subtree
• result, best.f ← RBFS(problem, best, min(f_limit, alternative))
• best is now Rimnicu Vilcea (again). Call RBFS on the new best
• The subtree is expanded again
• The best alternative subtree is now through Timisoara
• The solution is found, since 447 > 418.
Evaluation of RBFS
• Optimality
• Like A*, optimal if ℎ(𝑛) is admissible
• Time complexity
• Difficult to characterize
• Depends on accuracy of ℎ(𝑛) and how often best path changes
• Can end up “switching” back and forth
• Space complexity
• Linear: O(bd)
• The opposite extreme from A*: it uses too little memory, even if more
memory were available
(Simplified) Memory-bounded A* – (S)MA*
• Like A*, but delete the worst node (largest f-value) when
memory is full
• SMA* also backs up the value of the forgotten node to its
parent.
• If there is a tie (equal 𝑓-values), delete the oldest nodes first
• Simplified MA* finds the optimal reachable solution given the
memory constraint,
• provided that the depth of the shallowest goal node is less than the
memory size (expressed in nodes)
• Time can still be exponential.
Learning to search better
• Could an agent learn how to search better? YES
• Metalevel state space: each state captures the internal
(computational) state of a program that is searching in an
object-level state space.
• For example, in the route-finding problem on the map of Romania:
• The internal state of the A* algorithm is the current search tree.
• Each action in the metalevel state space is a computation step that
alters the internal state, e.g., expanding a leaf node and adding its
successors to the tree.
Learning to search better
• The expansion of Fagaras is not helpful → harder problems
may include even more such missteps
The 8-puzzle problem
Admissible heuristics for 8-puzzle
• h(n) = number of misplaced tiles

  State n:       Goal state G:
  7 2 4          _ 1 2
  5 _ 6          3 4 5
  8 3 1          6 7 8

  h(n) = 8 (all eight tiles are out of position)
The effect of heuristic on performance
Comparison of the search costs and effective branching factors for the ITERATIVE-DEEPENING-SEARCH and A* algorithms with h1 and h2. Data are averaged over 100 instances of the 8-puzzle for each of various solution lengths d. [Table omitted]
Heuristic dominance
• Given two admissible heuristics h1 and h2
• If h2(n) ≥ h1(n) for all n, then h2 dominates h1
• A* using h2 will never expand more nodes than A* using h1
• Better to use a heuristic function with higher values, provided
it is consistent and its computation time is not too long.
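Dominance is easy to see numerically on the classic AIMA 8-puzzle start state: Manhattan distance (h2) is always at least as large as the misplaced-tiles count (h1), because every misplaced tile is at least one move from home. A sketch (boards are flat 9-tuples; 0 marks the blank, ignored by both heuristics):

```python
def h1(state, goal):
    """Misplaced-tiles heuristic (the blank is not a tile)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, goal):
    """Manhattan-distance heuristic on a 3x3 board."""
    pos = {tile: divmod(i, 3) for i, tile in enumerate(state)}
    total = 0
    for i, tile in enumerate(goal):
        if tile != 0:
            r, c = divmod(i, 3)        # goal cell of this tile
            tr, tc = pos[tile]         # current cell of this tile
            total += abs(r - tr) + abs(c - tc)
    return total

state = (7, 2, 4,
         5, 0, 6,
         8, 3, 1)
goal = (0, 1, 2,
        3, 4, 5,
        6, 7, 8)
print(h1(state, goal), h2(state, goal))  # 8 18
```

Here h2 = 18 ≥ h1 = 8, and the same inequality holds for every state, so A* with h2 never expands more nodes than A* with h1.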
Relaxed problems
• Problems with fewer restrictions on the actions
[Figure: the state space of the relaxed problem is a supergraph of the original, since removing restrictions on the actions adds edges.]
Pattern databases
• Admissible heuristics can also be derived from the solution
cost of a subproblem of a given problem.
• This cost is a lower bound on the cost of the complete problem.
• Pattern databases (PDB): store the exact solution costs for
every possible subproblem instance
• E.g., every possible configuration of the four tiles and the blank
• The complete heuristic is constructed using the patterns in
the databases.
Heuristic from Pattern databases
https://courses.cs.washington.edu/courses/cse473/12sp/slides/04-heuristics.pdf
Additive pattern databases
• Limitation of traditional PDBs: taking the max over multiple
databases yields diminishing returns as more DBs are added
• Disjoint pattern databases: Count only moves of the pattern
tiles, ignoring non-pattern moves.
• If no tile belongs to more than one pattern, add their heuristic values.
Performance of PDB
• 15-puzzle
• ~2000× speedup vs. Manhattan distance
• IDA* with the two DBs solves random 15-puzzles optimally in ~30 milliseconds
• 24-puzzle
• ~12 million× speedup vs. Manhattan distance
• IDA* can solve random instances in ~2 days
• Requires 4 disjoint DBs, each with 128 million entries
• Without PDBs: ~65,000 years
Learning heuristics from experience
THE END