@vtucode - in BCS515B Module 5 Textbook
The fact Magic(West) is also added to the KB. In this way, even if the knowledge base
contains facts about millions of Americans, only Colonel West will be considered during the
forward inference process. The complete process for defining magic sets and rewriting the
knowledge base is too complex to go into here, but the basic idea is to perform a sort of
“generic” backward inference from the goal in order to work out which variable bindings
need to be constrained. The magic sets approach can therefore be thought of as a kind of
hybrid between forward inference and backward preprocessing.
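The flavor of the magic-sets idea can be caricatured in a few lines of Python (this is not the text's algorithm; the predicate names and the crude "relevant constants" heuristic are invented for illustration). A cheap backward pass from the goal collects the constants that could matter, and forward chaining then considers only facts mentioning them:

```python
# Illustrative sketch of the magic-sets idea: seed a set of relevant
# constants from the goal, grow it through facts that share a constant,
# then forward-chain only over the filtered facts.

def relevant_constants(goal_consts, facts):
    """Grow the relevant set: a fact mentioning a relevant constant
    makes its other arguments relevant too."""
    relevant = set(goal_consts)
    changed = True
    while changed:
        changed = False
        for _, args in facts:
            if relevant & set(args) and not set(args) <= relevant:
                relevant |= set(args)
                changed = True
    return relevant

facts = [
    ("American", ("West",)),
    ("Sells", ("West", "M1", "Nono")),
    ("American", ("Smith",)),     # one of the "millions of Americans"
    ("Owns", ("Smith", "Car7")),  # irrelevant to the goal
]

magic = relevant_constants({"West"}, facts)   # seeded from Criminal(West)
filtered = [f for f in facts if set(f[1]) <= magic]
```

Only the two West-related facts survive the filter, so forward inference never touches the facts about Smith.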
The second major family of logical inference algorithms uses the backward chaining ap-
proach introduced in Section 7.5 for definite clauses. These algorithms work backward from
the goal, chaining through rules to find known facts that support the proof. We describe
the basic algorithm, and then we describe how it is used in logic programming, which is the
most widely used form of automated reasoning. We also see that backward chaining has some
disadvantages compared with forward chaining, and we look at ways to overcome them. Fi-
nally, we look at the close connection between logic programming and constraint satisfaction
problems.
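As a preview, the core of backward chaining can be sketched for propositional definite clauses (the rule set below is an invented propositionalization of the crime example; the first-order algorithm in the text adds unification on top of this recursion):

```python
# Minimal backward chaining over propositional definite clauses.
# `rules` maps a head to a list of alternative bodies (conjunctions).

rules = {
    "Criminal": [["American", "SellsWeapon", "Hostile"]],
    "Hostile":  [["Enemy"]],
}
facts = {"American", "SellsWeapon", "Enemy"}

def backchain(goal):
    """Prove goal by working backward through rules to known facts."""
    if goal in facts:
        return True
    for body in rules.get(goal, []):
        # The goal holds if every conjunct of some body can be proved.
        if all(backchain(sub) for sub in body):
            return True
    return False
```

Here `backchain("Criminal")` succeeds by chaining back through `Hostile` to the fact `Enemy`.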
Figure 9.7 Proof tree constructed by backward chaining to prove that West is a criminal.
The tree should be read depth first, left to right. To prove Criminal(West), we have to prove
the four conjuncts below it. Some of these are in the knowledge base, and others require
further backward chaining. Bindings for each successful unification are shown next to the
corresponding subgoal. Note that once one subgoal in a conjunction succeeds, its substitution
is applied to subsequent subgoals. Thus, by the time FOL-BC-ASK gets to the last conjunct,
originally Hostile(z), z is already bound to Nono.
Section 9.4. Backward Chaining 339
goal “X is 4+3” succeeds with X bound to 7. On the other hand, the goal “5 is X+Y”
fails, because the built-in functions do not do arbitrary equation solving.
• There are built-in predicates that have side effects when executed. These include input–
output predicates and the assert/retract predicates for modifying the knowledge
base. Such predicates have no counterpart in logic and can produce confusing results—
for example, if facts are asserted in a branch of the proof tree that eventually fails.
• The occur check is omitted from Prolog’s unification algorithm. This means that some
unsound inferences can be made; these are almost never a problem in practice.
• Prolog uses depth-first backward-chaining search with no checks for infinite recursion.
This makes it very fast when given the right set of axioms, but incomplete when given
the wrong ones.
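The unsoundness caused by omitting the occur check can be seen in a small unifier sketch (illustrative Python, not Prolog's actual implementation; variables are uppercase strings and compound terms are tuples). Unifying X with f(X) should fail, since no finite term satisfies X = f(X), but without the check the unifier happily binds X to f(X):

```python
# A toy unifier showing why the occur check matters.
# Variables are uppercase strings; f(X) is represented as ("f", "X").

def contains_var(term, var):
    if term == var:
        return True
    return isinstance(term, tuple) and any(contains_var(t, var) for t in term)

def unify(a, b, subst, occur_check=True):
    a, b = subst.get(a, a), subst.get(b, b)   # dereference bound variables
    if a == b:
        return subst
    if isinstance(a, str) and a.isupper():    # a is a variable
        if occur_check and contains_var(b, a):
            return None                        # X occurs in f(X): fail soundly
        return {**subst, a: b}
    if isinstance(b, str) and b.isupper():
        return unify(b, a, subst, occur_check)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst, occur_check)
            if subst is None:
                return None
        return subst
    return None
```

With `occur_check=True`, `unify("X", ("f", "X"), {})` fails; with `occur_check=False` it returns the circular binding, which is how Prolog-style unifiers trade soundness for speed.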
Prolog’s design represents a compromise between declarativeness and execution efficiency—
inasmuch as efficiency was understood at the time Prolog was designed.
Figure 9.8 Pseudocode representing the result of compiling the Append predicate. The
function NEW-VARIABLE returns a new variable, distinct from all other variables used so far.
The procedure CALL(continuation) continues execution with the specified continuation.
A compiled program, on the other hand, is an inference procedure for a specific set of clauses, so it knows
what clauses match the goal. Prolog basically generates a miniature theorem prover for each
different predicate, thereby eliminating much of the overhead of interpretation. It is also
possible to open-code the unification routine for each different call, thereby avoiding explicit
analysis of term structure. (For details of open-coded unification, see Warren et al. (1977).)
The instruction sets of today’s computers give a poor match with Prolog’s semantics,
so most Prolog compilers compile into an intermediate language rather than directly into ma-
chine language. The most popular intermediate language is the Warren Abstract Machine,
or WAM, named after David H. D. Warren, one of the implementers of the first Prolog com-
piler. The WAM is an abstract instruction set that is suitable for Prolog and can be either
interpreted or translated into machine language. Other compilers translate Prolog into a high-
level language such as Lisp or C and then use that language’s compiler to translate to machine
language. For example, the definition of the Append predicate can be compiled into the code
shown in Figure 9.8. Several points are worth mentioning:
• Rather than having to search the knowledge base for Append clauses, the clauses be-
come a procedure and the inferences are carried out simply by calling the procedure.
• As described earlier, the current variable bindings are kept on a trail. The first step of the
procedure saves the current state of the trail, so that it can be restored by RESET-TRAIL
if the first clause fails. This will undo any bindings generated by the first call to UNIFY.
• The trickiest part is the use of continuations to implement choice points. You can think
of a continuation as packaging up a procedure and a list of arguments that together
define what should be done next whenever the current goal succeeds. It would not
do just to return from a procedure like APPEND when the goal succeeds, because it
could succeed in several ways, and each of them has to be explored. The continuation
argument solves this problem because it can be called each time the goal succeeds. In
the APPEND code, if the first argument is empty and the second argument unifies with
the third, then the APPEND predicate has succeeded. We then CALL the continuation,
with the appropriate bindings on the trail, to do whatever should be done next. For
example, if the call to APPEND were at the top level, the continuation would print the
bindings of the variables.
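The continuation-passing idea can be sketched in Python (a deliberate simplification: the lists here are concrete rather than unifiable terms, so instead of unification we enumerate the splits of the known third argument, mimicking the Prolog query append(X, Y, [1,2])):

```python
# Continuation-passing sketch of compiled Append: instead of returning
# one answer, the procedure CALLs the continuation once per way of
# succeeding, so every choice point gets explored.

def append_cps(zs, continuation):
    # Clause 1: append([], Ys, Ys) -- succeed with X = [], Y = zs.
    continuation([], zs)
    # Clause 2: append([H|T], Ys, [H|TZ]) :- append(T, Ys, TZ).
    if zs:
        head, tail = zs[0], zs[1:]
        # On each sub-solution, stick `head` back onto the front of X.
        append_cps(tail, lambda x, y: continuation([head] + x, y))

solutions = []
append_cps([1, 2], lambda x, y: solutions.append((x, y)))
```

The top-level continuation here just records each (X, Y) pair; all three splits of [1,2] are produced, one per call of the continuation.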
Before Warren’s work on the compilation of inference in Prolog, logic programming was
too slow for general use. Compilers by Warren and others allowed Prolog code to achieve
speeds that are competitive with C on a variety of standard benchmarks (Van Roy, 1990).
Of course, the fact that one can write a planner or natural language parser in a few dozen
lines of Prolog makes it somewhat more desirable than C for prototyping most small-scale AI
research projects.
Parallelization can also provide substantial speedup. There are two principal sources of
parallelism. The first, called OR-parallelism, comes from the possibility of a goal unifying
with many different clauses in the knowledge base. Each gives rise to an independent branch
in the search space that can lead to a potential solution, and all such branches can be solved
in parallel. The second, called AND-parallelism, comes from the possibility of solving
each conjunct in the body of an implication in parallel. AND-parallelism is more difficult to
achieve, because solutions for the whole conjunction require consistent bindings for all the
variables. Each conjunctive branch must communicate with the other branches to ensure a
global solution.
Figure 9.9 (a) Finding a path from A to C can lead Prolog into an infinite loop. (b) A
graph in which each node is connected to two random successors in the next layer. Finding a
path from A1 to J4 requires 877 inferences.
Figure 9.10 (a) Proof that a path exists from A to C. (b) Infinite proof tree generated
when the clauses are in the "wrong" order.
subproblems and are cached to avoid recomputation. We can obtain a similar effect in a
backward chaining system using memoization—that is, caching solutions to subgoals as
they are found and then reusing those solutions when the subgoal recurs, rather than repeating
the previous computation. This is the approach taken by tabled logic programming sys-
tems, which use efficient storage and retrieval mechanisms to perform memoization. Tabled
logic programming combines the goal-directedness of backward chaining with the dynamic-
programming efficiency of forward chaining. It is also complete for Datalog knowledge
bases, which means that the programmer need worry less about infinite loops. (It is still pos-
sible to get an infinite loop with predicates like father(X,Y) that refer to a potentially
unbounded number of objects.)
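A toy version of the tabling idea can be sketched as follows (illustrative only: a real tabled system iterates answer tables to a fixpoint rather than caching provisional failures, but the one-pass version suffices for this graph). Plain depth-first backward chaining on path/2 would loop on the cycle below; the table and the in-progress check make it terminate:

```python
# Sketch of tabling: cache subgoal answers, and treat a subgoal that is
# already "in progress" as failed-for-now so cyclic graphs terminate.

links = {"a": ["b"], "b": ["c", "a"], "c": []}   # note the b -> a cycle

table, in_progress = {}, set()

def path(x, y):
    if (x, y) in table:                # answer already tabled: reuse it
        return table[(x, y)]
    if (x, y) in in_progress:          # recursive occurrence: don't loop
        return False
    in_progress.add((x, y))
    result = any(z == y or path(z, y) for z in links[x])
    in_progress.discard((x, y))
    table[(x, y)] = result
    return result
```

Querying path("a", "c") now terminates with success despite the cycle, and repeated queries are answered from the table.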
by the knowledge base. There is no way to assert that a sentence is false in Prolog. This makes
Prolog less expressive than first-order logic, but it is part of what makes Prolog more efficient
and more concise. Consider the following Prolog assertions about some course offerings:
Course(CS, 101), Course(CS, 102), Course(CS, 106), Course(EE, 101).   (9.11)
Under the unique names assumption, CS and EE are different (as are 101, 102, and 106),
so this means that there are four distinct courses. Under the closed-world assumption there
are no other courses, so there are exactly four courses. But if these were assertions in FOL
rather than in Prolog, then all we could say is that there are somewhere between one and
infinity courses. That’s because the assertions (in FOL) do not deny the possibility that other
unmentioned courses are also offered, nor do they say that the courses mentioned are different
from each other. If we wanted to translate Equation (9.11) into FOL, we would get this:
Course(d, n) ⇔ (d = CS ∧ n = 101) ∨ (d = CS ∧ n = 102)
∨ (d = CS ∧ n = 106) ∨ (d = EE ∧ n = 101) . (9.12)
COMPLETION This is called the completion of Equation (9.11). It expresses in FOL the idea that there are
at most four courses. To express in FOL the idea that there are at least four courses, we need
to write the completion of the equality predicate:
x=y ⇔ (x = CS ∧ y = CS ) ∨ (x = EE ∧ y = EE ) ∨ (x = 101 ∧ y = 101)
∨ (x = 102 ∧ y = 102) ∨ (x = 106 ∧ y = 106) .
The completion is useful for understanding database semantics, but for practical purposes, if
your problem can be described with database semantics, it is more efficient to reason with
Prolog or some other database semantics system, rather than translating into FOL and rea-
soning with a full FOL theorem prover.
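Database semantics is also trivial to operationalize, which is exactly why it is efficient; a sketch:

```python
# Under database semantics (unique names + closed world), the facts of
# Equation (9.11) pin down *exactly* four courses: distinct ground terms
# denote distinct objects, and anything not listed is false.

courses = {("CS", 101), ("CS", 102), ("CS", 106), ("EE", 101)}

def course(d, n):
    """Closed-world query: true iff the fact is in the database."""
    return (d, n) in courses
```

A membership test replaces theorem proving: `course("EE", 102)` is simply false, with no need for the completion axioms of Equation (9.12).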
Constraint logic programming (CLP) allows variables to be constrained rather than
bound. A CLP solution is the most specific set of constraints on the query variables that can
be derived from the knowledge base. For example, the solution to the triangle(3,4,Z)
query is the constraint 7 >= Z >= 1. Standard logic programs are just a special case of
CLP in which the solution constraints must be equality constraints—that is, bindings.
CLP systems incorporate various constraint-solving algorithms for the constraints al-
lowed in the language. For example, a system that allows linear inequalities on real-valued
variables might include a linear programming algorithm for solving those constraints. CLP
systems also adopt a much more flexible approach to solving standard logic programming
queries. For example, instead of depth-first, left-to-right backtracking, they might use any of
the more efficient algorithms discussed in Chapter 6, including heuristic conjunct ordering,
backjumping, cutset conditioning, and so on. CLP systems therefore combine elements of
constraint satisfaction algorithms, logic programming, and deductive databases.
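For the triangle example, the constraint a CLP system returns is just the interval implied by the triangle inequality; a sketch (triangle/3 itself is the hypothetical predicate from the text, and only this one inference step is modeled):

```python
# What a CLP system would answer for the query triangle(3,4,Z): not a
# single binding but the constraint on Z implied by the triangle
# inequality |a - b| <= z <= a + b.

def triangle_constraint(a, b):
    """Return the (lo, hi) interval of valid third-side lengths."""
    return abs(a - b), a + b

lo, hi = triangle_constraint(3, 4)   # the answer constraint 1 <= Z <= 7
```

A standard logic program could return only specific bindings for Z; the interval answer covers all of them at once.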
Several systems that allow the programmer more control over the search order for in-
ference have been defined. The MRS language (Genesereth and Smith, 1981; Russell, 1985)
allows the programmer to write metarules to determine which conjuncts are tried first. The
user could write a rule saying that the goal with the fewest variables should be tried first or
could write domain-specific rules for particular predicates.
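The "fewest variables first" metarule might be sketched as follows (the goal literals are illustrative, reusing the convention that uppercase strings are variables):

```python
# Sketch of a metarule: reorder the conjuncts of a goal so that the
# most-constrained conjunct (fewest unbound variables) is tried first.

def num_vars(conjunct):
    _, args = conjunct
    return sum(1 for a in args if isinstance(a, str) and a.isupper())

def order_conjuncts(conjuncts):
    return sorted(conjuncts, key=num_vars)

goal = [("Sells", ("X", "Y", "Z")),
        ("American", ("West",)),
        ("Weapon", ("Y",))]
ordered = order_conjuncts(goal)
```

The fully ground conjunct American(West) is tried first, and the three-variable Sells conjunct last, by which time backward chaining will typically have bound some of its variables.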
9.5 RESOLUTION
The last of our three families of logical systems is based on resolution. We saw on page 250
that propositional resolution using refutation is a complete inference procedure for proposi-
tional logic. In this section, we describe how to extend resolution to first-order logic.
The procedure for conversion to CNF is similar to the propositional case, which we saw
on page 253. The principal difference arises from the need to eliminate existential quantifiers.
We illustrate the procedure by translating the sentence “Everyone who loves all animals is
loved by someone,” or
∀ x [∀ y Animal(y) ⇒ Loves(x, y)] ⇒ [∃ y Loves(y, x)] .
The steps are as follows:
• Eliminate implications:
∀ x [¬∀ y ¬Animal(y) ∨ Loves(x, y)] ∨ [∃ y Loves(y, x)] .
• Move ¬ inwards: In addition to the usual rules for negated connectives, we need rules
for negated quantifiers. Thus, we have
¬∀ x p becomes ∃ x ¬p
¬∃ x p becomes ∀ x ¬p .
Our sentence goes through the following transformations:
∀ x [∃ y ¬(¬Animal(y) ∨ Loves(x, y))] ∨ [∃ y Loves(y, x)] .
∀ x [∃ y ¬¬Animal(y) ∧ ¬Loves(x, y)] ∨ [∃ y Loves(y, x)] .
∀ x [∃ y Animal (y) ∧ ¬Loves(x, y)] ∨ [∃ y Loves(y, x)] .
Notice how a universal quantifier (∀ y) in the premise of the implication has become
an existential quantifier. The sentence now reads “Either there is some animal that x
doesn’t love, or (if this is not the case) someone loves x.” Clearly, the meaning of the
original sentence has been preserved.
• Standardize variables: For sentences like (∃ x P (x)) ∨ (∃ x Q(x)) which use the same
variable name twice, change the name of one of the variables. This avoids confusion
later when we drop the quantifiers. Thus, we have
∀ x [∃ y Animal (y) ∧ ¬Loves(x, y)] ∨ [∃ z Loves(z, x)] .
• Skolemize: Skolemization is the process of removing existential quantifiers by
elimination. In the simple case, it is just like the Existential Instantiation rule of Section 9.1:
translate ∃ x P (x) into P (A), where A is a new constant. However, we can’t apply Ex-
istential Instantiation to our sentence above because it doesn’t match the pattern ∃ v α;
only parts of the sentence match the pattern. If we blindly apply the rule to the two
matching parts we get
∀ x [Animal (A) ∧ ¬Loves(x, A)] ∨ Loves(B, x) ,
which has the wrong meaning entirely: it says that everyone either fails to love a par-
ticular animal A or is loved by some particular entity B. In fact, our original sentence
allows each person to fail to love a different animal or to be loved by a different person.
Thus, we want the Skolem entities to depend on x and z:
∀ x [Animal (F (x)) ∧ ¬Loves(x, F (x))] ∨ Loves(G(z), x) .
Here F and G are Skolem functions. The general rule is that the arguments of the
Skolem function are all the universally quantified variables in whose scope the exis-
tential quantifier appears. As with Existential Instantiation, the Skolemized sentence is
satisfiable exactly when the original sentence is satisfiable.
• Drop universal quantifiers: At this point, all remaining variables must be universally
quantified. Moreover, the sentence is equivalent to one in which all the universal quan-
tifiers have been moved to the left. We can therefore drop the universal quantifiers:
[Animal (F (x)) ∧ ¬Loves(x, F (x))] ∨ Loves(G(z), x) .
• Distribute ∨ over ∧:
[Animal (F (x)) ∨ Loves(G(z), x)] ∧ [¬Loves(x, F (x)) ∨ Loves(G(z), x)] .
This step may also require flattening out nested conjunctions and disjunctions.
The sentence is now in CNF and consists of two clauses. It is quite unreadable. (It may
help to explain that the Skolem function F (x) refers to the animal potentially unloved by x,
whereas G(z) refers to someone who might love x.) Fortunately, humans seldom need look
at CNF sentences—the translation process is easily automated.
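Indeed, the "move ¬ inwards" step, for instance, is a small tree-rewriting exercise. Here is an illustrative Python sketch (not the book's algorithm) that reproduces the quantifier transformation shown above; formulas are nested tuples and anything that is not a tuple is an atom:

```python
# Move negation inward (negation normal form), including the two
# negated-quantifier rules from the text.
# Formula shapes: ("not", p), ("or", p, q), ("and", p, q),
# ("forall", v, p), ("exists", v, p); anything else is an atom.

def nnf(f):
    if isinstance(f, tuple) and f[0] == "not":
        g = f[1]
        if isinstance(g, tuple):
            if g[0] == "not":                       # double negation
                return nnf(g[1])
            if g[0] == "and":                       # De Morgan
                return ("or", nnf(("not", g[1])), nnf(("not", g[2])))
            if g[0] == "or":
                return ("and", nnf(("not", g[1])), nnf(("not", g[2])))
            if g[0] == "forall":                    # ¬∀x p  =>  ∃x ¬p
                return ("exists", g[1], nnf(("not", g[2])))
            if g[0] == "exists":                    # ¬∃x p  =>  ∀x ¬p
                return ("forall", g[1], nnf(("not", g[2])))
        return f                                     # negated atom stays
    if isinstance(f, tuple) and f[0] in ("and", "or"):
        return (f[0], nnf(f[1]), nnf(f[2]))
    if isinstance(f, tuple) and f[0] in ("forall", "exists"):
        return (f[0], f[1], nnf(f[2]))
    return f
```

Applied to ¬∀y (¬Animal(y) ∨ Loves(x, y)), it yields ∃y (Animal(y) ∧ ¬Loves(x, y)), matching the transformation in the text.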
Figure 9.11 A resolution proof that West is a criminal. At each step, the literals that unify
are in bold.
Figure 7.12, so we need not repeat it here. Instead, we give two example proofs. The first is
the crime example from Section 9.3. The sentences in CNF are
¬American(x) ∨ ¬Weapon(y) ∨ ¬Sells(x, y, z) ∨ ¬Hostile(z) ∨ Criminal (x)
¬Missile(x) ∨ ¬Owns(Nono, x) ∨ Sells(West, x, Nono)
¬Enemy(x, America) ∨ Hostile(x)
¬Missile(x) ∨ Weapon(x)
Owns(Nono, M1)
Missile(M1)
American(West)
Enemy(Nono, America) .
We also include the negated goal ¬Criminal(West). The resolution proof is shown in Figure 9.11.
Notice the structure: a single "spine" beginning with the goal clause, resolving against
clauses from the knowledge base until the empty clause is generated. This is characteristic
of resolution on Horn clause knowledge bases. In fact, the clauses along the main spine
correspond exactly to the consecutive values of the goals variable in the backward-chaining
algorithm of Figure 9.6. This is because we always choose to resolve with a clause whose
positive literal unifies with the leftmost literal of the "current" clause on the spine; this is
exactly what happens in backward chaining. Thus, backward chaining is just a special case
of resolution with a particular control strategy to decide which resolution to perform next.
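Each spine step is an ordinary binary resolution: remove a complementary pair of literals and union what remains. A ground sketch of the first step (in the real proof, Criminal(x) is first unified with Criminal(West); here the rule clause is shown pre-instantiated, and `resolve` only matches literals of the first clause against the second):

```python
# One ground resolution step. A literal is (sign, atom); a clause is a
# frozenset of literals.

def negate(lit):
    sign, atom = lit
    return (not sign, atom)

def resolve(c1, c2):
    """All resolvents obtained by cancelling a literal of c1 against
    its complement in c2."""
    return [(c1 - {lit}) | (c2 - {negate(lit)})
            for lit in c1 if negate(lit) in c2]

goal = frozenset({(False, "Criminal(West)")})        # the negated goal
rule = frozenset({(False, "American(West)"), (False, "Weapon(M1)"),
                  (False, "Sells(West,M1,Nono)"), (False, "Hostile(Nono)"),
                  (True,  "Criminal(West)")})
(spine,) = resolve(goal, rule)
```

The resolvent is exactly the next clause on the spine of Figure 9.11: the four remaining negative literals, with the Criminal pair cancelled.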
Our second example makes use of Skolemization and involves clauses that are not def-
inite clauses. This results in a somewhat more complex proof structure. In English, the
problem is as follows:
Everyone who loves all animals is loved by someone.
Anyone who kills an animal is loved by no one.
Jack loves all animals.
Either Jack or Curiosity killed the cat, who is named Tuna.
Did Curiosity kill the cat?
First, we express the original sentences, some background knowledge, and the negated goal
G in first-order logic:
A. ∀ x [∀ y Animal (y) ⇒ Loves(x, y)] ⇒ [∃ y Loves(y, x)]
B. ∀ x [∃ z Animal (z) ∧ Kills(x, z)] ⇒ [∀ y ¬Loves(y, x)]
C. ∀ x Animal(x) ⇒ Loves(Jack , x)
D. Kills(Jack , Tuna) ∨ Kills(Curiosity, Tuna)
E. Cat(Tuna)
F. ∀ x Cat(x) ⇒ Animal (x)
¬G. ¬Kills(Curiosity, Tuna)
Now we apply the conversion procedure to convert each sentence to CNF:
A1. Animal(F (x)) ∨ Loves(G(x), x)
A2. ¬Loves(x, F (x)) ∨ Loves(G(x), x)
B. ¬Loves(y, x) ∨ ¬Animal (z) ∨ ¬Kills(x, z)
C. ¬Animal(x) ∨ Loves(Jack , x)
D. Kills(Jack , Tuna) ∨ Kills(Curiosity, Tuna)
E. Cat(Tuna)
F. ¬Cat(x) ∨ Animal (x)
¬G. ¬Kills(Curiosity, Tuna)
The resolution proof that Curiosity killed the cat is given in Figure 9.12. In English, the proof
could be paraphrased as follows:
Suppose Curiosity did not kill Tuna. We know that either Jack or Curiosity did; thus
Jack must have. Now, Tuna is a cat and cats are animals, so Tuna is an animal. Because
anyone who kills an animal is loved by no one, we know that no one loves Jack. On the
other hand, Jack loves all animals, so someone loves him; so we have a contradiction.
Therefore, Curiosity killed the cat.
Figure 9.12 A resolution proof that Curiosity killed the cat. Notice the use of factoring
in the derivation of the clause Loves(G(Jack), Jack). Notice also, in the upper right, that the
unification of Loves(x, F(x)) and Loves(Jack, x) can only succeed after the variables have
been standardized apart.
The proof answers the question “Did Curiosity kill the cat?” but often we want to pose more
general questions, such as “Who killed the cat?” Resolution can do this, but it takes a little
more work to obtain the answer. The goal is ∃ w Kills(w, Tuna), which, when negated,
becomes ¬Kills(w, Tuna) in CNF. Repeating the proof in Figure 9.12 with the new negated
goal, we obtain a similar proof tree, but with the substitution {w/Curiosity } in one of the
steps. So, in this case, finding out who killed the cat is just a matter of keeping track of the
bindings for the query variables in the proof.
Unfortunately, resolution can produce nonconstructive proofs for existential goals.
For example, ¬Kills(w, Tuna) resolves with Kills(Jack , Tuna) ∨ Kills(Curiosity , Tuna)
to give Kills(Jack , Tuna), which resolves again with ¬Kills(w, Tuna) to yield the empty
clause. Notice that w has two different bindings in this proof; resolution is telling us that,
yes, someone killed Tuna—either Jack or Curiosity. This is no great surprise! One so-
lution is to restrict the allowed resolution steps so that the query variables can be bound
only once in a given proof; then we need to be able to backtrack over the possible bindings.
Another solution is to add a special answer literal to the negated goal, which becomes
¬Kills(w, Tuna) ∨ Answer(w). Now, the resolution process generates an answer
whenever a clause is generated containing just a single answer literal. For the proof in Fig-
ure 9.12, this is Answer(Curiosity ). The nonconstructive proof would generate the clause
Answer (Curiosity) ∨ Answer(Jack ), which does not constitute an answer.
The structure of the completeness proof for resolution: if S is unsatisfiable, then by
Herbrand's theorem some set S′ of ground instances is unsatisfiable; by the ground resolution
theorem, resolution can find a contradiction in S′; and by the lifting lemma, there is a
corresponding resolution proof for the contradiction using the clauses of S itself.
By slightly extending the language of first-order logic to allow for the mathemat-
ical induction schema in arithmetic, Kurt Gödel was able to show, in his incom-
pleteness theorem, that there are true arithmetic sentences that cannot be proved.
The proof of the incompleteness theorem is somewhat beyond the scope of
this book, occupying, as it does, at least 30 pages, but we can give a hint here. We
begin with the logical theory of numbers. In this theory, there is a single constant,
0, and a single function, S (the successor function). In the intended model, S(0)
denotes 1, S(S(0)) denotes 2, and so on; the language therefore has names for all
the natural numbers. The vocabulary also includes the function symbols +, ×, and
Expt (exponentiation) and the usual set of logical connectives and quantifiers. The
first step is to notice that the set of sentences that we can write in this language can
be enumerated. (Imagine defining an alphabetical order on the symbols and then
arranging, in alphabetical order, each of the sets of sentences of length 1, 2, and
so on.) We can then number each sentence α with a unique natural number #α
(the Gödel number). This is crucial: number theory contains a name for each of
its own sentences. Similarly, we can number each possible proof P with a Gödel
number G(P ), because a proof is simply a finite sequence of sentences.
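One concrete Gödel numbering (among many possible schemes) simply reads a sentence as a base-k numeral over a fixed alphabet; the toy alphabet below is invented for illustration:

```python
# A toy Gödel numbering: each symbol gets a nonzero digit, and a
# sentence is read as a base-BASE numeral. Distinct sentences get
# distinct numbers, and the encoding is mechanically invertible.

ALPHABET = "0S()+*=~&|>AEv,"           # a fixed, finite symbol set
BASE = len(ALPHABET) + 1               # digit 0 is unused, so codes are unique

def godel_number(sentence):
    n = 0
    for ch in sentence:
        n = n * BASE + ALPHABET.index(ch) + 1
    return n

def decode(n):
    out = []
    while n:
        n, d = divmod(n, BASE)
        out.append(ALPHABET[d - 1])
    return "".join(reversed(out))
```

Because the mapping is injective and computable in both directions, number theory really does contain a "name" (a number) for each of its own sentences, which is the crucial ingredient of the construction above.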
Now suppose we have a recursively enumerable set A of sentences that are
true statements about the natural numbers. Recalling that A can be named by a
given set of integers, we can imagine writing in our language a sentence α(j, A) of
the following sort:
∀ i i is not the Gödel number of a proof of the sentence whose Gödel
number is j, where the proof uses only premises in A.
Then let σ be the sentence α(#σ, A), that is, a sentence that states its own unprov-
ability from A. (That this sentence always exists is true but not entirely obvious.)
Now we make the following ingenious argument: Suppose that σ is provable
from A; then σ is false (because σ says it cannot be proved). But then we have a
false sentence that is provable from A, so A cannot consist of only true sentences—
a violation of our premise. Therefore, σ is not provable from A. But this is exactly
what σ itself claims; hence σ is a true sentence.
So, we have shown (barring 29½ pages) that for any set of true sentences of
number theory, and in particular any set of basic axioms, there are other true sen-
tences that cannot be proved from those axioms. This establishes, among other
things, that we can never prove all the theorems of mathematics within any given
system of axioms. Clearly, this was an important discovery for mathematics. Its
significance for AI has been widely debated, beginning with speculations by Gödel
himself. We take up the debate in Chapter 26.
proof using the clauses of S itself, which are not necessarily ground clauses. We start by
considering a single application of the resolution rule. Robinson stated this lemma:
Let C1 and C2 be two clauses with no shared variables, and let C1′ and C2′ be
ground instances of C1 and C2. If C′ is a resolvent of C1′ and C2′, then there exists
a clause C such that (1) C is a resolvent of C1 and C2 and (2) C′ is a ground
instance of C.
This is called a lifting lemma, because it lifts a proof step from ground clauses up to general
first-order clauses. In order to prove his basic lifting lemma, Robinson had to invent
unification and derive all of the properties of most general unifiers. Rather than repeat the proof
here, we simply illustrate the lemma:
C1 = ¬P(x, F(x, A)) ∨ ¬Q(x, A) ∨ R(x, B)
C2 = ¬N(G(y), z) ∨ P(H(y), z)
C1′ = ¬P(H(B), F(H(B), A)) ∨ ¬Q(H(B), A) ∨ R(H(B), B)
C2′ = ¬N(G(B), F(H(B), A)) ∨ P(H(B), F(H(B), A))
C′ = ¬N(G(B), F(H(B), A)) ∨ ¬Q(H(B), A) ∨ R(H(B), B)
C = ¬N(G(y), F(H(y), A)) ∨ ¬Q(H(y), A) ∨ R(H(y), B) .
We see that indeed C′ is a ground instance of C. In general, for C1′ and C2′ to have any
resolvents, they must be constructed by first applying to C1 and C2 the most general unifier
of a pair of complementary literals in C1 and C2. From the lifting lemma, it is easy to derive
a similar statement about any sequence of applications of the resolution rule:
For any clause C′ in the resolution closure of S′ there is a clause C in the resolution
closure of S such that C′ is a ground instance of C and the derivation of C′ is
the same length as the derivation of C.
From this fact, it follows that if the empty clause appears in the resolution closure of S′, it
must also appear in the resolution closure of S. This is because the empty clause cannot be a
ground instance of any other clause. To recap: we have shown that if S is unsatisfiable, then
there is a finite derivation of the empty clause using the resolution rule.
The lifting of theorem proving from ground clauses to first-order clauses provides a vast
increase in power. This increase comes from the fact that the first-order proof need instantiate
variables only as far as necessary for the proof, whereas the ground-clause methods were
required to examine a huge number of arbitrary instantiations.
9.5.5 Equality
None of the inference methods described so far in this chapter handle an assertion of the form
x = y. Three distinct approaches can be taken. The first approach is to axiomatize equality—
to write down sentences about the equality relation in the knowledge base. We need to say that
equality is reflexive, symmetric, and transitive, and we also have to say that we can substitute
equals for equals in any predicate or function. So we need three basic axioms, and then one
we have θ = UNIFY(F(A, y), F(x, B)) = {x/A, y/B}, and we can conclude by paramodulation
the sentence
P (B, A) ∨ Q(A) ∨ R(B) .
Paramodulation yields a complete inference procedure for first-order logic with equality.
A third approach handles equality reasoning entirely within an extended unification
algorithm. That is, terms are unifiable if they are provably equal under some substitution,
where “provably” allows for equality reasoning. For example, the terms 1 + 2 and 2 + 1
normally are not unifiable, but a unification algorithm that knows that x + y = y + x could
unify them with the empty substitution. Equational unification of this kind can be done with
efficient algorithms designed for the particular axioms used (commutativity, associativity, and
so on) rather than through explicit inference with those axioms. Theorem provers using this
technique are closely related to the CLP systems described in Section 9.4.
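For a single commutative operator, equational unification of ground terms reduces to trying the arguments in either order; a sketch (ground terms only, to keep the example tiny, with `("+", a, b)` representing a + b):

```python
# Equational matching under commutativity of +: two terms are equal
# modulo the axiom x + y = y + x if they match directly or after
# swapping the arguments of +.

def eq_unify(a, b):
    if a == b:
        return True
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] == "+"):
        return ((eq_unify(a[1], b[1]) and eq_unify(a[2], b[2])) or
                (eq_unify(a[1], b[2]) and eq_unify(a[2], b[1])))
    return False
```

So 1 + 2 and 2 + 1 unify (with the empty substitution), as in the text; a practical equational unifier uses specialized algorithms rather than this brute-force swap, which blows up for associativity and larger terms.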
original knowledge base is consistent. (After all, if it is not consistent, then the fact that the
query follows from it is vacuous.) The set-of-support strategy has the additional advantage of
generating goal-directed proof trees that are often easy for humans to understand.
Input resolution: In this strategy, every resolution combines one of the input sentences (from
the KB or the query) with some other sentence. The proof in Figure 9.11 on page 348 uses
only input resolutions and has the characteristic shape of a single “spine” with single sen-
tences combining onto the spine. Clearly, the space of proof trees of this shape is smaller
than the space of all proof graphs. In Horn knowledge bases, Modus Ponens is a kind of
input resolution strategy, because it combines an implication from the original KB with some
other sentences. Thus, it is no surprise that input resolution is complete for knowledge bases
that are in Horn form, but incomplete in the general case. The linear resolution strategy is a
slight generalization that allows P and Q to be resolved together either if P is in the original
KB or if P is an ancestor of Q in the proof tree. Linear resolution is complete.
Subsumption: The subsumption method eliminates all sentences that are subsumed by (that
is, more specific than) an existing sentence in the KB. For example, if P (x) is in the KB, then
there is no sense in adding P (A) and even less sense in adding P (A) ∨ Q(B). Subsumption
helps keep the KB small and thus helps keep the search space small.
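A subsumption check is a matching problem: clause c1 subsumes c2 if some substitution of c1's variables makes every literal of c1 appear in c2. An illustrative sketch (literals are (predicate, args) pairs and uppercase strings are variables; real provers use heavily optimized indexes for this test):

```python
# Subsumption check: does c1, under some substitution, become a subset
# of c2? Matching here is one-way (only c1's variables get bound).

def match(lit, target, subst):
    (p, args), (q, targs) = lit, target
    if p != q or len(args) != len(targs):
        return None
    s = dict(subst)
    for a, t in zip(args, targs):
        if isinstance(a, str) and a.isupper():      # variable in c1
            if s.setdefault(a, t) != t:             # must bind consistently
                return None
        elif a != t:                                # constants must agree
            return None
    return s

def subsumes(c1, c2):
    def rec(lits, subst):
        if not lits:
            return True
        return any((s := match(lits[0], t, subst)) is not None
                   and rec(lits[1:], s)
                   for t in c2)
    return rec(list(c1), {})
```

For example, P(X) subsumes P(a) ∨ Q(b) via {X/a}, so the longer clause adds nothing and can be discarded.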
In which we see how an agent can take advantage of the structure of a problem to
construct complex plans of action.
We have defined AI as the study of rational action, which means that planning—devising a
plan of action to achieve one’s goals—is a critical part of AI. We have seen two examples
of planning agents so far: the search-based problem-solving agent of Chapter 3 and the hy-
brid logical agent of Chapter 7. In this chapter we introduce a representation for planning
problems that scales up to problems that could not be handled by those earlier approaches.
Section 10.1 develops an expressive yet carefully constrained language for representing
planning problems. Section 10.2 shows how forward and backward search algorithms can
take advantage of this representation, primarily through accurate heuristics that can be derived
automatically from the structure of the representation. (This is analogous to the way in which
effective domain-independent heuristics were constructed for constraint satisfaction problems
in Chapter 6.) Section 10.3 shows how a data structure called the planning graph can make the
search for a plan more efficient. We then describe a few of the other approaches to planning,
and conclude by comparing the various approaches.
This chapter covers fully observable, deterministic, static environments with single
agents. Chapters 11 and 17 cover partially observable, stochastic, dynamic environments
with multiple agents.
The problem-solving agent of Chapter 3 can find sequences of actions that result in a goal
state. But it deals with atomic representations of states and thus needs good domain-specific
heuristics to perform well. The hybrid propositional logical agent of Chapter 7 can find plans
without domain-specific heuristics because it uses domain-independent heuristics based on
the logical structure of the problem. But it relies on ground (variable-free) propositional
inference, which means that it may be swamped when there are many actions and states. For
example, in the wumpus world, the simple action of moving a step forward had to be repeated
for all four agent orientations, T time steps, and n² current locations.
Section 10.1. Definition of Classical Planning 367
1 PDDL was derived from the original STRIPS planning language (Fikes and Nilsson, 1971), which is slightly more restricted than PDDL: STRIPS preconditions and goals cannot contain negative literals.
The schema consists of the action name, a list of all the variables used in the schema, a precondition, and an effect. Although we haven’t said yet how the action schema converts into logical sentences, think of the variables as being universally quantified. We are free to choose whatever values we want to instantiate the variables. For example, here is one ground action that results from substituting values for all the variables:
Action(Fly(P1, SFO, JFK),
PRECOND: At(P1, SFO) ∧ Plane(P1) ∧ Airport(SFO) ∧ Airport(JFK)
EFFECT: ¬At(P1, SFO) ∧ At(P1, JFK))
The precondition and effect of an action are each conjunctions of literals (positive or negated
atomic sentences). The precondition defines the states in which the action can be executed,
and the effect defines the result of executing the action. An action a can be executed in state
s if s entails the precondition of a. Entailment can also be expressed with the set semantics:
s |= q iff every positive literal in q is in s and every negated literal in q is not. In formal
notation we say
(a ∈ ACTIONS(s)) ⇔ s |= PRECOND(a),
where any variables in a are universally quantified. For example,
∀ p, from, to  (Fly(p, from, to) ∈ ACTIONS(s)) ⇔
s |= (At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to))
We say that action a is applicable in state s if the preconditions are satisfied by s. When
an action schema a contains variables, it may have multiple applicable instantiations. For
example, with the initial state defined in Figure 10.1, the Fly action can be instantiated as
Fly(P1 , SFO , JFK ) or as Fly(P2 , JFK , SFO), both of which are applicable in the initial
state. If an action a has v variables, then, in a domain with k unique names of objects, it takes
O(k^v) time in the worst case to find the applicable ground actions.
Sometimes we want to propositionalize a PDDL problem—replace each action schema
with a set of ground actions and then use a propositional solver such as SATPLAN to find a
solution. However, this is impractical when v and k are large.
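To see why propositionalizing blows up, here is a minimal sketch (the helper name and encoding are my own) that enumerates the k^v candidate variable bindings of a schema:

```python
from itertools import product

def ground_bindings(schema_vars, objects):
    """Enumerate all variable bindings for an action schema: with v
    variables and k objects there are k**v combinations, which is why
    propositionalizing is impractical for large v and k."""
    return [dict(zip(schema_vars, combo))
            for combo in product(objects, repeat=len(schema_vars))]

# Fly(p, from, to) over 3 object names -> 3**3 = 27 candidate ground actions
bindings = ground_bindings(["p", "from", "to"], ["P1", "SFO", "JFK"])
print(len(bindings))  # 27
```

Most of these bindings would be filtered out by the type preconditions (Plane, Airport), but the enumeration itself is the exponential step.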
The result of executing action a in state s is defined as a state s′ which is represented
by the set of fluents formed by starting with s, removing the fluents that appear as negative
literals in the action’s effects (what we call the delete list or DEL(a)), and adding the fluents
that are positive literals in the action’s effects (what we call the add list or ADD(a)):
RESULT(s, a) = (s − DEL(a)) ∪ ADD(a) .    (10.1)
For example, with the action Fly(P1 , SFO , JFK ), we would remove At(P1 , SFO) and add
At(P1 , JFK ). It is a requirement of action schemas that any variable in the effect must also
appear in the precondition. That way, when the precondition is matched against the state s,
all the variables will be bound, and R ESULT (s, a) will therefore have only ground atoms. In
other words, ground states are closed under the R ESULT operation.
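Equation (10.1) translates directly into set operations; the dictionary encoding of an action's add and delete lists below is an assumption for illustration:

```python
def result(state, action):
    """Equation (10.1): RESULT(s, a) = (s - DEL(a)) | ADD(a).
    A state is a frozenset of ground atoms; an action carries its
    add list and delete list as sets."""
    return frozenset((state - action["del"]) | action["add"])

# The Fly(P1, SFO, JFK) example: remove At(P1,SFO), add At(P1,JFK)
fly = {"add": {"At(P1,JFK)"}, "del": {"At(P1,SFO)"}}
s = frozenset({"At(P1,SFO)", "Plane(P1)"})
print(sorted(result(s, fly)))  # ['At(P1,JFK)', 'Plane(P1)']
```

Because every effect variable is bound by the precondition match, the resulting set contains only ground atoms, as the text notes.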
Also note that the fluents do not explicitly refer to time, as they did in Chapter 7. There
we needed superscripts for time, and successor-state axioms of the form
F^{t+1} ⇔ ActionCausesF^t ∨ (F^t ∧ ¬ActionCausesNotF^t) .
In PDDL the times and states are implicit in the action schemas: the precondition always
refers to time t and the effect to time t + 1.
A set of action schemas serves as a definition of a planning domain. A specific problem
within the domain is defined with the addition of an initial state and a goal. The initial
state is a conjunction of ground atoms. (As with all states, the closed-world assumption is
used, which means that any atoms that are not mentioned are false.) The goal is just like a
precondition: a conjunction of literals (positive or negative) that may contain variables, such
as At(p, SFO ) ∧ Plane(p). Any variables are treated as existentially quantified, so this goal
is to have any plane at SFO. The problem is solved when we can find a sequence of actions
that ends in a state s that entails the goal. For example, the state Rich ∧ Famous ∧ Miserable
entails the goal Rich ∧ Famous, and the state Plane(Plane1) ∧ At(Plane1, SFO) entails
the goal At(p, SFO) ∧ Plane(p).
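For the ground case, the set-semantics entailment test can be sketched as follows (the split of a goal into positive and negative literal sets is my own encoding; handling existential variables would additionally require unification):

```python
def entails(state, goal_pos, goal_neg):
    """Set-semantics entailment for ground goals: s |= g iff every
    positive literal of g is in s and no negated literal of g is."""
    return goal_pos <= state and not (goal_neg & state)

s = {"Rich", "Famous", "Miserable"}
print(entails(s, {"Rich", "Famous"}, set()))  # True, as in the text
```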
Now we have defined planning as a search problem: we have an initial state, an ACTIONS
function, a R ESULT function, and a goal test. We’ll look at some example problems before
investigating efficient search algorithms.
Finally, there is the problem of spurious actions such as Fly(P1 , JFK , JFK ), which should
be a no-op, but which has contradictory effects (according to the definition, the effect would
include At(P1 , JFK ) ∧ ¬At(P1 , JFK )). It is common to ignore such problems, because
they seldom cause incorrect plans to be produced. The correct approach is to add inequality
preconditions saying that the from and to airports must be different; see another example of
this in Figure 10.3.
Figure 10.3 A planning problem in the blocks world: building a three-block tower. One
solution is the sequence [MoveToTable (C, A), Move(B, Table, C), Move(A, Table, B)].
[Figure 10.3 diagram: initial state with C on A and B on the table; goal state with the tower A on B on C.]
The goal is typically to build one or more stacks of blocks, specified in terms of what blocks are on top of what other blocks. For example, a goal might be to get block A on B and block B on C
(see Figure 10.4).
We use On(b, x) to indicate that block b is on x, where x is either another block or the
table. The action for moving block b from the top of x to the top of y will be Move(b, x, y).
Now, one of the preconditions on moving b is that no other block be on it. In first-order logic,
this would be ¬∃ x On(x, b) or, alternatively, ∀ x ¬On(x, b). Basic PDDL does not allow
quantifiers, so instead we introduce a predicate Clear (x) that is true when nothing is on x.
(The complete problem description is in Figure 10.3.)
The action Move moves a block b from x to y if both b and y are clear. After the move
is made, b is still clear but y is not. A first attempt at the Move schema is
Action(Move(b, x, y),
PRECOND: On(b, x) ∧ Clear(b) ∧ Clear(y),
EFFECT: On(b, y) ∧ Clear(x) ∧ ¬On(b, x) ∧ ¬Clear(y)) .
Unfortunately, this does not maintain Clear properly when x or y is the table. When x is the
Table, this action has the effect Clear (Table), but the table should not become clear; and
when y = Table, it has the precondition Clear(Table), but the table does not have to be clear
for us to move a block onto it. To fix this, we do two things. First, we introduce another
action to move a block b from x to the table:
Action(MoveToTable(b, x),
PRECOND: On(b, x) ∧ Clear(b),
EFFECT: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x)) .
Second, we take the interpretation of Clear (x) to be “there is a clear space on x to hold a
block.” Under this interpretation, Clear (Table) will always be true. The only problem is that
nothing prevents the planner from using Move(b, x, Table) instead of MoveToTable(b, x).
We could live with this problem—it will lead to a larger-than-necessary search space, but will
not lead to incorrect answers—or we could introduce the predicate Block and add Block (b) ∧
Block (y) to the precondition of Move.
In this subsection we consider the theoretical complexity of planning and distinguish two
decision problems. PlanSAT is the question of whether there exists any plan that solves a
planning problem. Bounded PlanSAT asks whether there is a solution of length k or less;
this can be used to find an optimal plan.
The first result is that both decision problems are decidable for classical planning. The
proof follows from the fact that the number of states is finite. But if we add function symbols
to the language, then the number of states becomes infinite, and PlanSAT becomes only
semidecidable: an algorithm exists that will terminate with the correct answer for any solvable
problem, but may not terminate on unsolvable problems. The Bounded PlanSAT problem
remains decidable even in the presence of function symbols. For proofs of the assertions in
this section, see Ghallab et al. (2004).
Both PlanSAT and Bounded PlanSAT are in the complexity class PSPACE, a class that
is larger (and hence more difficult) than NP and refers to problems that can be solved by a
deterministic Turing machine with a polynomial amount of space. Even if we make some
rather severe restrictions, the problems remain quite difficult. For example, if we disallow
negative effects, both problems are still NP-hard. However, if we also disallow negative
preconditions, PlanSAT is in the class P.
These worst-case results may seem discouraging. We can take solace in the fact that
agents are usually not asked to find plans for arbitrary worst-case problem instances, but
rather are asked for plans in specific domains (such as blocks-world problems with n blocks),
which can be much easier than the theoretical worst case. For many domains (including the
blocks world and the air cargo world), Bounded PlanSAT is NP-complete while PlanSAT is
in P; in other words, optimal planning is usually hard, but sub-optimal planning is sometimes
easy. To do well on easier-than-worst-case problems, we will need good search heuristics.
That’s the true advantage of the classical planning formalism: it has facilitated the develop-
ment of very accurate domain-independent heuristics, whereas systems based on successor-
state axioms in first-order logic have had less success in coming up with good heuristics.
Section 10.2. Algorithms for Planning as State-Space Search 373
Now we turn our attention to planning algorithms. We saw how the description of a planning
problem defines a search problem: we can search from the initial state through the space
of states, looking for a goal. One of the nice advantages of the declarative representation of
action schemas is that we can also search backward from the goal, looking for the initial state.
Figure 10.5 compares forward and backward searches.
[Figure 10.5 diagram: (a) forward search applies Fly(P1, A, B) and Fly(P2, A, B) to the state {At(P1, A), At(P2, A)}; (b) backward search regresses the goal {At(P1, B), At(P2, B)} through the same actions.]
Figure 10.5 Two approaches to searching for a plan. (a) Forward (progression) search
through the space of states, starting in the initial state and using the problem’s actions to
search forward for a member of the set of goal states. (b) Backward (regression) search
through sets of relevant states, starting at the set of states representing the goal and using the
inverse of the actions to search backward for the initial state.
374 Chapter 10. Classical Planning
action schema Buy(isbn) with effect Own(isbn). ISBNs are 10 digits, so this action schema
represents 10 billion ground actions. An uninformed forward-search algorithm would have
to start enumerating these 10 billion actions to find one that leads to the goal.
Second, planning problems often have large state spaces. Consider an air cargo problem
with 10 airports, where each airport has 5 planes and 20 pieces of cargo. The goal is to move
all the cargo at airport A to airport B. There is a simple solution to the problem: load the 20
pieces of cargo into one of the planes at A, fly the plane to B, and unload the cargo. Finding
the solution can be difficult because the average branching factor is huge: each of the 50
planes can fly to 9 other airports, and each of the 200 packages can be either unloaded (if
it is loaded) or loaded into any plane at its airport (if it is unloaded). So in any state there
is a minimum of 450 actions (when all the packages are at airports with no planes) and a
maximum of 10,450 (when all packages and planes are at the same airport). On average, let’s
say there are about 2000 possible actions per state, so the search graph up to the depth of the
obvious solution has about 2000^41 nodes.
Clearly, even this relatively small problem instance is hopeless without an accurate
heuristic. Although many real-world applications of planning have relied on domain-specific
heuristics, it turns out (as we see in Section 10.2.3) that strong domain-independent heuristics
can be derived automatically; that is what makes forward search feasible.
To get the full advantage of backward search, we need to deal with partially uninstantiated actions and states, not just ground ones. For example, suppose the goal is to deliver a specific piece of cargo to SFO: At(C2, SFO). That suggests the action Unload(C2, p′, SFO):
Action(Unload(C2, p′, SFO),
PRECOND: In(C2, p′) ∧ At(p′, SFO) ∧ Cargo(C2) ∧ Plane(p′) ∧ Airport(SFO)
EFFECT: At(C2, SFO) ∧ ¬In(C2, p′)) .
(Note that we have standardized variable names (changing p to p′ in this case) so that there will be no confusion between variable names if we happen to use the same action schema twice in a plan. The same approach was used in Chapter 9 for first-order logical inference.) This represents unloading the package from an unspecified plane at SFO; any plane will do, but we need not say which one now. We can take advantage of the power of first-order representations: a single description summarizes the possibility of using any of the planes by implicitly quantifying over p′. The regressed state description is
g′ = In(C2, p′) ∧ At(p′, SFO) ∧ Cargo(C2) ∧ Plane(p′) ∧ Airport(SFO) .
The final issue is deciding which actions are candidates to regress over. In the forward direc-
tion we chose actions that were applicable—those actions that could be the next step in the
plan. In backward search we want actions that are relevant—those actions that could be the
last step in a plan leading up to the current goal state.
For an action to be relevant to a goal it obviously must contribute to the goal: at least
one of the action’s effects (either positive or negative) must unify with an element of the goal.
What is less obvious is that the action must not have any effect (positive or negative) that
negates an element of the goal. Now, if the goal is A ∧ B ∧ C and an action has the effect
A∧B ∧¬C then there is a colloquial sense in which that action is very relevant to the goal—it
gets us two-thirds of the way there. But it is not relevant in the technical sense defined here,
because this action could not be the final step of a solution—we would always need at least
one more step to achieve C.
Given the goal At(C2, SFO), several instantiations of Unload are relevant: we could
choose any specific plane to unload from, or we could leave the plane unspecified by using
the action Unload(C2, p′, SFO). We can reduce the branching factor without ruling out any
solutions by always using the action formed by substituting the most general unifier into the
(standardized) action schema.
As another example, consider the goal Own(0136042597), given an initial state with
10 billion ISBNs, and the single action schema
A = Action(Buy(i), PRECOND: ISBN(i), EFFECT: Own(i)) .
As we mentioned before, forward search without a heuristic would have to start enumerating the 10 billion ground Buy actions. But with backward search, we would unify the goal Own(0136042597) with the (standardized) effect Own(i′), yielding the substitution θ = {i′/0136042597}. Then we would regress over the action SUBST(θ, A′) to yield the predecessor state description ISBN(0136042597). This is part of, and thus entailed by, the initial state, so we are done.
We can make this more formal. Assume a goal description g that contains a goal
literal gi and an action schema A that is standardized to produce A′. If A′ has an effect literal
e′j where UNIFY(gi, e′j) = θ, and if we define a′ = SUBST(θ, A′) and there is no effect
in a′ that is the negation of a literal in g, then a′ is a relevant action toward g.
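For ground goals and actions, the regression step itself can be sketched as g′ = (g − ADD(a)) ∪ PRECOND(a); the set encoding below is illustrative, and the relevance check (no effect negating a goal literal) is assumed to have been done already:

```python
def regress(goal, action):
    """Regress a ground goal through a ground, relevant action:
    g' = (g - ADD(a)) | PRECOND(a)."""
    return (goal - action["add"]) | action["precond"]

# The Unload example: regressing At(C2,SFO) yields the preconditions
unload = {"precond": {"In(C2,P)", "At(P,SFO)", "Cargo(C2)",
                      "Plane(P)", "Airport(SFO)"},
          "add": {"At(C2,SFO)"}}
g = {"At(C2,SFO)"}
print(regress(g, unload) == unload["precond"])  # True
```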
Backward search keeps the branching factor lower than forward search, for most prob-
lem domains. However, the fact that backward search uses state sets rather than individual
states makes it harder to come up with good heuristics. That is the main reason why the
majority of current systems favor forward search.
Figure 10.6 Two state spaces from planning problems with the ignore-delete-lists heuris-
tic. The height above the bottom plane is the heuristic score of a state; states on the bottom
plane are goals. There are no local minima, so search for the goal is straightforward. From
Hoffmann (2005).
A key idea in defining heuristics is decomposition: dividing a problem into parts, solving
each part independently, and then combining the parts. The subgoal independence assumption
is that the cost of solving a conjunction of subgoals is approximated by the sum
of the costs of solving each subgoal independently. The subgoal independence assumption
can be optimistic or pessimistic. It is optimistic when there are negative interactions between
the subplans for each subgoal—for example, when an action in one subplan deletes a goal
achieved by another subplan. It is pessimistic, and therefore inadmissible, when subplans
contain redundant actions—for instance, two actions that could be replaced by a single action
in the merged plan.
Suppose the goal is a set of fluents G, which we divide into disjoint subsets G1 , . . . , Gn .
We then find plans P1 , . . . , Pn that solve the respective subgoals. What is an estimate of the
cost of the plan for achieving all of G? We can think of each COST(Pi) as a heuristic estimate,
and we know that if we combine estimates by taking their maximum value, we always get an
admissible heuristic. So maxi COST(Pi) is admissible, and sometimes it is exactly correct:
it could be that P1 serendipitously achieves all the Gi. But in most cases the
estimate is too low. Could we sum the costs instead? For many problems that is a reasonable
estimate, but it is not admissible. The best case is when we can determine that Gi and Gj are
independent. If the effects of Pi leave all the preconditions and goals of Pj unchanged, then
the estimate COST(Pi) + COST(Pj) is admissible, and more accurate than the max estimate.
We show in Section 10.3.1 that planning graphs can help provide better heuristic estimates.
It is clear that there is great potential for cutting down the search space by forming ab-
stractions. The trick is choosing the right abstractions and using them in a way that makes
the total cost—defining an abstraction, doing an abstract search, and mapping the abstraction
back to the original problem—less than the cost of solving the original problem. The tech-
Section 10.3. Planning Graphs 379
niques of pattern databases from Section 3.6.3 can be useful, because the cost of creating
the pattern database can be amortized over multiple problem instances.
An example of a system that makes use of effective heuristics is FF, or FASTFORWARD
(Hoffmann, 2005), a forward state-space searcher that uses the ignore-delete-lists heuristic,
estimating the heuristic with the help of a planning graph (see Section 10.3). FF then uses
hill-climbing search (modified to keep track of the plan) with the heuristic to find a solution.
When it hits a plateau or local maximum—when no action leads to a state with better heuristic
score—then FF uses iterative deepening search until it finds a state that is better, or it gives
up and restarts hill-climbing.
All of the heuristics we have suggested can suffer from inaccuracies. This section shows
how a special data structure called a planning graph can be used to give better heuristic
estimates. These heuristics can be applied to any of the search techniques we have seen so
far. Alternatively, we can search for a solution over the space formed by the planning graph,
using an algorithm called GRAPHPLAN.
A planning problem asks if we can reach a goal state from the initial state. Suppose we
are given a tree of all possible actions from the initial state to successor states, and their suc-
cessors, and so on. If we indexed this tree appropriately, we could answer the planning ques-
tion “can we reach state G from state S0 ” immediately, just by looking it up. Of course, the
tree is of exponential size, so this approach is impractical. A planning graph is a polynomial-size
approximation to this tree that can be constructed quickly. The planning graph can’t
answer definitively whether G is reachable from S0 , but it can estimate how many steps it
takes to reach G. The estimate is always correct when it reports the goal is not reachable, and
it never overestimates the number of steps, so it is an admissible heuristic.
A planning graph is a directed graph organized into levels: first a level S0 for the initial
state, consisting of nodes representing each fluent that holds in S0 ; then a level A0 consisting
of nodes for each ground action that might be applicable in S0 ; then alternating levels Si
followed by Ai ; until we reach a termination condition (to be discussed later).
Roughly speaking, Si contains all the literals that could hold at time i, depending on
the actions executed at preceding time steps. If it is possible that either P or ¬P could hold,
then both will be represented in Si . Also roughly speaking, Ai contains all the actions that
could have their preconditions satisfied at time i. We say “roughly speaking” because the
planning graph records only a restricted subset of the possible negative interactions among
actions; therefore, a literal might show up at level Sj when actually it could not be true until
a later level, if at all. (A literal will never show up too late.) Despite the possible error, the
level j at which a literal first appears is a good estimate of how difficult it is to achieve the
literal from the initial state.
Planning graphs work only for propositional planning problems—ones with no vari-
ables. As we mentioned on page 368, it is straightforward to propositionalize a set of ac-
Init(Have(Cake))
Goal(Have(Cake) ∧ Eaten(Cake))
Action(Eat(Cake)
PRECOND: Have(Cake)
EFFECT: ¬Have(Cake) ∧ Eaten(Cake))
Action(Bake(Cake)
PRECOND: ¬Have(Cake)
EFFECT: Have(Cake))
Figure 10.7 The “have cake and eat cake too” problem.
[Figure 10.8 diagram: state level S0 contains Have(Cake) and ¬Eaten(Cake); action level A0 contains Eat(Cake) plus persistence actions; S1 contains Have(Cake), ¬Have(Cake), Eaten(Cake), ¬Eaten(Cake); A1 adds Bake(Cake); S2 contains the same literals as S1.]
Figure 10.8 The planning graph for the “have cake and eat cake too” problem up to level
S2 . Rectangles indicate actions (small squares indicate persistence actions), and straight
lines indicate preconditions and effects. Mutex links are shown as curved gray lines. Not all
mutex links are shown, because the graph would be too cluttered. In general, if two literals
are mutex at Si , then the persistence actions for those literals will be mutex at Ai and we
need not draw that mutex link.
tion schemas. Despite the resulting increase in the size of the problem description, planning
graphs have proved to be effective tools for solving hard planning problems.
Figure 10.7 shows a simple planning problem, and Figure 10.8 shows its planning
graph. Each action at level Ai is connected to its preconditions at Si and its effects at Si+1 .
So a literal appears because an action caused it, but we also want to say that a literal can
persist if no action negates it. This is represented by a persistence action (sometimes called
a no-op). For every literal C, we add to the problem a persistence action with precondition C
and effect C. Level A0 in Figure 10.8 shows one “real” action, Eat(Cake), along with two
persistence actions drawn as small square boxes.
Level A0 contains all the actions that could occur in state S0 , but just as important it
records conflicts between actions that would prevent them from occurring together. The gray
lines in Figure 10.8 indicate mutual exclusion (or mutex) links. For example, Eat(Cake) is
mutually exclusive with the persistence of either Have(Cake) or ¬Eaten(Cake). We shall
see shortly how mutex links are computed.
Level S1 contains all the literals that could result from picking any subset of the actions
in A0 , as well as mutex links (gray lines) indicating literals that could not appear together,
regardless of the choice of actions. For example, Have(Cake) and Eaten(Cake) are mutex:
depending on the choice of actions in A0 , either, but not both, could be the result. In other
words, S1 represents a belief state: a set of possible states. The members of this set are all
subsets of the literals such that there is no mutex link between any members of the subset.
We continue in this way, alternating between state level Si and action level Ai until we
reach a point where two consecutive levels are identical. At this point, we say that the graph
has leveled off. The graph in Figure 10.8 levels off at S2.
What we end up with is a structure where every Ai level contains all the actions that are
applicable in Si , along with constraints saying that two actions cannot both be executed at the
same level. Every Si level contains all the literals that could result from any possible choice
of actions in Ai−1 , along with constraints saying which pairs of literals are not possible.
It is important to note that the process of constructing the planning graph does not require
choosing among actions, which would entail combinatorial search. Instead, it just records the
impossibility of certain choices using mutex links.
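One level of this construction might be sketched as follows (mutex computation omitted; the dictionary encoding of actions and the `~` prefix for negated literals are my own conventions, not the book's):

```python
def next_levels(state_level, actions):
    """Build A_i (applicable actions plus persistence no-ops) and S_{i+1}
    from state level S_i, without choosing among the actions."""
    applicable = [a for a in actions if a["precond"] <= state_level]
    applicable += [{"name": f"Persist({l})", "precond": {l}, "effects": {l}}
                   for l in state_level]            # one no-op per literal
    next_state = set().union(*(a["effects"] for a in applicable))
    return applicable, next_state

# The cake domain: Bake is not applicable at S0, so it first appears at A1
actions = [{"name": "Eat", "precond": {"Have"}, "effects": {"~Have", "Eaten"}},
           {"name": "Bake", "precond": {"~Have"}, "effects": {"Have"}}]
a0, s1 = next_levels({"Have", "~Eaten"}, actions)
print(sorted(s1))  # ['Eaten', 'Have', '~Eaten', '~Have']
```

Note that S1 contains both Eaten and ¬Eaten: the graph records every literal that *could* hold, and relies on mutex links (not shown here) to rule out impossible combinations.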
We now define mutex links for both actions and literals. A mutex relation holds between
two actions at a given level if any of the following three conditions holds:
• Inconsistent effects: one action negates an effect of the other. For example, Eat(Cake)
and the persistence of Have(Cake) have inconsistent effects because they disagree on
the effect Have(Cake).
• Interference: one of the effects of one action is the negation of a precondition of the
other. For example Eat(Cake) interferes with the persistence of Have(Cake) by negat-
ing its precondition.
• Competing needs: one of the preconditions of one action is mutually exclusive with a
precondition of the other. For example, Bake(Cake) and Eat(Cake) are mutex because
they compete on the value of the Have(Cake) precondition.
A mutex relation holds between two literals at the same level if one is the negation of the other
or if each possible pair of actions that could achieve the two literals is mutually exclusive.
This condition is called inconsistent support. For example, Have(Cake) and Eaten(Cake)
are mutex in S1 because the only way of achieving Have(Cake), the persistence action, is
mutex with the only way of achieving Eaten(Cake), namely Eat(Cake). In S2 the two
literals are not mutex, because there are new ways of achieving them, such as Bake(Cake)
and the persistence of Eaten(Cake), that are not mutex.
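The three action-mutex conditions can be sketched directly as set tests; the `~` negation prefix, the dictionary encoding, and passing literal mutexes as a set of pairs are illustrative assumptions:

```python
def neg(lit):
    """Negate a literal encoded with a '~' prefix."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def actions_mutex(a1, a2, literal_mutexes):
    """The three action-mutex conditions from the text."""
    effects1, effects2 = a1["effects"], a2["effects"]
    pre1, pre2 = a1["precond"], a2["precond"]
    # Inconsistent effects: one action negates an effect of the other
    inconsistent_effects = any(neg(e) in effects2 for e in effects1)
    # Interference: an effect of one negates a precondition of the other
    interference = (any(neg(e) in pre2 for e in effects1) or
                    any(neg(e) in pre1 for e in effects2))
    # Competing needs: mutually exclusive preconditions
    competing_needs = any((p, q) in literal_mutexes or (q, p) in literal_mutexes
                          for p in pre1 for q in pre2)
    return inconsistent_effects or interference or competing_needs

eat = {"precond": {"Have"}, "effects": {"~Have", "Eaten"}}
persist_have = {"precond": {"Have"}, "effects": {"Have"}}
print(actions_mutex(eat, persist_have, set()))  # True (inconsistent effects)
```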
A planning graph is polynomial in the size of the planning problem. For a planning
problem with l literals and a actions, each Si has no more than l nodes and l² mutex links,
and each Ai has no more than a + l nodes (including the no-ops), (a + l)² mutex links, and
2(al + l) precondition and effect links. Thus, an entire graph with n levels has a size of
O(n(a + l)²). The time to build the graph has the same complexity.
The level at which a goal literal gi first appears in the planning graph is called the
level cost of gi. In Figure 10.8, Have(Cake) has level cost 0 and Eaten(Cake) has level cost
1. It is easy to show (Exercise 10.10) that these estimates are admissible for the individual
goals. The estimate might not always be accurate, however, because planning graphs allow
several actions at each level, whereas the heuristic counts just the level and not the number
of actions. For this reason, it is common to use a serial planning graph for computing
heuristics. A serial graph insists that only one action can actually occur at any given time
step; this is done by adding mutex links between every pair of nonpersistence actions. Level
costs extracted from serial graphs are often quite reasonable estimates of actual costs.
To estimate the cost of a conjunction of goals, there are three simple approaches. The
max-level heuristic simply takes the maximum level cost of any of the goals; this is admissi-
ble, but not necessarily accurate.
The level sum heuristic, following the subgoal independence assumption, returns the
sum of the level costs of the goals; this can be inadmissible but works well in practice
for problems that are largely decomposable. It is much more accurate than the number-
of-unsatisfied-goals heuristic from Section 10.2. For our problem, the level-sum heuristic
estimate for the conjunctive goal Have(Cake) ∧ Eaten(Cake) will be 0 + 1 = 1, whereas
the correct answer is 2, achieved by the plan [Eat(Cake), Bake(Cake)]. That doesn’t seem
so bad. A more serious error is that if Bake(Cake) were not in the set of actions, then the
estimate would still be 1, when in fact the conjunctive goal would be impossible.
Finally, the set-level heuristic finds the level at which all the literals in the conjunctive
goal appear in the planning graph without any pair of them being mutually exclusive. This
heuristic gives the correct values of 2 for our original problem and infinity for the problem
without Bake(Cake). It is admissible, it dominates the max-level heuristic, and it works
extremely well on tasks in which there is a good deal of interaction among subplans. It is not
perfect, of course; for example, it ignores interactions among three or more literals.
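Given the state levels of a planning graph as sets of literals, level cost and the max-level and level-sum heuristics reduce to a few lines. This toy encoding (my own) ignores mutexes, which the set-level heuristic would additionally need:

```python
def level_cost(levels, literal):
    """Level at which a literal first appears; infinity if never."""
    for i, level in enumerate(levels):
        if literal in level:
            return i
    return float("inf")

def max_level(levels, goals):
    return max(level_cost(levels, g) for g in goals)

def level_sum(levels, goals):
    return sum(level_cost(levels, g) for g in goals)

# State levels S0..S2 of the cake graph from Figure 10.8
levels = [{"Have"},
          {"Have", "~Have", "Eaten", "~Eaten"},
          {"Have", "~Have", "Eaten", "~Eaten"}]
goals = {"Have", "Eaten"}
print(max_level(levels, goals), level_sum(levels, goals))  # 1 1
```

Both heuristics return 1 here, while the true cost of the conjunctive goal is 2; the set-level heuristic would return 2 by also requiring the pair to be mutex-free at the reported level.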
As a tool for generating accurate heuristics, we can view the planning graph as a relaxed
problem that is efficiently solvable. To understand the nature of the relaxed problem, we
need to understand exactly what it means for a literal g to appear at level Si in the planning
graph. Ideally, we would like it to be a guarantee that there exists a plan with i action levels
that achieves g, and also that if g does not appear, there is no such plan. Unfortunately,
making that guarantee is as difficult as solving the original planning problem. So the planning
graph makes the second half of the guarantee (if g does not appear, there is no plan), but
if g does appear, then all the planning graph promises is that there is a plan that possibly
achieves g and has no “obvious” flaws. An obvious flaw is defined as a flaw that can be
detected by considering two actions or two literals at a time—in other words, by looking at
the mutex relations. There could be more subtle flaws involving three, four, or more actions,
but experience has shown that it is not worth the computational effort to keep track of these
possible flaws. This is similar to a lesson learned from constraint satisfaction problems—that
it is often worthwhile to compute 2-consistency before searching for a solution, but less often
worthwhile to compute 3-consistency or higher. (See page 211.)
One example of an unsolvable problem that cannot be recognized as such by a planning
graph is the blocks-world problem where the goal is to get block A on B, B on C, and C on
A. This is an impossible goal: a tower with the bottom on top of the top. But a planning graph
cannot detect the impossibility, because any two of the three subgoals are achievable; there
are no mutexes between any pair of literals, only among the three as a whole. To detect that
this problem is impossible, we would have to search over the planning graph.
Section 10.3. Planning Graphs 383
Figure 10.9 The GRAPHPLAN algorithm. GRAPHPLAN calls EXPAND-GRAPH to add a
level until either a solution is found by EXTRACT-SOLUTION, or no solution is possible.
Let us now trace the operation of GRAPHPLAN on the spare tire problem from page 370.
The graph is shown in Figure 10.10. The first line of GRAPHPLAN initializes the planning
graph to a one-level (S0) graph representing the initial state. The positive fluents from the
problem description's initial state are shown, as are the relevant negative fluents. Not shown
are the unchanging positive literals (such as Tire(Spare)) and the irrelevant negative literals.
The goal At(Spare, Axle) is not present in S0, so we need not call EXTRACT-SOLUTION;
we are certain that there is no solution yet. Instead, EXPAND-GRAPH adds into A0 the three
actions whose preconditions exist at level S0 (i.e., all the actions except PutOn(Spare, Axle)),
along with persistence actions for all the literals in S0. The effects of the actions are added at
level S1. EXPAND-GRAPH then looks for mutex relations and adds them to the graph.
At(Spare, Axle) is still not present in S1, so again we do not call EXTRACT-SOLUTION.
We call EXPAND-GRAPH again, adding A1 and S2 and giving us the planning graph shown
in Figure 10.10. Now that we have the full complement of actions, it is worthwhile to look at
some of the examples of mutex relations and their causes:
• Inconsistent effects: Remove(Spare, Trunk) is mutex with LeaveOvernight because
one has the effect At(Spare, Ground) and the other has its negation.
384 Chapter 10. Classical Planning
Figure 10.10 The planning graph for the spare tire problem after expansion to level S2 .
Mutex links are shown as gray lines. Not all links are shown, because the graph would be too
cluttered if we showed them all. The solution is indicated by bold lines and outlines.
• Interference: Remove(Flat, Axle) is mutex with LeaveOvernight because one has the
precondition At(Flat, Axle) and the other has its negation as an effect.
• Competing needs: PutOn(Spare, Axle) is mutex with Remove(Flat, Axle) because
one has At(Flat, Axle) as a precondition and the other has its negation.
• Inconsistent support: At(Spare, Axle) is mutex with At(Flat, Axle) in S2 because the
only way of achieving At(Spare, Axle) is by PutOn(Spare, Axle), and that is mutex
with the persistence action that is the only way of achieving At(Flat, Axle). Thus, the
mutex relations detect the immediate conflict that arises from trying to put two objects
in the same place at the same time.
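The action-mutex conditions above can be sketched as pairwise tests. The (polarity, atom) encoding of literals and the dictionary representation of actions are our own illustrative choices, not the book's notation.

```python
# Sketch of the pairwise action-mutex tests; encoding is illustrative.
def neg(lit):
    polarity, atom = lit
    return (not polarity, atom)

def inconsistent_effects(a, b):
    # one action undoes an effect of the other
    return any(neg(e) in b["effects"] for e in a["effects"])

def interference(a, b):
    # an effect of one negates a precondition of the other
    return (any(neg(e) in b["preconds"] for e in a["effects"]) or
            any(neg(e) in a["preconds"] for e in b["effects"]))

def competing_needs(a, b, literal_mutexes):
    # some precondition of a is mutex with some precondition of b
    return any(frozenset((p, q)) in literal_mutexes
               for p in a["preconds"] for q in b["preconds"])

remove_spare = {"preconds": {(True, "At(Spare,Trunk)")},
                "effects":  {(True, "At(Spare,Ground)"),
                             (False, "At(Spare,Trunk)")}}
leave_overnight = {"preconds": set(),
                   "effects": {(False, "At(Spare,Ground)"),
                               (False, "At(Spare,Trunk)"),
                               (False, "At(Flat,Axle)")}}
print(inconsistent_effects(remove_spare, leave_overnight))  # True
```

The first two tests depend only on the action pair itself, while competing needs also depends on the literal mutexes at the current level, which is exactly why only the third kind of action mutex can disappear at later levels.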
This time, when we go back to the start of the loop, all the literals from the goal are present
in S2, and none of them is mutex with any other. That means that a solution might exist,
and EXTRACT-SOLUTION will try to find it. We can formulate EXTRACT-SOLUTION as a
Boolean constraint satisfaction problem (CSP) where the variables are the actions at each
level, the values for each variable are in or out of the plan, and the constraints are the mutexes
and the need to satisfy each goal and precondition.
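As a sketch of this CSP formulation, the single-level case can be solved by brute-force enumeration of in/out assignments. The function csp_extract and the data layout are hypothetical names for illustration, not an efficient solver.

```python
# Sketch of the Boolean-CSP view for a single level: one in/out variable per
# action, mutex constraints, and coverage of the goals (illustrative).
from itertools import product

def csp_extract(actions, action_mutexes, goals):
    """actions: {name: set_of_effects}; returns a chosen action set or None."""
    names = sorted(actions)
    for bits in product([False, True], repeat=len(names)):
        chosen = {n for n, b in zip(names, bits) if b}
        if any(frozenset((a, b)) in action_mutexes
               for a in chosen for b in chosen if a < b):
            continue  # a mutex constraint is violated
        covered = set().union(*(actions[n] for n in chosen)) if chosen else set()
        if goals <= covered:
            return chosen
    return None

acts = {"Remove": {"At(Spare,Ground)"}, "Leave": {"NotAt(Spare,Ground)"}}
print(csp_extract(acts, {frozenset(("Remove", "Leave"))}, {"At(Spare,Ground)"}))
```

The enumeration makes the constraint structure explicit; a real planner would hand the same variables and constraints to a CSP or SAT solver rather than enumerate.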
Alternatively, we can define EXTRACT-SOLUTION as a backward search problem, where
each state in the search contains a pointer to a level in the planning graph and a set of unsat-
isfied goals. We define this search problem as follows:
• The initial state is the last level of the planning graph, Sn , along with the set of goals
from the planning problem.
• The actions available in a state at level Si are to select any conflict-free subset of the
actions in Ai−1 whose effects cover the goals in the state. The resulting state has level
Si−1 and has as its set of goals the preconditions for the selected set of actions. By
“conflict free,” we mean a set of actions such that no two of them are mutex and no two
of their preconditions are mutex.
• The goal is to reach a state at level S0 such that all the goals are satisfied.
• The cost of each action is 1.
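The backward search just defined can be sketched recursively. This toy version assumes that goals reaching level S0 hold in the initial state, omits persistence actions, and represents mutexes as a set of action-name pairs; all names and the data layout are illustrative.

```python
# Recursive sketch of backward solution extraction over a toy planning graph.
# graph[i] maps each action available at level A(i) to (preconds, effects).
from itertools import combinations

def extract(graph, action_mutexes, level, goals):
    if level == 0:
        return []  # base case: goals assumed satisfied by the initial state
    layer = graph[level - 1]
    for r in range(1, len(layer) + 1):
        for acts in combinations(sorted(layer), r):
            effects = set().union(*(layer[a][1] for a in acts))
            if not goals <= effects:
                continue  # this subset does not cover the goals
            if any(frozenset((a, b)) in action_mutexes
                   for a, b in combinations(acts, 2)):
                continue  # not conflict-free
            new_goals = set().union(*(layer[a][0] for a in acts))
            sub = extract(graph, action_mutexes, level - 1, new_goals)
            if sub is not None:
                return sub + [set(acts)]
    return None

graph = [{"Remove": ({"At(S,Trunk)"}, {"At(S,Ground)"})},
         {"PutOn":  ({"At(S,Ground)"}, {"At(S,Axle)"})}]
print(extract(graph, set(), 2, {"At(S,Axle)"}))  # [{'Remove'}, {'PutOn'}]
```

The returned list gives one action set per level, which mirrors the search states described above: each recursive call moves the pointer one level earlier and replaces the goals by the chosen actions' preconditions.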
For this particular problem, we start at S2 with the goal At(Spare, Axle). The only choice we
have for achieving the goal set is PutOn(Spare, Axle). That brings us to a search state at S1
with goals At(Spare, Ground) and ¬At(Flat, Axle). The former can be achieved only by
Remove(Spare, Trunk), and the latter by either Remove(Flat, Axle) or LeaveOvernight.
But LeaveOvernight is mutex with Remove(Spare, Trunk), so the only solution is to choose
Remove(Spare, Trunk) and Remove(Flat, Axle). That brings us to a search state at S0 with
the goals At(Spare, Trunk) and At(Flat, Axle). Both of these are present in the state, so
we have a solution: the actions Remove(Spare, Trunk) and Remove(Flat, Axle) in level
A0, followed by PutOn(Spare, Axle) in A1.
In the case where EXTRACT-SOLUTION fails to find a solution for a set of goals at
a given level, we record the (level, goals) pair as a no-good, just as we did in constraint
learning for CSPs (page 220). Whenever EXTRACT-SOLUTION is called again with the same
level and goals, we can find the recorded no-good and immediately return failure rather than
searching again. We see shortly that no-goods are also used in the termination test.
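A minimal sketch of no-good recording, assuming the underlying solver is passed in as a function; the cache keyed on (level, frozenset(goals)) mirrors the (level, goals) pairs described above.

```python
# Sketch of no-good recording around a solver (names illustrative).
no_goods = set()

def extract_with_nogoods(solve, level, goals):
    key = (level, frozenset(goals))
    if key in no_goods:
        return None  # known failure: skip the search entirely
    result = solve(level, goals)
    if result is None:
        no_goods.add(key)  # remember the failure
    return result

calls = []
def always_fails(level, goals):
    calls.append(level)
    return None

extract_with_nogoods(always_fails, 2, {"g"})
extract_with_nogoods(always_fails, 2, {"g"})
print(len(calls))  # 1: the second call hit the no-good cache
```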
We know that planning is PSPACE-complete and that constructing the planning graph
takes polynomial time, so it must be the case that solution extraction is intractable in the worst
case. Therefore, we will need some heuristic guidance for choosing among actions during the
backward search. One approach that works well in practice is a greedy algorithm based on
the level cost of the literals. For any set of goals, we proceed in the following order:
1. Pick first the literal with the highest level cost.
2. To achieve that literal, prefer actions with easier preconditions. That is, choose an action
such that the sum (or maximum) of the level costs of its preconditions is smallest.
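The two-step greedy ordering can be sketched as follows, assuming level costs have already been computed from the graph; the dictionaries here are toy data and the helper names are ours.

```python
# Sketch of the greedy ordering: goals with the highest level cost first,
# then among achieving actions prefer the one whose preconditions have the
# smallest summed level cost.
def order_goals(goals, level_cost):
    return sorted(goals, key=lambda g: level_cost[g], reverse=True)

def best_action(achievers, level_cost):
    """achievers: {action_name: set_of_preconditions}."""
    return min(achievers,
               key=lambda a: sum(level_cost[p] for p in achievers[a]))

lc = {"g1": 2, "g2": 1, "p": 0, "q": 1}
print(order_goals({"g1", "g2"}, lc))              # ['g1', 'g2']
print(best_action({"A": {"p"}, "B": {"q"}}, lc))  # 'A'
```

Replacing sum with max in best_action gives the other variant mentioned in step 2.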
Now all we have to do is prove that the graph and the no-goods will always level off. The
key to this proof is that certain properties of planning graphs are monotonically increasing or
decreasing. “X increases monotonically” means that the set of Xs at level i + 1 is a superset
(not necessarily proper) of the set at level i. The properties are as follows:
• Literals increase monotonically: Once a literal appears at a given level, it will appear
at all subsequent levels. This is because of the persistence actions; once a literal shows
up, persistence actions cause it to stay forever.
• Actions increase monotonically: Once an action appears at a given level, it will appear
at all subsequent levels. This is a consequence of the monotonic increase of literals; if
the preconditions of an action appear at one level, they will appear at subsequent levels,
and thus so will the action.
• Mutexes decrease monotonically: If two actions are mutex at a given level Ai , then they
will also be mutex for all previous levels at which they both appear. The same holds for
mutexes between literals. It might not always appear that way in the figures, because
the figures have a simplification: they display neither literals that cannot hold at level
Si nor actions that cannot be executed at level Ai. We can see that “mutexes decrease
monotonically” holds once we consider that these invisible literals and actions are mutex
with everything.
The proof can be handled by cases: if actions A and B are mutex at level Ai , it
must be because of one of the three types of mutex. The first two, inconsistent effects
and interference, are properties of the actions themselves, so if the actions are mutex
at Ai , they will be mutex at every level. The third case, competing needs, depends on
conditions at level Si : that level must contain a precondition of A that is mutex with
a precondition of B. Now, these two preconditions can be mutex if they are negations
of each other (in which case they would be mutex in every level) or if all actions for
achieving one are mutex with all actions for achieving the other. But we already know
that the available actions are increasing monotonically, so, by induction, the mutexes
must be decreasing.
• No-goods decrease monotonically: If a set of goals is not achievable at a given level,
then they are not achievable in any previous level. The proof is by contradiction: if they
were achievable at some previous level, then we could just add persistence actions to
make them achievable at a subsequent level.
Because the actions and literals increase monotonically and because there are only a finite
number of actions and literals, there must come a level that has the same number of actions
and literals as the previous level. Because mutexes and no-goods decrease, and because there
can never be fewer than zero mutexes or no-goods, there must come a level that has the
same number of mutexes and no-goods as the previous level. Once a graph has reached this
state, if one of the goals is missing or is mutex with another goal, we can stop the
GRAPHPLAN algorithm and return failure. That concludes a sketch of the proof; for more
details see Ghallab et al. (2004).
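The termination test can be sketched as a simple comparison of consecutive levels, assuming each level is summarized as a tuple of its literals, actions, mutexes, and no-goods (an illustrative representation, not GRAPHPLAN's actual data structure):

```python
# Sketch of the termination test: the graph has leveled off when a level has
# exactly the same literals, actions, mutexes, and no-goods as before.
def leveled_off(levels):
    """levels: list of (literals, actions, mutexes, no_goods) tuples."""
    return len(levels) >= 2 and levels[-1] == levels[-2]

lvl = (frozenset({"p", "q"}), frozenset({"a"}), frozenset(), frozenset())
print(leveled_off([lvl]))       # False
print(leveled_off([lvl, lvl]))  # True
```

Because literals and actions only grow and mutexes and no-goods only shrink, this condition is guaranteed to become true after finitely many expansions, which is what makes it a sound stopping rule.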
Section 10.4. Other Classical Planning Approaches 387
Figure 10.11 Some of the top-performing systems in the International Planning Competition.
Each year there are various tracks: “Optimal” means the planners must produce the
shortest possible plan, while “Satisficing” means nonoptimal solutions are accepted.
“Hand-coded” means domain-specific heuristics are allowed; “Automated” means they are not.
Currently the most popular and effective approaches to fully automated planning are:
• Translating to a Boolean satisfiability (SAT) problem
• Forward state-space search with carefully crafted heuristics (Section 10.2)
• Search using a planning graph (Section 10.3)
These three approaches are not the only ones tried in the 40-year history of automated plan-
ning. Figure 10.11 shows some of the top systems in the International Planning Competitions,
which have been held every even year since 1998. In this section we first describe the transla-
tion to a satisfiability problem and then describe three other influential approaches: planning
as first-order logical deduction; as constraint satisfaction; and as plan refinement.