B Resolution
Propositional Resolution
Resolution is an inference rule, permitting one to derive sentences that are log-
ically implied by a knowledge base, based on the presence of certain syntactic
patterns in the knowledge base. Resolution is one of the most influential infer-
ence rules as it plays a significant role in various forms of logical reasoning. The
aim of this chapter is to introduce the inference rule of resolution, discuss its
main properties, and overview some of its applications to logical reasoning.
Class Notes for CS264A, UCLA
Note that according to our definition, a clause cannot contain the literals
P and ¬P simultaneously. Hence, a clause can never be valid. Note also that
a clause with no literals, the empty clause, corresponds to an inconsistent sen-
tence (since an empty disjunction is an inconsistent sentence). Furthermore, a
CNF with no clauses is a valid sentence (since an empty conjunction is a valid
sentence).
This set-based notation will prove very helpful when expressing algorithms that
operate on CNFs.
Consider now the following resolution trace, which exemplifies the way in
which we will depict resolution proofs:
1 The insistence that all literals in a clause be over distinct variables is not standard.
2 Again, the insistence that all literals in a term be over distinct variables is not standard.
Adnan Darwiche © 2019
1. {¬P, R}
2. {¬Q, R}
3. {¬R}
4. {P, Q}
5. {¬P } 1, 3
6. {¬Q} 2, 3
7. {Q} 4, 5
8. {} 6, 7
The clauses before the line represent initial clauses, while clauses below the
line represent resolvents, together with the identifiers of clauses used to obtain
them. Recall that the empty clause is an inconsistent sentence. Hence, the
above resolution trace shows that the initial set of clauses (1–4) is inconsistent
since it implies an inconsistent sentence.
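The individual steps of such a trace can be replayed mechanically. The following Python sketch (an illustration, not part of the notes; the helper name `resolve` and the signed-integer encoding of literals are assumptions) derives steps 5–8 of the trace above:

```python
def resolve(ci, cj, p):
    """P-resolvent of clauses ci and cj, where p occurs in ci and -p in cj.
    Literals are signed integers; clauses are frozensets of literals."""
    assert p in ci and -p in cj
    return frozenset((ci - {p}) | (cj - {-p}))

P, Q, R = 1, 2, 3
c1, c2, c3, c4 = (frozenset({-P, R}), frozenset({-Q, R}),
                  frozenset({-R}), frozenset({P, Q}))
c5 = resolve(c1, c3, R)   # 5. {¬P}   from 1, 3
c6 = resolve(c2, c3, R)   # 6. {¬Q}   from 2, 3
c7 = resolve(c4, c5, P)   # 7. {Q}    from 4, 5
c8 = resolve(c7, c6, Q)   # 8. {}     from 6, 7
assert c8 == frozenset()  # empty clause: the initial clauses are inconsistent
```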
Given the semantics of clausal form it then follows that a clausal form ∆ is
valid if ∆ is the empty set: ∆ = ∅. Moreover, a clausal form ∆ is inconsistent
if ∆ contains the empty set: ∅ ∈ ∆. These two cases correspond to common
boundary conditions that arise in recursive algorithms on clausal forms. Note
also that if a clause Ci is a subset of another clause Cj , Ci ⊆ Cj , it then follows
that Ci |= Cj . We say in this case that clause Ci subsumes clause Cj , as there is
no need to have clause Cj in a clausal form given that clause Ci already appears
in the form.
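These boundary conditions and the subsumption test translate directly into set operations. A Python sketch (illustrative; the function names are assumptions):

```python
def is_valid(delta):
    # A clausal form with no clauses is valid (empty conjunction).
    return len(delta) == 0

def is_inconsistent(delta):
    # A clausal form containing the empty clause is inconsistent.
    return frozenset() in delta

def subsumes(ci, cj):
    # Ci subsumes Cj when Ci ⊆ Cj; then Ci |= Cj and Cj is redundant.
    return ci <= cj

def remove_subsumed(delta):
    # Drop every clause strictly subsumed by another clause in delta.
    return {c for c in delta
            if not any(d != c and subsumes(d, c) for d in delta)}

# Literals encoded as signed integers: {P}, {P, Q}, {¬R, Q}.
delta = {frozenset({1}), frozenset({1, 2}), frozenset({-3, 2})}
assert remove_subsumed(delta) == {frozenset({1}), frozenset({-3, 2})}
```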
Step 1: Remove all logical connectives except for conjunction, disjunction, and
negation. For example, a schema α⇒β should be transformed into ¬α ∨ β
and similarly for other connectives.
Step 2: Push negations inside the sentence until they only appear next to propo-
sitional variables. This is done by repeated application of the following
transformations:
Step 1: ¬(A ∨ B) ∨ C
Step 2: (¬A ∧ ¬B) ∨ C
Step 3: (¬A ∨ C) ∧ (¬B ∨ C)
Step 4: { {¬A, C}, {¬B, C} }
Step 1: ¬(¬(A ∨ B) ∨ C)
Step 2: (A ∨ B) ∧ ¬C
Step 3: (A ∨ B) ∧ ¬C
Step 4: { {A, B}, {¬C} }
Step 1: ¬P ∨ (¬Q ∨ R)
Step 2: ¬P ∨ ¬Q ∨ R
Step 3: ¬P ∨ ¬Q ∨ R
Step 4: { {¬P, ¬Q, R} }
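Steps 2–4 above can be carried out mechanically. The following Python sketch (illustrative, not part of the notes) assumes Step 1 has already been applied and encodes formulas as nested tuples; it does not merge duplicate literals or remove tautologous clauses:

```python
def nnf(f):
    """Step 2: push negations down to the variables.
    A formula is a variable name (str), ('not', g), ('and', g, h),
    or ('or', g, h)."""
    if isinstance(f, str):
        return f
    if f[0] == 'not':
        g = f[1]
        if isinstance(g, str):
            return f                  # literal ¬P: nothing to push
        if g[0] == 'not':
            return nnf(g[1])          # ¬¬α → α
        flip = 'or' if g[0] == 'and' else 'and'   # de Morgan
        return (flip, nnf(('not', g[1])), nnf(('not', g[2])))
    return (f[0], nnf(f[1]), nnf(f[2]))

def clauses(f):
    """Steps 3-4: distribute ∨ over ∧ and collect a set of clauses.
    Input must be in negation normal form; literals become strings."""
    if isinstance(f, str):
        return {frozenset({f})}
    if f[0] == 'not':                 # ¬P with P a variable
        return {frozenset({'¬' + f[1]})}
    left, right = clauses(f[1]), clauses(f[2])
    if f[0] == 'and':
        return left | right
    return {a | b for a in left for b in right}   # ∨ over ∧

# Second example from the notes: ¬(¬(A ∨ B) ∨ C).
f = ('not', ('or', ('not', ('or', 'A', 'B')), 'C'))
assert clauses(nnf(f)) == {frozenset({'A', 'B'}), frozenset({'¬C'})}
```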
Finally, to convert the sentence ¬((P ⇒ Q) ∧ (R ⇒ S)) into clausal form:
Step 1: ¬((¬P ∨ Q) ∧ (¬R ∨ S))
Step 2: (P ∧ ¬Q) ∨ (R ∧ ¬S)
Step 3: (P ∨ R) ∧ (P ∨ ¬S) ∧ (¬Q ∨ R) ∧ (¬Q ∨ ¬S)
Step 4: { {P, R}, {P, ¬S}, {¬Q, R}, {¬Q, ¬S} }
[Figure 2.1: A connectivity graph for a clausal form.]
1. The set of clauses α that contain the literal P . These clauses are satisfied by P and, hence, do not appear in ∆|P .
2. The set of clauses α that contain the literal ¬P . These clauses appear in ∆|P with the literal ¬P removed.
3. The set of clauses α that contain neither P nor ¬P . These clauses appear in ∆|P without change.
One can similarly express the conditioning of a clausal form on a negative literal:
∆|¬P = { α \ {P} | α ∈ ∆, ¬P ∉ α }.
For example, if
∆ = { {A, B, ¬C}, {¬A, D}, {B, C, D} },
then
∆|C = { {A, B}, {¬A, D} },
and
∆|¬C = { {¬A, D}, {B, D} }.
Note also that ∆|CA¬D = {∅} in this case and, hence, is inconsistent. More-
over, ∆|¬CD = ∅ in this case, and is therefore valid.
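Conditioning is a one-line set comprehension under the earlier signed-integer encoding. A Python sketch (illustrative; `condition` is an assumed helper name):

```python
def condition(delta, lit):
    """∆|lit: drop clauses satisfied by lit, remove ¬lit from the rest.
    Literals are signed integers: lit = P conditions on P, lit = -P on ¬P."""
    return {frozenset(c - {-lit}) for c in delta if lit not in c}

A, B, C, D = 1, 2, 3, 4
delta = {frozenset({A, B, -C}), frozenset({-A, D}), frozenset({B, C, D})}
assert condition(delta, C) == {frozenset({A, B}), frozenset({-A, D})}
assert condition(delta, -C) == {frozenset({-A, D}), frozenset({B, D})}

# Conditioning chains: ∆|CA¬D = {∅} (inconsistent), ∆|¬CD = ∅ (valid).
assert condition(condition(condition(delta, C), A), -D) == {frozenset()}
assert condition(condition(delta, -C), D) == set()
```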
∆ = { {¬A, B}, {¬A, C}, {¬B, D}, {¬C, ¬D}, {A, ¬C, E} }.
A connectivity graph does not depend on the sign of literals that appear in
clauses, as it only depends on the variables that appear in such clauses. For
example, the connectivity graph in Figure 2.1 also corresponds to other clausal forms over the same variables, obtained by flipping the signs of some literals.
1. {¬A, C} ∆
2. {¬B, C} ∆
3. {¬C, D} ∆
4. {¬D} ¬α
5. {A} ¬α
6. {¬C} 3, 4
7. {¬A} 1, 6
8. {} 5, 7
1. {¬A, C} ∆
2. {¬B, C} ∆
3. {¬C, D} ∆
4. {D} ¬β
5. {¬B} ¬β
6. {¬A, D} 1, 3
7. {¬B, D} 2, 3
Given that the empty clause could not be derived using resolution, we are guaranteed that ∆ ∧ ¬β is consistent and, hence, ∆ ⊭ β.
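Refutation by saturation can be sketched in a few lines of Python (illustrative, and exponential in the worst case; the query α below is inferred from the earlier trace, whose ¬α clauses were {¬D} and {A}):

```python
from itertools import combinations

def resolvents(ci, cj):
    """All non-tautologous resolvents of two clauses (signed-int literals)."""
    out = set()
    for lit in ci:
        if -lit in cj:
            r = (ci - {lit}) | (cj - {-lit})
            if not any(-l in r for l in r):   # discard tautologies
                out.add(frozenset(r))
    return out

def refutable(delta):
    """Saturate delta under resolution; True iff the empty clause is derived.
    Exponential in the worst case; a sketch, not an efficient solver."""
    clauses = set(delta)
    while True:
        new = set()
        for ci, cj in combinations(clauses, 2):
            new |= resolvents(ci, cj)
        if frozenset() in new:
            return True
        if new <= clauses:
            return False
        clauses |= new

A, B, C, D = 1, 2, 3, 4
delta = {frozenset({-A, C}), frozenset({-B, C}), frozenset({-C, D})}
# With ¬α = {{¬D}, {A}} (as in the earlier trace), refutation succeeds...
assert refutable(delta | {frozenset({-D}), frozenset({A})})
# ...while with ¬β = {{D}, {¬B}} it fails, so ∆ does not imply β.
assert not refutable(delta | {frozenset({D}), frozenset({-B})})
```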
Resolution satisfies another completeness property: If a clausal form ∆ im-
plies a clause C, then applying resolution to ∆ is guaranteed to generate a clause
which is a subset of C; that is, a clause that subsumes C. Consider the same
knowledge base ∆ given above and the query:
γ = ¬D ∧ E ⇒ ¬A ∧ ¬B.
1. {¬A, C} ∆
2. {¬B, C} ∆
3. {¬C, D} ∆
4. {¬A, D} 1, 3
5. {¬B, D} 2, 3
Since Clause 4 subsumes Clause 1.γ, and Clause 5 subsumes Clause 2.γ, we have thus proven that ∆ |= γ.
[Figure 2.2: A resolution graph.]

Each node in a resolution graph is either a root or has two parents. A root node corresponds to a clause that exists in the initial knowledge base and, hence, was not obtained using resolution. A node that has two parents corresponds to a clause which was obtained by resolving the clauses corresponding to its parents.
In the resolution graph of Figure 2.2, the parents of clause {} are clauses {C}
and {¬C}. If there is a directed path from clause Ci to clause Cj in a resolution
graph, we say that Ci is an ancestor of clause Cj . In Figure 2.2, clause {¬C, D}
is an ancestor of clause {}. Some clauses in the knowledge base may not appear
in a resolution graph, indicating that the clauses were not used in the resolution
process. Resolution graphs are quite useful in describing resolution strategies,
which are discussed next.
2. For every unit clause C = {¬P } in ∆, set the value of variable P to false in model ω, thereby satisfying clause C.
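For Horn clausal forms, this construction amounts to forward chaining, which coincides with unit resolution on Horn clauses. A Python sketch (illustrative; the function name and the default-to-false treatment of unconstrained variables are assumptions of this sketch):

```python
def horn_model(delta, variables):
    """Build a model of a Horn clausal form by forward chaining, or return
    None if the form is inconsistent.  A clause {¬P1,...,¬Pk, Q} fires Q
    once all Pi are true; variables never forced to true default to false."""
    true_vars = set()
    changed = True
    while changed:
        changed = False
        for clause in delta:
            body = {-l for l in clause if l < 0}
            head = [l for l in clause if l > 0]
            if body <= true_vars and not any(h in true_vars for h in head):
                if not head:          # all-negative clause falsified
                    return None
                true_vars.add(head[0])
                changed = True
    return {v: (v in true_vars) for v in variables}

# {P1}, {¬P1, P2}, {¬P2, P3}, {¬P4, P5}: P1, P2, P3 true; P4, P5 false.
delta = {frozenset({1}), frozenset({-1, 2}),
         frozenset({-2, 3}), frozenset({-4, 5})}
assert horn_model(delta, [1, 2, 3, 4, 5]) == \
    {1: True, 2: True, 3: True, 4: False, 5: False}
assert horn_model(delta | {frozenset({-3})}, [1, 2, 3, 4, 5]) is None
```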
Another class which is closely connected to Horn clausal form is the renam-
able Horn clausal form. This is a clausal form which is not Horn, but can be
transformed into a Horn clausal form through a systematic process of renaming
variables. Consider the following clausal form:
which is not Horn since the first clause contains more than one positive literal.
Applying the following substitutions to the literals of ∆:
Pi → ¬Ri
¬Pi → Ri
Q → ¬S
¬Q → S
yields a Horn clausal form.
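A renaming of this kind is just a sign flip on a chosen set of variables. A Python sketch (illustrative; the example clausal form here is hypothetical, not the one discussed in the notes):

```python
def rename(delta, flip_vars):
    """Flip the sign of every literal over the variables in flip_vars."""
    return {frozenset(-l if abs(l) in flip_vars else l for l in c)
            for c in delta}

def is_horn(delta):
    # Horn: every clause contains at most one positive literal.
    return all(sum(1 for l in c if l > 0) <= 1 for c in delta)

# { {P1, P2}, {¬P1, Q} } is not Horn, but flipping P2 makes it Horn.
P1, P2, Q = 1, 2, 3
delta = {frozenset({P1, P2}), frozenset({-P1, Q})}
assert not is_horn(delta)
assert is_horn(rename(delta, {P2}))
```

Renaming preserves satisfiability, so a renamable Horn form can be tested for consistency with the same efficient procedures as a Horn form.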
C, the value of P must be set to true (since C is not satisfied). This means that the knowledge
base must contain a unit clause {P } for each one of these literals, since only Step 1 can set
a variable value to true. But this means that unit resolution should have yielded the empty
clause {} if C contains no positive literals, or the unit clause {Q} if C contains the positive
literal Q. In either case, we have a contradiction with our assumptions.
{ {¬A, B}, {¬A, C}, {¬B, D}, {¬C, ¬D}, {A, ¬C, E} }.
Let us now apply directed resolution using the order C, B, A, D, E. There are
two C-resolvents, but one of them is a tautology (it includes both A and ¬A), so
we are left only with {¬A, ¬D}. Moreover, there is one B-resolvent: {¬A, D}.
There are a number of A-resolvents but all involve variables that come before
A in the order; hence, they are not allowed by directed resolution. Similarly,
none of the D-resolvents are allowed by directed resolution. Finally, there are
no E-resolvents since variable E appears only positively in the clausal form.
Hence, directed resolution under the previous variable order generates only two
clauses. Since none of these clauses is the empty clause, and given that directed
resolution is refutation complete, we conclude that the original knowledge base
must be consistent.
The utility of directed resolution goes far beyond refutation, as it can be used
to forget variables and enumerate/count models. Moreover, the complexity of
directed resolution can be analyzed and bounded in terms of the structural
properties of the given clausal form. Before we discuss these subjects, we introduce
a mechanism for applying directed resolution in a systematic fashion, which is
also quite useful in developing more insights into the method.
The mechanism is known as bucket elimination and proceeds in two stages,
in which we first construct and fill a set of buckets, and then process them.
Specifically, given a variable ordering π, we construct and fill buckets as follows:
• Each clause α in the clausal form is added to the first bucket P (according to the order π) such that variable P appears in clause α.
If the empty clause is generated while processing a bucket, we say that bucket elimination has been aborted. The set of clauses that appear in the buckets after bucket elimination has completed is called the directed extension of the given clausal form and variable order.
Consider again the clausal form given above and the variable order C, B, A, D, E.
Constructing and filling buckets leads to:4
C : {¬A, C}, {¬C, ¬D}, {A, ¬C, E}
B : {¬A, B}, {¬B, D}
A:
D:
E:
Processing Bucket C adds one C-resolvent to Bucket A:
C : {¬A, C}, {¬C, ¬D}, {A, ¬C, E}
B : {¬A, B}, {¬B, D}
A: {¬A, ¬D}
D:
E:
Processing Bucket B adds one B-resolvent to Bucket A:
C : {¬A, C}, {¬C, ¬D}, {A, ¬C, E}
B : {¬A, B}, {¬B, D}
A: {¬A, ¬D}, {¬A, D}
D:
E:
Processing Buckets A, D and finally Bucket E leads to no new resolvents.
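The two stages of bucket elimination can be sketched directly in Python (illustrative; the representation of buckets as lists keyed by variable, and the filtering of tautologous resolvents, are assumptions of this sketch):

```python
def bucket_elimination(delta, order):
    """Fill buckets by each clause's earliest variable in `order`, then
    process buckets in order, placing each P-resolvent into the first
    bucket of a variable it mentions.  Returns the buckets (the directed
    extension), or None if the empty clause is generated (aborted)."""
    rank = {v: i for i, v in enumerate(order)}
    buckets = {v: [] for v in order}

    def place(clause):
        v = min((abs(l) for l in clause), key=rank.get)
        if clause not in buckets[v]:
            buckets[v].append(clause)

    for c in delta:
        place(c)
    for p in order:
        pos = [c for c in buckets[p] if p in c]
        neg = [c for c in buckets[p] if -p in c]
        for ci in pos:
            for cj in neg:
                r = frozenset((ci - {p}) | (cj - {-p}))
                if any(-l in r for l in r):
                    continue              # skip tautologous resolvents
                if not r:
                    return None           # empty clause: aborted
                place(r)
    return buckets

# The running example, under the order C, B, A, D, E.
A, B, C, D, E = 1, 2, 3, 4, 5
delta = {frozenset({-A, B}), frozenset({-A, C}), frozenset({-B, D}),
         frozenset({-C, -D}), frozenset({A, -C, E})}
buckets = bucket_elimination(delta, [C, B, A, D, E])
assert set(buckets[A]) == {frozenset({-A, -D}), frozenset({-A, D})}
assert buckets[D] == [] and buckets[E] == []
```

As in the worked trace above, Bucket A ends up with the two resolvents {¬A, ¬D} and {¬A, D}, and Buckets D and E stay empty.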
The main value of bucket elimination is that it allows us to easily enforce
the constraints imposed by directed resolution. In particular, when process-
ing a bucket, we are guaranteed that only clauses appearing in the bucket can
participate in generating resolvents. Moreover, we shall see later that the di-
rected extension of a clausal form, which is available once bucket elimination is
completed, can be used to extract valuable information about the given clausal
form.
The refutation completeness of directed resolution holds for any variable
order. The amount of work performed by directed resolution, however, is quite
dependent on the chosen order. We shall visit this issue at some length later, but
we illustrate it here with one example. Consider again the previous example
and suppose we choose the order E, A, B, C, D instead. Our buckets look as
follows initially:
E : {A, ¬C, E}
A : {¬A, B} {¬A, C}
B : {¬B, D}
C : {¬C, ¬D}
D:
4 It is not uncommon for buckets to be empty. It is also possible for all clauses to fall into the same bucket.
Processing the above buckets yields no resolvents in this case! Hence, directed
resolution with respect to the above order generates no resolvents, even though
general resolution would generate a number of resolvents in this case.
If bucket elimination is completed (that is, the empty clause was not gen-
erated), we are guaranteed that the original clausal form is consistent. This
follows from the fact that bucket elimination simply implements directed res-
olution, and from the fact that directed resolution is refutation complete. It
turns out, however, that directed resolution is too powerful to be used only for testing consistency, as it can also forget variables, count models, and systematically generate models. We discuss these applications next.
2.4.1 Forgetting
The use of directed resolution (and bucket elimination) in forgetting variables is
due to the following key result. Suppose that ∆ is a clausal form, and let Γ be
another clausal form which results from adding to ∆ all P -resolvents, and then
throwing out all clauses that mention P (hence, Γ does not mention variable
P ). It follows in this case that Γ is equivalent to ∃P ∆. Consider the following
clausal form,
∆ = { {¬A, B}, {¬B, C} }.
There is only one B-resolvent in this case: {¬A, C}. Adding this resolvent to
∆, and throwing out those clauses that mention B gives:
Γ = { {¬A, C} }.
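This key result can be sketched as a small Python function (illustrative; `forget` is an assumed name, and tautologous resolvents are dropped since they carry no information):

```python
def forget(delta, p):
    """Return Γ ≡ ∃p ∆: add all p-resolvents of delta, then drop every
    clause that mentions variable p (signed-integer literals)."""
    pos = [c for c in delta if p in c]
    neg = [c for c in delta if -p in c]
    res = set()
    for ci in pos:
        for cj in neg:
            r = frozenset((ci - {p}) | (cj - {-p}))
            if not any(-l in r for l in r):   # skip tautologies
                res.add(r)
    return {c for c in delta if p not in c and -p not in c} | res

# The example above: forgetting B from { {¬A, B}, {¬B, C} } gives { {¬A, C} }.
A, B, C = 1, 2, 3
delta = {frozenset({-A, B}), frozenset({-B, C})}
assert forget(delta, B) == {frozenset({-A, C})}
```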
For another example, consider the knowledge base
∆ = { A ⇒ B, B ⇒ C, C ⇒ D, D ⇒ E }.
The last two buckets contain the clause {¬A, E}, which is the result of forgetting variables B, C, D from the original knowledge base ∆. That is, A ⇒ E is all that knowledge base ∆ says about variables A and E in this case. The last bucket is empty, which means that knowledge base ∆ says nothing about variable E alone.
Suppose now that the knowledge base ∆ is augmented with the unit clause
{D}. The directed extension of ∆ is then:
C : {¬B, C} {¬C, D}
B : {¬A, B} {¬B, D}
D : {¬D, E} {D} {¬A, D}
A: {¬A, E}
E: {E}.
The last bucket contains clause {E} in this case, which means that literal E is
all that the knowledge base ∆ ∪ {D} says about variable E.
For another example, consider the knowledge base:
∆ = { ¬B ∨ ¬C ⇒ ¬A, B ⇒ D, C ⇒ ¬D, ¬A ∧ C ⇒ E }.
The last two buckets contain the clause {¬C, E}, which means that C⇒E is all
that knowledge base ∆ says about variables C and E.
Consider again the knowledge base
∆ = { ¬B ∨ ¬C ⇒ ¬A, B ⇒ D, C ⇒ ¬D, ¬A ∧ C ⇒ E },
[Figure 2.4: A decision tree. The high child of a node is shown on the left and its low child is shown on the right.]
and the decision tree in Figure 2.4. This decision tree characterizes the models
of ∆ in the following sense. Given any world ω, we can check whether ω is a
model of ∆ as follows. We start at the root, and for each node labelled with
variable P , we go to the high child if ω(P ) = true and to the low child if
ω(P ) = false. If we end up at a leaf node labelled with true, we know that ω is
a model of ∆. If we end up at a leaf node labelled with false, we know that ω
is not a model of ∆.
The main property of a decision tree for a knowledge base ∆ is that the size
of the tree is O(|Mods(∆)|); hence, the size of the decision tree is bounded by
the number of models. Moreover, one can enumerate the models of ∆ in time
which is linear in the size of decision tree. One can also count these models in
time which is linear in the size of decision tree.
¬A ∧ ¬B ∧ ¬C ∧ D ∧ E
¬A ∧ ¬B ∧ ¬C ∧ D ∧ ¬E
¬A ∧ ¬B ∧ ¬C ∧ ¬D ∧ E
¬A ∧ ¬B ∧ ¬C ∧ ¬D ∧ ¬E.
The other major observation is that the models characterized by one leaf node labelled with true are disjoint from the worlds characterized by any other leaf node. Hence, we can easily count the total number of models
characterized by a decision tree by simply visiting each leaf node labelled with
true, while accumulating the number of models it characterizes. To enumerate
models, we follow the same procedure except that we explicitly enumerate the models characterized by each such leaf node. For example, the decision tree in Figure 2.4 characterizes 7 = 2 + 1 + 4 models, four of which have been enumerated
above.
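The counting procedure just described can be sketched as a short recursion (illustrative; the tuple encoding of decision trees and the example tree are assumptions, not Figure 2.4 itself):

```python
def count_models(node, variables, seen=frozenset()):
    """Count the models characterized by a decision tree over `variables`.
    A node is the leaf True or False, or a tuple (P, high, low).  A true
    leaf reached after testing the variables in `seen` characterizes
    2**(len(variables) - len(seen)) models; distinct leaves characterize
    disjoint sets of worlds, so the counts simply add up."""
    if node is True:
        return 2 ** (len(variables) - len(seen))
    if node is False:
        return 0
    var, high, low = node
    return (count_models(high, variables, seen | {var})
            + count_models(low, variables, seen | {var}))

# A hypothetical tree over A, B, C with true leaves at depths 1, 2, 3:
tree = ('A', True, ('B', True, ('C', True, False)))
assert count_models(tree, ['A', 'B', 'C']) == 4 + 2 + 1  # = 7
```

Replacing the addition by an explicit enumeration of the free variables at each true leaf yields model enumeration in time linear in the output.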
As we shall see next, given a knowledge base ∆ and its directed extension,
we can build a decision tree that characterizes the models of ∆ in O(s1 s2 ) time,
where s1 is the size of the directed extension of ∆, and s2 is its number of models.
Specifically, suppose that Γ1 , . . . , Γn are the ordered buckets constituting the
directed extension of ∆, where variable Pi is the label of Bucket Γi . Then calling
Algorithm 1 on (Γ1 , . . . , Γn ; true) will return a decision tree for knowledge base
∆. For an example, consider the knowledge base,
∆ = { ¬B ∨ ¬C ⇒ ¬A, B ⇒ D, C ⇒ ¬D, ¬A ∧ C ⇒ E },
with the following directed extension:
ΓE : {A, ¬C, E}
ΓD : {¬B, D} {¬C, ¬D}
ΓC : {¬A, C} {¬B, ¬C}
ΓB : {¬A, B} {¬A, ¬B}
ΓA : {¬A}.
[Figure 2.5: Connectivity graphs GA, GB, GC, GD, GE arising during directed resolution.]
Note that the original clausal form is shown on the left, with clauses added
by directed resolution shown on the right. Consider now the sequence of con-
nectivity graphs depicted in Figure 2.5. The first graph, GA , corresponds to
clauses in Buckets A–E, just before processing Bucket A. The second graph,
GB , corresponds to clauses in Buckets B–E, just before processing Bucket B.
The third graph, GC , corresponds to clauses in Buckets C–E, just before pro-
cessing Bucket C. And so on. A number of observations are in order about
these successive connectivity graphs:
• Some of the graphs contain dotted edges. These edges correspond to clauses that were added while applying directed resolution.
• The neighbors of variable A in graph GA are exactly the variables appearing with A in the clauses of Bucket A.
2.5.2 Treewidth
The above complexity analysis of directed resolution can be made more formal
using the notion of treewidth. Let G be a graph, and let P1 , . . . , Pn be an
ordering of nodes in G. Suppose that we eliminate variables P1 , . . . , Pn from
G in that order, where eliminating variable Pi is accomplished by pair-wise
connecting all neighbors of Pi in the graph, and then removing Pi from the
graph. Let di represent the number of neighbors that variable Pi has before it
is eliminated. The width of variable order P1 , . . . , Pn with respect to graph G is
then defined as the largest of d1 , . . . , dn . Moreover, the treewidth of graph G is defined as the
smallest width attained by any variable order with respect to G.
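The width of an order is easy to compute by simulating the elimination process on the graph (a Python sketch; the adjacency-dictionary representation is an assumption):

```python
def order_width(graph, order):
    """Width of a variable order w.r.t. a graph: eliminate variables in
    order, pairwise-connecting the neighbors of each variable before
    removing it; the width is the largest neighbor count seen."""
    g = {v: set(ns) for v, ns in graph.items()}
    width = 0
    for v in order:
        ns = g.pop(v)
        width = max(width, len(ns))
        for a in ns:
            g[a].discard(v)
            g[a] |= ns - {a}      # connect remaining neighbors pairwise
    return width

# Star-shaped graph of P1 ⇒ P2, ..., P1 ⇒ Pn (here n = 5):
g = {1: {2, 3, 4, 5}, 2: {1}, 3: {1}, 4: {1}, 5: {1}}
assert order_width(g, [1, 2, 3, 4, 5]) == 4   # eliminating P1 first: n - 1
assert order_width(g, [2, 3, 4, 5, 1]) == 1   # eliminating P1 last: width 1
```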
Therefore, given a clausal form ∆ with connectivity graph G, and given a
variable ordering π which has width w with respect to G, the space complexity
of directed resolution using order π is O(n 3^(w+1)), and its time complexity is O(n (3^(w+1))^2), where n is the number of variables in ∆. Note that the quality of these upper bounds depends on the width w of order π: they can be useless when w is large enough. Moreover, these bounds are smallest when the width of order π is the same as the treewidth of graph G, that is, when π is an optimal order with respect to graph G.
It is important to note that the width of an order can be high, even reaching
n, yet directed resolution can be very efficient. Consider for example the knowl-
edge base: P1 ⇒P2 , P1 ⇒P3 , . . . , P1 ⇒Pn . The width of order π = P1 , P2 , . . . , Pn
is n − 1 in this case. Therefore, our bounds above are O(n 3^n) for space and O(n (3^n)^2) for time, which are useless. Note, however, that applying directed
resolution using order π leads to adding no resolvents whatsoever. This is an example where the worst-case analysis underlying the derivation of these bounds departs substantially from the actual behavior. Note also here that order
π = P2 , P3 , . . . , Pn , P1 has width 1. Therefore, our bounds in this case become
O(n 3^2) for space and O(n (3^2)^2) for time.
The above analysis not only allows us to bound the computational resources
of directed resolution under different variable orders, but also suggests ways
to generate good orders based on the given connectivity graph. Basically, we
want to generate an order whose width is as small as possible, as that allows
us to establish the tightest bound on the time and space complexity of directed
resolution. Generating an optimal variable order (one with minimal width) for
arbitrary graphs is known to be NP-hard. Yet, there are efficient procedures
for a variety of graph classes, including trees and graphs that have a bounded
treewidth. It is also more common to use heuristic methods for generating variable orderings, instead of using optimal methods. One of the simplest heuristics is the min-degree heuristic, which generates a variable ordering by repeatedly choosing the variable that has the fewest neighbors in the connectivity graph. Once a variable is chosen, it is eliminated from the graph after pair-wise connecting all its neighbors, and the next variable is chosen using the same procedure.
Consider the connectivity graph in Figure 2.1 for an example. The min-degree
heuristic would generate the variable ordering B, D, A, C, E, assuming that ties
are broken by preferring variables that are earlier in the alphabet.
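The min-degree heuristic can be sketched as follows (illustrative Python; the graph below is the connectivity graph reconstructed from the clausal form used with Figure 2.1, and ties are broken alphabetically as in the text):

```python
def min_degree_order(graph):
    """Min-degree elimination order; ties broken by the natural order of
    the node labels (alphabetical for strings)."""
    g = {v: set(ns) for v, ns in graph.items()}
    order = []
    while g:
        v = min(sorted(g), key=lambda u: len(g[u]))
        ns = g.pop(v)
        for a in ns:
            g[a].discard(v)
            g[a] |= ns - {a}      # connect the eliminated node's neighbors
        order.append(v)
    return order

# Connectivity graph of { {¬A,B}, {¬A,C}, {¬B,D}, {¬C,¬D}, {A,¬C,E} }.
g = {'A': {'B', 'C', 'E'}, 'B': {'A', 'D'}, 'C': {'A', 'D', 'E'},
     'D': {'B', 'C'}, 'E': {'A', 'C'}}
assert min_degree_order(g) == ['B', 'D', 'A', 'C', 'E']
```

The min-fill heuristic described next differs only in the selection criterion: it picks the variable whose elimination would add the fewest new edges, rather than the one with the fewest neighbors.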
Another related heuristic, which is generally more effective, is to choose the
variable whose elimination leads to adding the smallest number of new edges to
the connectivity graph. This heuristic is known as the min-fill heuristic.