
Chapter 2

Propositional Resolution

Resolution is an inference rule, permitting one to derive sentences that are log-
ically implied by a knowledge base, based on the presence of certain syntactic
patterns in the knowledge base. Resolution is one of the most influential infer-
ence rules as it plays a significant role in various forms of logical reasoning. The
aim of this chapter is to introduce the inference rule of resolution, discuss its
main properties, and overview some of its applications to logical reasoning.

2.1 Resolution and clausal form


The most general form of propositional resolution is as follows:
    α ∨ β,  ¬β ∨ γ
    ──────────────
        α ∨ γ
which says that if α ∨ β and ¬β ∨ γ are two schemas in a knowledge base ∆,
then α ∨ γ can be derived from ∆ using resolution. The result α ∨ γ is called a
resolvent, and we say that it was obtained by resolving α ∨ β and ¬β ∨ γ over β.
Another way to view resolution is as a form of logical chaining since it can
be equivalently expressed as follows:
    ¬α ⇒ β,  β ⇒ γ
    ──────────────
        ¬α ⇒ γ
where we simply replaced disjunctions by implications. Resolution is sound in
the sense that:
∆ ∪ {α ∨ β, ¬β ∨ γ} |= α ∨ γ,
for any knowledge base ∆. Resolution is not complete, however, in the sense
that a sentence may be logically implied by a knowledge base, yet not be
derivable from the knowledge base using resolution. For example, the sentence
α ∨ β is implied by the knowledge base ∆ = {α, β}, yet it is not derivable from
∆ using resolution. Resolution enjoys a more restricted, yet very useful, notion
of completeness that will be discussed later.

Class Notes for CS264A, UCLA

Resolution is typically applied to knowledge bases that are expressed in
conjunctive normal form, or the related clausal form. A clause is a disjunction of
literals over distinct variables, where a literal is either a variable or the negation
of a variable.[1] A propositional sentence is in conjunctive normal form (CNF) if
it has the form α1 ∧ α2 ∧ . . . ∧ αn, where each αi is a clause. For example, the
sentence (A ∨ B ∨ ¬C) ∧ (¬A ∨ D) ∧ (B ∨ C ∨ D) is in conjunctive normal form
and contains three clauses. The negation of a clause is called a term, which is
a conjunction of literals over distinct variables. For example, the negation of
clause A ∨ B ∨ ¬C is the term ¬A ∧ ¬B ∧ C.[2]

Note that according to our definition, a clause cannot contain the literals
P and ¬P simultaneously. Hence, a clause can never be valid. Note also that
a clause with no literals, the empty clause, corresponds to an inconsistent sen-
tence (since an empty disjunction is an inconsistent sentence). Furthermore, a
CNF with no clauses is a valid sentence (since an empty conjunction is a valid
sentence).

A convenient way to notate sentences in conjunctive normal form is using
sets, leading to what is commonly known as clausal form. Specifically, a clause
l1 ∨ l2 ∨ . . . ∨ lm is expressed as a set of literals {l1, l2, . . . , lm}. Moreover, a
conjunctive normal form α1 ∧ α2 ∧ . . . ∧ αn is expressed as a set of clauses
{α1 , α2 , . . . , αn }. For example, the CNF given above would be expressed in
clausal form as:

{ {A, B, ¬C}, {¬A, D}, {B, C, D} }.

This set-based notation will prove very helpful when expressing algorithms that
operate on CNFs.

Let P be a propositional variable, and suppose that ∆ is a clausal form which
contains clauses Ci and Cj, where P ∈ Ci and ¬P ∈ Cj. It then follows that
we can derive the new clause (Ci \ {P}) ∪ (Cj \ {¬P}) from ∆ using resolution.
For example, we can resolve clause {A, B, ¬C} with clause {¬B, D} to obtain
{A, ¬C, D}. We will say that a clausal form is closed under resolution if any
clause that can be derived from the form using resolution is already in the
clausal form.
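As an illustrative sketch (not part of the text), the set-based resolution step is straightforward to code. Here literals are Python strings with a `-` prefix marking negation, clauses are frozensets, and the function names are my own:

```python
def neg(lit):
    """Complement a literal: 'P' <-> '-P'."""
    return lit[1:] if lit.startswith('-') else '-' + lit

def resolve(ci, cj, p):
    """Resolvent (Ci - {P}) | (Cj - {-P}) of two clauses over variable p,
    assuming p occurs positively in ci and negatively in cj."""
    assert p in ci and neg(p) in cj
    return (ci - {p}) | (cj - {neg(p)})

# The example from the text: resolving {A, B, -C} with {-B, D} over B.
r = resolve(frozenset({'A', 'B', '-C'}), frozenset({'-B', 'D'}), 'B')
print(sorted(r))  # ['-C', 'A', 'D'], i.e. the clause {A, -C, D}
```

The frozenset representation makes the set difference and union in the rule literal one-liners.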

Consider now the following resolution trace, which exemplifies the way in
which we will depict resolution proofs:

[1] The insistence that all literals in a clause be over distinct variables is not standard.
[2] Again, the insistence that all literals in a term be over distinct variables is not standard.
Adnan Darwiche © 2019

1. {¬P, R}
2. {¬Q, R}
3. {¬R}
4. {P, Q}

5. {¬P } 1, 3
6. {¬Q} 2, 3
7. {Q} 4, 5
8. {} 6, 7

The clauses before the line represent initial clauses, while clauses below the
line represent resolvents, together with the identifiers of clauses used to obtain
them. Recall that the empty clause is an inconsistent sentence. Hence, the
above resolution trace shows that the initial set of clauses (1–4) is inconsistent
since it implies an inconsistent sentence.
Given the semantics of clausal form, it then follows that a clausal form ∆ is
valid iff ∆ is the empty set: ∆ = ∅. Moreover, a clausal form ∆ is inconsistent
if ∆ contains the empty set: ∅ ∈ ∆. These two cases correspond to common
boundary conditions that arise in recursive algorithms on clausal forms. Note
also that if a clause Ci is a subset of another clause Cj , Ci ⊆ Cj , it then follows
that Ci |= Cj . We say in this case that clause Ci subsumes clause Cj , as there is
no need to have clause Cj in a clausal form given that clause Ci already appears
in the form.
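With clauses as frozensets, the subsumption test is just the subset relation, so pruning subsumed clauses from a clausal form can be sketched as follows (an illustrative helper of my own, not from the text):

```python
def remove_subsumed(delta):
    """Keep only clauses of delta that are not strictly subsumed by another
    clause in delta; Ci subsumes Cj exactly when Ci is a subset of Cj."""
    return {c for c in delta if not any(d < c for d in delta)}

delta = {frozenset({'-A'}), frozenset({'-A', 'B'}), frozenset({'B', 'C'})}
# {-A} subsumes {-A, B}, so only {-A} and {B, C} survive.
print(remove_subsumed(delta) == {frozenset({'-A'}), frozenset({'B', 'C'})})  # True
```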

2.1.1 Converting sentences into clausal form


One can convert any propositional sentence into a clausal form through a sys-
tematic four-step process:

Step 1: Remove all logical connectives except for conjunction, disjunction, and
negation. For example, a schema α⇒β should be transformed into ¬α ∨ β
and similarly for other connectives.

Step 2: Push negations inside the sentence until they only appear next to propo-
sitional variables. This is done by repeated application of the following
transformations:

¬¬α is transformed into α.


¬(α ∨ β) is transformed into ¬α ∧ ¬β.
¬(α ∧ β) is transformed into ¬α ∨ ¬β.

Step 3: Distribute disjunctions over conjunctions by repeated application of the
following transformation: α ∨ (β ∧ γ) is transformed to (α ∨ β) ∧ (α ∨ γ).

Step 4: The result of Steps 1–3 is guaranteed to be a conjunctive normal form.
Convert it into clausal form by removing the logical connectives and using
set notation instead.

To convert the sentence (A ∨ B)⇒C into clausal form, we go through the
following steps:

Step 1: ¬(A ∨ B) ∨ C
Step 2: (¬A ∧ ¬B) ∨ C
Step 3: (¬A ∨ C) ∧ (¬B ∨ C)
Step 4: { {¬A, C}, {¬B, C} }

To convert the sentence ¬(A ∨ B⇒C) into clausal form:

Step 1: ¬(¬(A ∨ B) ∨ C)
Step 2: (A ∨ B) ∧ ¬C
Step 3: (A ∨ B) ∧ ¬C
Step 4: { {A, B}, {¬C} }

To convert the sentence P ⇒(Q⇒R) into clausal form:

Step 1: ¬P ∨ (¬Q ∨ R)
Step 2: ¬P ∨ ¬Q ∨ R
Step 3: ¬P ∨ ¬Q ∨ R
Step 4: { {¬P, ¬Q, R} }

Finally, to convert the sentence ¬((P ⇒Q) ∧ (R⇒S)) into clausal form:

Step 1: ¬((¬P ∨ Q) ∧ (¬R ∨ S))


Step 2: (P ∧ ¬Q) ∨ (R ∧ ¬S)
Step 3: (P ∨ R) ∧ (P ∨ ¬S) ∧ (¬Q ∨ R) ∧ (¬Q ∨ ¬S)
Step 4: { {P, R}, {P, ¬S}, {¬Q, R}, {¬Q, ¬S} }

Although the above conversion process is guaranteed to yield a clausal form,
the result can be quite large. Specifically, it is possible for the size of a given
sentence to be linear in the number of propositional variables while the size of
the resulting clausal form is exponential in that number.
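The four-step process can be sketched directly in code. This is an illustrative implementation of my own: formulas are nested tuples such as `('imp', ('var','A'), ('var','B'))`, and, unlike the text's strict definition of a clause, tautologous clauses produced during distribution are not filtered out:

```python
# Formulas as nested tuples: ('var', name), ('not', f), ('and', f, g),
# ('or', f, g), ('imp', f, g).

def elim_imp(f):
    """Step 1: rewrite a => b as -a | b, recursively."""
    op = f[0]
    if op == 'var':
        return f
    if op == 'not':
        return ('not', elim_imp(f[1]))
    if op == 'imp':
        return ('or', ('not', elim_imp(f[1])), elim_imp(f[2]))
    return (op, elim_imp(f[1]), elim_imp(f[2]))

def nnf(f):
    """Step 2: push negations down to variables (De Morgan, double negation)."""
    op = f[0]
    if op == 'var':
        return f
    if op == 'not':
        g = f[1]
        if g[0] == 'var':
            return f
        if g[0] == 'not':
            return nnf(g[1])
        dual = 'and' if g[0] == 'or' else 'or'
        return (dual, nnf(('not', g[1])), nnf(('not', g[2])))
    return (op, nnf(f[1]), nnf(f[2]))

def cnf_clauses(f):
    """Steps 3-4: distribute | over & and return a set of frozenset clauses."""
    op = f[0]
    if op == 'var':
        return {frozenset({f[1]})}
    if op == 'not':  # in NNF this is a negated variable
        return {frozenset({'-' + f[1][1]})}
    if op == 'and':
        return cnf_clauses(f[1]) | cnf_clauses(f[2])
    # op == 'or': pairwise unions of the clauses of both sides
    return {a | b for a in cnf_clauses(f[1]) for b in cnf_clauses(f[2])}

def to_clausal(f):
    return cnf_clauses(nnf(elim_imp(f)))

# (A | B) => C, the first example above
f = ('imp', ('or', ('var', 'A'), ('var', 'B')), ('var', 'C'))
print(to_clausal(f) == {frozenset({'-A', 'C'}), frozenset({'-B', 'C'})})  # True
```

The exponential blow-up mentioned above happens in `cnf_clauses` for `or`, where the clause sets of the two sides are combined pairwise.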

2.1.2 Conditioning a clausal form


A number of algorithms that we shall define on clausal forms make heavy use of
conditioning. The process of conditioning a clausal form ∆ on a positive literal
P to yield another clausal form ∆|P can be described succinctly as follows:
∆|P = {α \ {¬P} | α ∈ ∆, P ∉ α}.
To further explain this statement, note that the clauses in ∆ can be partitioned
into three sets:

Figure 2.1: A connectivity graph for a clausal form (an undirected graph over
the variables A, B, C, D, E).

1. The set of clauses α containing positive literal P. These clauses do not
appear in ∆|P.

2. The set of clauses α containing negative literal ¬P. These clauses appear
in ∆|P, but after removing the literal ¬P from each.

3. The set of clauses α that contain neither P nor ¬P. These clauses appear
in ∆|P without change.

One can similarly express the conditioning of a clausal form on a negative literal:

∆|¬P = {α \ {P} | α ∈ ∆, ¬P ∉ α}.

For example, if

∆ = { {A, B, ¬C}, {¬A, D}, {B, C, D} },

then
∆|C = { {A, B}, {¬A, D} },
and
∆|¬C = { {¬A, D}, {B, D} }.
Note also that ∆|CA¬D = {∅} in this case and, hence, is inconsistent. More-
over, ∆|¬CD = ∅ in this case, and is therefore valid.
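Conditioning is a one-line set comprehension under the frozenset representation. This sketch (names mine) reproduces the examples above, including conditioning on several literals by iteration:

```python
def neg(lit):
    """Complement a literal: 'P' <-> '-P'."""
    return lit[1:] if lit.startswith('-') else '-' + lit

def condition(delta, lit):
    """Delta|lit: drop clauses containing lit; remove the complement of lit
    from the remaining clauses."""
    return {c - {neg(lit)} for c in delta if lit not in c}

delta = {frozenset({'A', 'B', '-C'}), frozenset({'-A', 'D'}),
         frozenset({'B', 'C', 'D'})}
print(condition(delta, 'C') == {frozenset({'A', 'B'}), frozenset({'-A', 'D'})})   # True
print(condition(delta, '-C') == {frozenset({'-A', 'D'}), frozenset({'B', 'D'})})  # True
# Conditioning on the literals C, A, -D in turn yields {emptyset}: inconsistent.
print(condition(condition(condition(delta, 'C'), 'A'), '-D'))  # {frozenset()}
```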

2.1.3 Connectivity graphs of clausal forms


Another concept related to a clausal form that we shall make extensive use of
is its connectivity graph. Specifically, the connectivity graph of a clausal form ∆
is an undirected graph G over the variables of ∆, where an edge exists between
two variables in G iff these variables appear in the same clause in ∆. Figure 2.1
depicts the connectivity graph for the following clausal form:

∆ = { {¬A, B}, {¬A, C}, {¬B, D}, {¬C, ¬D}, {A, ¬C, E} }.

A connectivity graph does not depend on the sign of literals that appear in
clauses, as it only depends on the variables that appear in such clauses. For
example, the connectivity graph in Figure 2.1 also corresponds to the following
clausal form:

Γ = { {A, B}, {A, C}, {B, D}, {C, D}, {A, C, E} }.

The structure of a connectivity graph can be quite helpful in revealing properties
of the underlying clausal form. We shall see, for example, that if the connectivity
graph has a tree structure, then the satisfiability of the underlying clausal form
can be decided in linear time.
A connectivity graph is also helpful in obtaining information about the effect
of applying resolution. Suppose that G is the connectivity graph of clausal form
∆. A clause that results from resolving two clauses over variable P cannot
mention any variables except those that are neighbors of P in G. Consider the
clausal form ∆ given above and its connectivity graph in Figure 2.1. A clause
which results from resolving over variable D cannot mention any variables except
B and C, since these are the neighbors of variable D in the connectivity graph.
This follows because any clause that mentions variable D in the clausal form
will include only variables that are neighbors of D in the connectivity graph
(this immediately follows from the definition of a connectivity graph).
As we apply resolution and add new clauses, the corresponding connectivity
graph grows as more edges are being added. It remains the case though that
neighbors of a variable give information on the clauses containing that variable
and, hence, give information on the clauses that will be generated when resolving
over that variable. We shall make use of this property later, when we analyze a
variant of resolution, known as directed resolution.
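Since the graph ignores signs, building it only requires collecting co-occurring variables. A small sketch (function names are my own) that reproduces Figure 2.1's edge set:

```python
from itertools import combinations

def var_of(lit):
    """Variable underlying a literal: both 'A' and '-A' map to 'A'."""
    return lit.lstrip('-')

def connectivity_graph(delta):
    """Undirected edges between variables co-occurring in some clause."""
    edges = set()
    for clause in delta:
        for p, q in combinations(sorted({var_of(l) for l in clause}), 2):
            edges.add((p, q))
    return edges

delta = {frozenset({'-A', 'B'}), frozenset({'-A', 'C'}), frozenset({'-B', 'D'}),
         frozenset({'-C', '-D'}), frozenset({'A', '-C', 'E'})}
print(sorted(connectivity_graph(delta)))
# [('A', 'B'), ('A', 'C'), ('A', 'E'), ('B', 'D'), ('C', 'D'), ('C', 'E')]
```

The clausal form Γ above, which differs from ∆ only in literal signs, produces exactly the same edge set.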

2.1.4 Applications of resolution


There are three main uses of resolution on a clausal form ∆. First, to prove that
∆ implies some sentence α. Next, to simplify the clausal form ∆ by uncovering
all unit clauses implied by ∆. Finally, to existentially quantify or forget variables
in ∆. We consider each one of these applications next.

2.2 Deduction using resolution


The first use of resolution we shall consider is in proving logical implications.
Given a knowledge base ∆ and some propositional sentence α, our goal is to
decide whether ∆ implies α: ∆ |= α. One approach to address this problem is
to convert each of ∆ and α into clausal forms ∆′ and α′, respectively, and then
to apply resolution to ∆′ until the clauses of α′ are derived. Note, however, that
since resolution is not complete, there is no guarantee that the clauses in α′
are derived, even though α is implied by ∆. Resolution is refutation complete
on clausal forms, however, meaning that if a clausal form is inconsistent, then
resolution is capable of deriving the empty clause. Given this property, the

common method to prove ∆ |= α is to convert ∆ ∧ ¬α into clausal form, and
then apply resolution to the result in an attempt to derive the empty clause.
We know that ∆ |= α iff ∆ ∧ ¬α is inconsistent. Hence, if we derive the empty
clause from the clausal form of ∆ ∧ ¬α, we know that ∆ |= α. Moreover, if
we exhaust all possible resolution steps and cannot derive the empty clause, we
know for sure that ∆ ⊭ α since resolution is refutation complete.
Consider the following knowledge base:
 
∆ = { A ∨ B ⇒ C,  C ⇒ D },

and the query:

α = ¬D ⇒ ¬A.
The following resolution trace proves that ∆ |= α, as it shows that the empty
clause is derived from the clausal form of ∆ ∧ ¬α.

1. {¬A, C} ∆
2. {¬B, C} ∆
3. {¬C, D} ∆
4. {¬D} ¬α
5. {A} ¬α

6. {¬C} 3, 4
7. {¬A} 1, 6
8. {} 5, 7

One observation here is that the clausal form of ∆ ∧ ¬α can be obtained
by computing the clausal form of ∆, and that of ¬α, independently, and then
taking the union of the two forms.
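The refutation method just described—saturate the clausal form of ∆ ∧ ¬α under resolution and watch for the empty clause—can be sketched as follows. This is an illustrative implementation of my own (exponential in the worst case), with tautologous resolvents skipped:

```python
from itertools import combinations

def neg(lit):
    return lit[1:] if lit.startswith('-') else '-' + lit

def refute(delta):
    """Saturate delta under resolution; True iff the empty clause is derived.

    Since resolution is refutation complete, this decides the consistency
    of delta (True means delta is inconsistent)."""
    clauses = set(delta)
    while True:
        new = set()
        for ci, cj in combinations(clauses, 2):
            for lit in ci:
                if neg(lit) in cj:
                    r = (ci - {lit}) | (cj - {neg(lit)})
                    if not r:
                        return True  # empty clause derived
                    if not any(neg(l) in r for l in r):  # skip tautologies
                        new.add(r)
        if new <= clauses:
            return False  # saturated without deriving {}
        clauses |= new

# Clausal form of Delta & -alpha from the trace above.
cnf = {frozenset({'-A', 'C'}), frozenset({'-B', 'C'}), frozenset({'-C', 'D'}),
       frozenset({'-D'}), frozenset({'A'})}
print(refute(cnf))  # True, hence Delta |= alpha
```

To test ∆ |= α, one would call `refute` on the union of the clausal forms of ∆ and ¬α, exactly as the observation above suggests.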
Suppose now that the query is:

β = D ⇒ B.

The following trace contains all possible resolutions in this case:

1. {¬A, C} ∆
2. {¬B, C} ∆
3. {¬C, D} ∆
4. {D} ¬β
5. {¬B} ¬β

6. {¬A, D} 1, 3
7. {¬B, D} 2, 3

Figure 2.2: A resolution graph. (Roots: {¬A, C}, {¬B, C}, {¬C, D}, {A},
{¬D}; resolvents: {C}, {¬C}, and the empty clause {}.)

Given that the empty clause could not be derived using resolution, we are
guaranteed that ∆ ∧ ¬β is consistent and, hence, ∆ ⊭ β.
Resolution satisfies another completeness property: If a clausal form ∆ im-
plies a clause C, then applying resolution to ∆ is guaranteed to generate a clause
which is a subset of C; that is, a clause that subsumes C. Consider the same
knowledge base ∆ given above and the query:

γ = ¬D ∧ E ⇒ ¬A ∧ ¬B.

Converting γ to clausal form, we get two clauses:

1.γ {¬A, D, ¬E}
2.γ {¬B, D, ¬E}

Closing the clausal form of ∆ under resolution, we obtain:

1. {¬A, C} ∆
2. {¬B, C} ∆
3. {¬C, D} ∆

4. {¬A, D} 1, 3
5. {¬B, D} 2, 3

Since Clause 4 subsumes Clause 1.γ, and Clause 5 subsumes Clause 2.γ, we
have proven that ∆ |= γ.

2.2.1 Resolution graphs


Figure 2.3: A resolution graph obtained using linear resolution. (Roots: {¬A, C},
{¬C, D}, {A}, {¬C, ¬D}; chain of resolvents: {C}, {D}, {¬C}, {}.)

One way to depict a resolution trace is through a resolution graph, which is
a directed acyclic graph (DAG) in which nodes are labelled with clauses; see
Figure 2.2. Each node in a resolution graph is either a root or has two parents. A
root node corresponds to a clause that exists in the initial knowledge base and,
hence, was not obtained using resolution. A node that has two parents corresponds to a
clause which was obtained by resolving the clauses corresponding to its parents.
In the resolution graph of Figure 2.2, the parents of clause {} are clauses {C}
and {¬C}. If there is a directed path from clause Ci to clause Cj in a resolution
graph, we say that Ci is an ancestor of clause Cj . In Figure 2.2, clause {¬C, D}
is an ancestor of clause {}. Some clauses in the knowledge base may not appear
in a resolution graph, indicating that the clauses were not used in the resolution
process. Resolution graphs are quite useful in describing resolution strategies,
which are discussed next.

2.2.2 Resolution strategies

The application of resolution can be very expensive. We usually have many
clauses whose resolvents are not helpful in deriving the empty clause, or in
deriving clauses that subsume the clausal form of the given query. This can
lead to a very large set of resolvents that takes too much time to generate and
requires too much space to store. A resolution strategy is a restriction on the
two clauses that can participate in a resolution step. The main purpose of
a resolution strategy is to reduce the resolution search space by limiting the
number of legal resolutions. Some resolution strategies are refutation complete,
thereby reducing the required time/space resources without compromising the
completeness guarantee offered by general resolution. We shall consider four
resolution strategies next, three of which are refutation complete. Two of these
strategies, unit resolution and directed resolution, have uses beyond refutation
and will be discussed in some detail later.

2.2.3 Linear resolution


Linear resolution is a resolution strategy which resolves two clauses Ci and Cj
only if one of the clauses is a root in the resolution graph or is an ancestor of
the other clause. The resolution graph in Figure 2.2 is not constructed using
linear resolution: clauses {C} and {¬C} are resolved even though neither of
them is a root, nor is one an ancestor of the other in the resolution graph. The
resolution graph depicted in Figure 2.3, however, is constructed according to
linear resolution. In the last resolution step, for example, neither clause {¬C}
nor clause {C} is a root (part of the initial knowledge base), but clause {C} is
an ancestor of clause {¬C}.
Linear resolution derives its name from the structure of resolution graphs
that it constructs. Specifically, a linear resolution graph contains a chain, in
which each clause has one parent that is immediately above it in the chain
(called the near parent), and another parent which is either an ancestor in the
chain or a root (called the far parent). The chain of Figure 2.3 is shown using
bold edges. The near parent of {¬C} is {D}, while its far parent is {¬C, ¬D}.
Linear resolution is refutation complete. Therefore, by considering only
linear resolution steps, one is guaranteed to derive the empty clause if the initial
knowledge base is inconsistent. Moreover, if the empty clause cannot be derived
using linear resolution, then the knowledge base must be consistent.

2.2.4 Set-of-support resolution


Given a clausal form ∆ and a subset Γ ⊆ ∆, we say that Γ is a set of support
for ∆ iff ∆ \ Γ is consistent. Intuitively, a set of support for a clausal form ∆
is a set of clauses in ∆ which are necessary for the inconsistency of ∆ (in case
∆ is inconsistent). Since our goal is usually to derive a contradiction if one
exists, the intuition is to insist on involving the set of support while attempting
to derive such a contradiction. Hence, set-of-support resolution is a resolution
strategy which requires that at least one of the resolved clauses be in the set
of support or be a descendant of some clause in it. For this strategy to be
effective, however, the set of support must be as small as possible.
Consider the set of clauses ∆ in Figure 2.2. The set Γ = { {A}, {¬D} } is
a set of support for ∆ in this case, since ∆ \ Γ is consistent. Moreover, the
resolution graph depicted in this figure satisfies the conditions dictated by set-
of-support resolution, given the above choice of Γ. Note that Γ = { {A} } is also
a set of support for ∆ in this case. But given this choice of Γ, the resolution
graph in Figure 2.2 violates the conditions of set-of-support resolution, since
clause {¬C} is neither in the set of support nor a descendant of that set.
One of the potential problems with set-of-support resolution is that of iden-
tifying a set of support. According to the definition, such identification would
generally require a consistency test which can be quite prohibitive. This is not
an issue, however, in the common case where one is using resolution to prove
that some sentence α is implied by a knowledge base ∆. In such a case, one
applies resolution to the clausal form of ∆ ∧ ¬α. It is quite common in this
application of resolution to assume that ∆ is consistent by construction; otherwise,
we would be trying to reason with an inconsistent knowledge base (a pathological
case). Under this assumption, the clausal form of ¬α is guaranteed to be
a set of support for the clausal form of ∆ ∧ ¬α.
Set-of-support resolution is refutation complete.

2.3 Unit resolution


A unit clause is a clause that contains a single literal. Unit resolution is a
resolution strategy which requires that at least one of the resolved clauses be a
unit clause. Unit resolution is not refutation complete, yet it is one of the more
influential resolution strategies. This is due to a number of reasons. First, unit
resolution is refutation complete for an important class of clausal forms, known
as Horn clausal form, which we shall discuss later. Second, one can apply all
possible unit resolution steps in time linear in the size of the given clausal form;
unit resolution is therefore very efficient. Third, since a unit clause implies a
specific truth value for the variable appearing in the clause, unit resolution can
be viewed as a simplification operation that uncovers the implications of setting
a propositional variable to a particular value. This is why unit resolution is
usually used as a simplification step in many algorithms, after some unit clauses
are added to the knowledge base, or after the knowledge base is conditioned on
some literals.
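The simplification view can be sketched as follows. This illustrative routine (names mine) exhausts unit clauses by conditioning on them—i.e., unit resolution combined with the subsumption cleanup that conditioning performs:

```python
def neg(lit):
    return lit[1:] if lit.startswith('-') else '-' + lit

def unit_resolution(delta):
    """Exhaust unit resolution by repeatedly conditioning on unit clauses.

    Returns (derived_literals, simplified_clauses); the second component is
    None if the empty clause was derived (delta is inconsistent)."""
    clauses = set(delta)
    literals = set()
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return literals, clauses
        (lit,) = unit
        literals.add(lit)
        # Drop clauses satisfied by lit; strip the complement elsewhere.
        clauses = {c - {neg(lit)} for c in clauses if lit not in c}
        if frozenset() in clauses:
            return literals, None

# Clauses 1-4 from the first trace of the chapter.
cnf = {frozenset({'-P', 'R'}), frozenset({'-Q', 'R'}), frozenset({'-R'}),
       frozenset({'P', 'Q'})}
lits, simplified = unit_resolution(cnf)
print(simplified)  # None: the empty clause was derived, cnf is inconsistent
```

Note that on this particular (non-Horn) input unit resolution happens to suffice; in general it is not refutation complete.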

2.3.1 Horn clausal form


A Horn clause is one which contains at most one positive literal. The clause
{¬A, ¬B, C} is therefore Horn, but the clause {¬A, B, C} is not. A Horn clausal
form is one which contains Horn clauses only. Unit resolution is refutation
complete for Horn clausal forms. Since all unit resolutions can be exhausted in
linear time, testing the consistency of Horn clausal forms can then be performed
in linear time. Note also that resolving two Horn clauses is guaranteed to yield
a Horn clause. Therefore, the class of Horn clauses is closed under resolution.
Given a Horn clausal form ∆ that is closed under unit resolution, and given
that ∆ does not contain the empty clause (∆ is consistent), one can obtain a
model ω of ∆ in time linear in the size of ∆ as follows:

1. For every unit clause C = {P} in ∆, set the value of variable P to true in
model ω, thereby satisfying clause C.

2. For every unit clause C = {¬P} in ∆, set the value of variable P to false
in model ω, thereby satisfying clause C.

3. For every clause C in ∆ which is not satisfied by the assignments of Steps 1
and 2, choose a negative literal ¬P from C, where P has not been assigned
a value, and set the value of P to false in model ω, thereby satisfying C.[3]

Another class which is closely connected to Horn clausal form is the renam-
able Horn clausal form. This is a clausal form which is not Horn, but can be
transformed into a Horn clausal form through a systematic process of renaming
variables. Consider the following clausal form:

∆ = { {P1 , . . . , Pn }, {¬P1 , Q}, . . . , {¬Pn , Q} },

which is not Horn since the first clause contains more than one positive literal.
Applying the following substitutions to the literals of ∆:

Pi → ¬Ri
¬Pi → Ri
Q → ¬S
¬Q → S,

leads to the following clausal form:

Γ = { {¬R1 , . . . , ¬Rn }, {R1 , ¬S}, . . . , {Rn , ¬S} },

which is Horn. It should be obvious that when clausal form Γ is obtained
from another clausal form ∆ using a similar systematic substitution, then ∆ is
satisfiable iff Γ is satisfiable. Moreover, a model of Γ can be easily transformed
into a model of ∆ by a reverse substitution. Hence, renamable Horn clausal
form is as tractable as Horn clausal form from the viewpoint of satisfiability.
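The renaming can be sketched as a sign flip on a chosen set of variables; instead of introducing fresh names Ri and S as above, this illustrative sketch (names mine) flips signs in place, which is the same transformation up to renaming:

```python
def rename(delta, flip):
    """Flip the sign of every literal whose variable is in `flip`."""
    def sub(lit):
        v = lit.lstrip('-')
        if v in flip:
            return v if lit.startswith('-') else '-' + v
        return lit
    return {frozenset(sub(l) for l in c) for c in delta}

def is_horn(delta):
    """A clausal form is Horn iff every clause has at most one positive literal."""
    return all(sum(not l.startswith('-') for l in c) <= 1 for c in delta)

# The example above with n = 2: not Horn, but renamable Horn.
delta = {frozenset({'P1', 'P2'}), frozenset({'-P1', 'Q'}), frozenset({'-P2', 'Q'})}
gamma = rename(delta, {'P1', 'P2', 'Q'})
print(is_horn(delta), is_horn(gamma))  # False True
```

Applying `rename` with the same set again recovers the original form, which is the "reverse substitution" used to map models of Γ back to models of ∆.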

2.4 Directed resolution


Let P be a propositional variable. A P -resolvent is a clause which results
from resolving two clauses on the literals P and ¬P . For example, the clause
{A, C, ¬D} is a B-resolvent of clauses {A, ¬B} and {B, C, ¬D}.
Directed resolution with respect to a variable ordering P1 , . . . , Pn is a reso-
lution strategy which requires two conditions:

• Pj-resolvents should be generated only after all Pi-resolvents have been
generated, where i < j.

• The generation of a Pj-resolvent cannot involve any clause that mentions
a variable Pi, where i < j.
[3] At least one such negative literal ¬P must exist. Otherwise, for all negative literals ¬P in
C, the value of P must be set to true (since C is not satisfied). This means that the knowledge
base must contain a unit clause {P} for each one of these literals, since only Step 1 can set
a variable value to true. But this means that unit resolution should have yielded the empty
clause {} if C contains no positive literals, or the unit clause {Q} if C contains the positive
literal Q. In either case, we have a contradiction with our assumptions.

Directed resolution is refutation complete for any variable ordering.
Consider the following knowledge base:

∆ = { ¬B ∨ ¬C ⇒ ¬A,  B ⇒ D,  C ⇒ ¬D,  ¬A ∧ C ⇒ E },

which has the following clausal form:

{ {¬A, B}, {¬A, C}, {¬B, D}, {¬C, ¬D}, {A, ¬C, E} }.

Let us now apply directed resolution using the order C, B, A, D, E. There are
two C-resolvents, but one of them is a tautology (it includes both A and ¬A), so
we are left only with {¬A, ¬D}. Moreover, there is one B-resolvent: {¬A, D}.
There are a number of A-resolvents but all involve variables that come before
A in the order; hence, they are not allowed by directed resolution. Similarly,
none of the D-resolvents are allowed by directed resolution. Finally, there are
no E-resolvents since variable E appears only positively in the clausal form.
Hence, directed resolution under the previous variable order generates only two
clauses. Since none of these clauses is the empty clause, and given that directed
resolution is refutation complete, we conclude that the original knowledge base
must be consistent.
The utility of directed resolution goes far beyond refutation, as it can be used
to forget variables and enumerate/count models. Moreover, the complexity of
directed resolution can be analyzed and bounded in terms of the structural
properties of given clausal form. Before we discuss these subjects, we introduce
a mechanism for applying directed resolution in a systematic fashion, which is
also quite useful in developing more insights into the method.
The mechanism is known as bucket elimination and proceeds in two stages,
in which we first construct and fill a set of buckets, and then process them.
Specifically, given a variable ordering π, we construct and fill buckets as follows:

• A bucket is constructed for each variable P and is labeled with variable P .

• Buckets are sorted top to bottom by their labels according to order π.

• Each clause α in the clausal form is added to the first Bucket P such that
variable P appears in clause α.

Buckets are then processed from top to bottom. To process Bucket P, we
generate all P-resolvents using only clauses in Bucket P, and then add these
resolvents to corresponding buckets below Bucket P . That is, each resolvent α
is added to the first Bucket P 0 below Bucket P , such that variable P 0 appears
in resolvent α. The processing of a Bucket P is completed when all P -resolvents
have been generated and added to their corresponding buckets, and given that
the empty clause has not been generated in the process. Bucket elimination is
completed when all buckets have been processed. Otherwise, if the empty clause

is generated, we say that bucket elimination has been aborted. The set of clauses
that appear in all buckets after bucket elimination has been completed is called
the directed extension of the given clausal form and variable order.
Consider again the clausal form given above and the variable order C, B, A, D, E.
Constructing and filling buckets leads to the following:[4]
C : {¬A, C}, {¬C, ¬D}, {A, ¬C, E}
B : {¬A, B}, {¬B, D}
A:
D:
E:
Processing Bucket C adds one C-resolvent to Bucket A:
C : {¬A, C}, {¬C, ¬D}, {A, ¬C, E}
B : {¬A, B}, {¬B, D}
A: {¬A, ¬D}
D:
E:
Processing Bucket B adds one B-resolvent to Bucket A:
C : {¬A, C}, {¬C, ¬D}, {A, ¬C, E}
B : {¬A, B}, {¬B, D}
A: {¬A, ¬D}, {¬A, D}
D:
E:
Processing Buckets A, D and finally Bucket E leads to no new resolvents.
The main value of bucket elimination is that it allows us to easily enforce
the constraints imposed by directed resolution. In particular, when process-
ing a bucket, we are guaranteed that only clauses appearing in the bucket can
participate in generating resolvents. Moreover, we shall see later that the di-
rected extension of a clausal form, which is available once bucket elimination is
completed, can be used to extract valuable information about the given clausal
form.
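The two stages of bucket elimination can be sketched directly. This is an illustrative implementation of my own; it assumes every variable of the clausal form appears in the given order, and it drops tautologous resolvents:

```python
def neg(lit):
    return lit[1:] if lit.startswith('-') else '-' + lit

def var_of(lit):
    return lit.lstrip('-')

def bucket_elimination(delta, order):
    """Directed resolution via bucket elimination. Returns the directed
    extension as {variable: set of clauses}, or None if aborted (the empty
    clause was derived, i.e. delta is inconsistent)."""
    rank = {v: i for i, v in enumerate(order)}
    buckets = {v: set() for v in order}

    def place(c):
        # A clause goes to the bucket of its earliest variable in the order.
        buckets[min({var_of(l) for l in c}, key=rank.__getitem__)].add(c)

    for c in delta:
        place(c)
    for v in order:  # process buckets top to bottom
        pos = [c for c in buckets[v] if v in c]
        negs = [c for c in buckets[v] if '-' + v in c]
        for ci in pos:
            for cj in negs:
                r = (ci - {v}) | (cj - {'-' + v})
                if not r:
                    return None  # aborted: empty clause derived
                if not any(neg(l) in r for l in r):  # skip tautologies
                    place(r)  # lands in a later bucket
    return buckets

delta = {frozenset({'-A', 'B'}), frozenset({'-A', 'C'}), frozenset({'-B', 'D'}),
         frozenset({'-C', '-D'}), frozenset({'A', '-C', 'E'})}
ext = bucket_elimination(delta, ['C', 'B', 'A', 'D', 'E'])
print(ext['A'])  # the two resolvents {-A,-D} and {-A,D} land in Bucket A
```

Rerunning with the order E, A, B, C, D reproduces the observation above: no resolvents are generated at all.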
The refutation completeness of directed resolution holds for any variable
order. The amount of work performed by directed resolution, however, is quite
dependent on the chosen order. We shall visit this issue at some length later, but
we illustrate it here with one example. Consider again the previous example
and suppose we choose the order E, A, B, C, D instead. Our buckets look as
follows initially:
E : {A, ¬C, E}
A : {¬A, B} {¬A, C}
B : {¬B, D}
C : {¬C, ¬D}
D:
[4] It is not uncommon for buckets to be empty. It is also possible for all clauses to fall in
the same bucket.



Processing the above buckets yields no resolvents in this case! Hence, directed
resolution with respect to the above order generates no resolvents, even though
general resolution would generate a number of resolvents in this case.
If bucket elimination is completed (that is, the empty clause was not gen-
erated), we are guaranteed that the original clausal form is consistent. This
follows from the fact that bucket elimination simply implements directed res-
olution, and from the fact that directed resolution is refutation complete. It
turns out, however, that directed resolution is too powerful to be used only for
testing consistency: it can also be used to forget variables, to count models,
and to systematically generate models. We discuss these applications next.

2.4.1 Forgetting

The use of directed resolution (and bucket elimination) in forgetting variables is
due to the following key result. Suppose that ∆ is a clausal form, and let Γ be
another clausal form which results from adding to ∆ all P -resolvents, and then
throwing out all clauses that mention P (hence, Γ does not mention variable
P ). It follows in this case that Γ is equivalent to ∃P ∆. Consider the following
clausal form,
∆ = { {¬A, B}, {¬B, C} }.
There is only one B-resolvent in this case: {¬A, C}. Adding this resolvent to
∆, and throwing out those clauses that mention B gives:

Γ = { {¬A, C} }.

This is equivalent to ∃B∆ which can be confirmed by computing ∆|B ∨ ∆|¬B.
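The key result translates into a short routine: add all P-resolvents, then drop every clause mentioning P. An illustrative sketch (names mine), again skipping tautologous resolvents:

```python
def neg(lit):
    return lit[1:] if lit.startswith('-') else '-' + lit

def forget(delta, v):
    """Existentially quantify variable v out of delta: add all v-resolvents,
    then drop every clause mentioning v. The result is equivalent to
    'exists v. delta'."""
    pos = [c for c in delta if v in c]
    negs = [c for c in delta if '-' + v in c]
    resolvents = {(ci - {v}) | (cj - {'-' + v}) for ci in pos for cj in negs}
    resolvents = {r for r in resolvents if not any(neg(l) in r for l in r)}
    rest = {c for c in delta if v not in c and '-' + v not in c}
    return rest | resolvents

# The example above: forgetting B from {{-A,B},{-B,C}} leaves {{-A,C}}.
delta = {frozenset({'-A', 'B'}), frozenset({'-B', 'C'})}
print(forget(delta, 'B') == {frozenset({'-A', 'C'})})  # True
```

Forgetting B, C and D in turn from the chain-of-implications example below would likewise leave just the clause {¬A, E}.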


Suppose now that we have a set of buckets corresponding to clausal form ∆
and variable order P1 , . . . , Pn . Given the above relationship between resolution
and forgetting, it is easy to show that once the first Bucket P1 is processed,
the remaining buckets must contain a clausal form which is equivalent to ∃P1 ∆.
Moreover, after processing Bucket P2 , Buckets P3 –Pn must contain a clausal
form equivalent to ∃P1 , P2 ∆. In general, after processing Buckets P1 –Pi , Buck-
ets Pi+1 –Pn must contain a clausal form equivalent to ∃P1 , . . . , Pi ∆.
Consider the knowledge base:

∆ = { A ⇒ B,  B ⇒ C,  C ⇒ D,  D ⇒ E },

and its directed extension under the variable order C, B, D, A, E:

C : {¬B, C} {¬C, D}
B : {¬A, B} {¬B, D}
D : {¬D, E} {¬A, D}
A : {¬A, E}
E :

The last two buckets contain the clause {¬A, E}, which is the result of forgetting
variables B, C, D from the original knowledge base ∆. That is, A⇒E is all that
knowledge base ∆ says about variables A, E in this case. The last bucket is
empty, which says that knowledge base ∆ says nothing about variable E alone.
Suppose now that the knowledge base ∆ is augmented with the unit clause
{D}. The directed extension of ∆ is then:

C : {¬B, C}   {¬C, D}
B : {¬A, B}   {¬B, D}
D : {¬D, E}   {D}   {¬A, D}
A :           {¬A, E}
E :           {E}.

The last bucket contains clause {E} in this case, which means that literal E is
all that the knowledge base ∆ ∪ {D} says about variable E.
For another example, consider the knowledge base:

∆ = { ¬B ∨ ¬C ⇒ ¬A, B ⇒ D, C ⇒ ¬D, ¬A ∧ C ⇒ E },

and its directed extension:


A : {A, ¬C, E}   {¬A, C}   {¬A, B}
B : {¬B, D}      {B, ¬C, E}
D : {¬C, ¬D}     {¬C, D, E}
C :              {¬C, E}
E :

Together, the last two buckets contain only the clause {¬C, E}, which means that C⇒E is all
that knowledge base ∆ says about variables C and E.

2.4.2 Model enumeration and counting


One way to characterize the models of a knowledge base is using a decision
tree. This is a rooted binary tree, in which each internal node is labelled with
a variable P and has two children: a high child which corresponds to literal P ,
and a low child corresponding to literal ¬P . Each leaf node in a decision tree is
labelled with either true or false; see Figure 2.4. We also require that for each
internal node N in the decision tree, there is at least one leaf node below N
which is labelled with true.
Consider now the following knowledge base:

∆ = { ¬B ∨ ¬C ⇒ ¬A, B ⇒ D, C ⇒ ¬D, ¬A ∧ C ⇒ E },
Adnan Darwiche c 2019 17

A
├─ A:  false
└─ ¬A: B
       ├─ B:  C
       │      ├─ C:  false
       │      └─ ¬C: D
       │             ├─ D:  true
       │             └─ ¬D: false
       └─ ¬B: C
              ├─ C:  D
              │      ├─ D:  false
              │      └─ ¬D: E
              │             ├─ E:  true
              │             └─ ¬E: false
              └─ ¬C: true

Figure 2.4: A decision tree (the original figure, rendered textually: for each
node, the branch labelled with the variable's positive literal is the high
child, and the branch labelled with its negative literal is the low child).

and the decision tree in Figure 2.4. This decision tree characterizes the models
of ∆ in the following sense. Given any world ω, we can check whether ω is a
model of ∆ as follows. We start at the root, and for each node labelled with
variable P , we go to the high child if ω(P ) = true and to the low child if
ω(P ) = false. If we end up at a leaf node labelled with true, we know that ω is
a model of ∆. If we end up at a leaf node labelled with false, we know that ω
is not a model of ∆.

The main property of a decision tree for a knowledge base ∆ is that its size
is O(|Mods(∆)|); that is, the size of the tree is bounded by the number of
models (up to a constant factor). Moreover, one can enumerate the models of ∆,
and also count them, in time linear in the size of the decision tree.

To enumerate and/or count the models characterized by a decision tree, we
observe two facts about each leaf node N which is labelled with true. First, the
node N corresponds to a term α, constructed by traversing the path from the
root to node N : we add literal P to α whenever the path leaves a node labelled
with P through its high child, and we add literal ¬P to α whenever the path
leaves such a node through its low child.
For example, the leaf node at the extreme right of Figure 2.4 corresponds to the
term ¬A ∧ ¬B ∧ ¬C. Suppose now that m variables appear in the term α, and let
n be the total number of variables. It then follows that the term α characterizes
exactly 2^(n−m) models, which can be obtained by arbitrarily assigning values to
the n − m variables not appearing in α. For example, the leaf node we just
mentioned in Figure 2.4 characterizes the following four models:

¬A ∧ ¬B ∧ ¬C ∧ D ∧ E
¬A ∧ ¬B ∧ ¬C ∧ D ∧ ¬E
¬A ∧ ¬B ∧ ¬C ∧ ¬D ∧ E
¬A ∧ ¬B ∧ ¬C ∧ ¬D ∧ ¬E.

The second fact to observe is that the models characterized by one leaf
node labelled with true are disjoint from the set of worlds characterized by
any other leaf node. Hence, we can easily count the total number of models
characterized by a decision tree by simply visiting each leaf node labelled with
true, while accumulating the number of models it characterizes. To enumerate
models, we follow the same procedure, except that we explicitly enumerate the
models characterized by each such leaf node. For example, the decision tree in
Figure 2.4 characterizes 7 = 2 + 1 + 4 models, four of which have been enumerated
above.
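The counting scheme just described is easy to express in code. In the illustrative Python sketch below (not from the notes), a decision tree is either a Boolean leaf or a tuple (variable, high child, low child); the tree used in the example is one consistent with Figure 2.4.

```python
def count_models(node, n, fixed=0):
    """Count the models characterized by a decision tree over n variables.
    `fixed` is the number of variables assigned on the path so far; a true
    leaf reached after fixing m variables contributes 2**(n - m) models."""
    if isinstance(node, bool):
        return 2 ** (n - fixed) if node else 0
    _, high, low = node
    return count_models(high, n, fixed + 1) + count_models(low, n, fixed + 1)

# A decision tree consistent with Figure 2.4 (high child listed first).
tree = ('A', False,
        ('B',
         ('C', False, ('D', True, False)),
         ('C', ('D', False, ('E', True, False)), True)))

count_models(tree, 5)   # -> 2 + 1 + 4 = 7
```

Enumeration follows the same recursion, emitting the path term at each true leaf and expanding the free variables.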
As we shall see next, given a knowledge base ∆ and its directed extension,
we can build a decision tree that characterizes the models of ∆ in O(s1 s2 ) time,
where s1 is the size of the directed extension of ∆ and s2 is the number of models of ∆.
Specifically, suppose that Γ1 , . . . , Γn are the ordered buckets constituting the
directed extension of ∆, where variable Pi is the label of Bucket Γi . Then calling
Algorithm 1 on (Γ1 , . . . , Γn ; true) will return a decision tree for knowledge base
∆. For an example, consider the knowledge base,

∆ = { ¬B ∨ ¬C ⇒ ¬A, B ⇒ D, C ⇒ ¬D, ¬A ∧ C ⇒ E },

and its directed extension given order E, D, C, B, A:

ΓE : {A, ¬C, E}
ΓD : {¬B, D} {¬C, ¬D}
ΓC : {¬A, C} {¬B, ¬C}
ΓB : {¬A, B} {¬A, ¬B}
ΓA : {¬A}.

Calling Algorithm 1 on the ordered buckets (ΓE , ΓD , ΓC , ΓB , ΓA ; true) returns
the decision tree depicted in Figure 2.4.
Algorithm 1 is based on the following observation. Each recursive call to
the algorithm is passed a set of buckets Γi , . . . , Γj and a term α over vari-
ables Pj+1 , . . . , Pn . By construction, the term α satisfies the following property:
Γj+1 |α, . . . , Γn |α are all empty buckets since α is consistent with the clauses in
Buckets Γj+1 , . . . , Γn and mentions all variables in these clauses. Therefore,
given α, Bucket Γj contains all the constraints that Buckets Γj , . . . , Γn impose
on variable Pj . In fact, we have a stronger property: given α, Bucket Γj con-
tains all the constraints that Buckets Γ1 , . . . , Γn impose on variable Pj , since

Algorithm 1 generateDecisionTree(Ordered buckets: Γi , . . . , Γj ; Term: α):
Returns a decision tree which characterizes the models of clausal form Γi |α ∪ · · · ∪
Γj |α. Bucket Γi is labelled with variable Pi .
1: if Γi |α, . . . , Γj |α are all empty then
2: return node labelled with true
3: if Γj |Pj ∧ α contains empty clause then
4: H ← node labelled with false
5: else
6: H ← generateDecisionTree(Γi , . . . , Γj−1 ; Pj ∧ α)
7: if Γj |¬Pj ∧ α contains empty clause then
8: L ← node labelled with false
9: else
10: L ← generateDecisionTree(Γi , . . . , Γj−1 ; ¬Pj ∧ α)
11: construct node N with label Pj , high child H, and low child L
12: return node N

Buckets Γj , . . . , Γn contain the projection of Buckets Γ1 , . . . , Γn on variables
Pj , . . . , Pn (see Section 2.4.1). Hence, to check whether there is a model which
is consistent with Pj ∧ α or with ¬Pj ∧ α, it is sufficient to check the consistency
of Bucket Γj with Pj ∧ α and with ¬Pj ∧ α, as is done on Lines 3 and 7. The
ability to perform this consistency check efficiently is what gives Algorithm 1
its linear time complexity.
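Algorithm 1 can be rendered in Python roughly as follows (an illustrative sketch, not from the notes). Clauses are frozensets of signed integers, buckets[0..j] play the role of Γ1 , . . . , Γj+1 , labels[i] is the variable of bucket i, and alpha is a set of literals; leaves are the Booleans True/False and internal nodes are tuples (variable, high child, low child). As in the text, consistency is checked against Bucket Γj only.

```python
def condition(clauses, alpha):
    """Condition clauses on the term alpha (a set of literals): drop clauses
    satisfied by alpha, remove falsified literals from the rest."""
    out = []
    for c in clauses:
        if c & alpha:                       # clause satisfied by alpha
            continue
        out.append(frozenset(l for l in c if -l not in alpha))
    return out

def generate_dt(buckets, labels, j, alpha=frozenset()):
    """Return a decision tree characterizing the models of the clausal form
    held in buckets[0..j], conditioned on alpha (Algorithm 1)."""
    # lines 1-2: every bucket satisfied by alpha -> a `true` leaf
    if all(not condition(b, alpha) for b in buckets[:j + 1]):
        return True
    p = labels[j]
    def child(lit):
        a = alpha | {lit}
        # lines 3-4 and 7-8: inconsistency check against bucket j only
        if frozenset() in condition(buckets[j], a):
            return False
        return generate_dt(buckets, labels, j - 1, a)
    # lines 11-12: internal node with high and low children
    return (p, child(p), child(-p))

A, B, C, D, E = 1, 2, 3, 4, 5
buckets = [
    [frozenset({A, -C, E})],                      # ΓE
    [frozenset({-B, D}), frozenset({-C, -D})],    # ΓD
    [frozenset({-A, C}), frozenset({-B, -C})],    # ΓC
    [frozenset({-A, B}), frozenset({-A, -B})],    # ΓB
    [frozenset({-A})],                            # ΓA
]
labels = [E, D, C, B, A]
tree = generate_dt(buckets, labels, len(buckets) - 1)
```

On this input the call reproduces the shape of the tree in Figure 2.4: the root is labelled A with a false high child, and the tree has three true leaves characterizing 2, 1, and 4 models respectively.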

2.5 The structural complexity of directed resolution
The specific variable order used by directed resolution can have a dramatic effect
on the amount of work performed by the method. But one can derive an upper
bound on the amount of work performed under a given ordering by examining
the connectivity graph of the given clausal form. Such structural analysis not only
allows us to bound the time and space complexity of directed resolution, but it
also provides insights into how to choose good variable orderings. We will treat
all these subjects in this section.
The complexity analysis we shall conduct next relies on the following ob-
servation. If a bucket has ≤ k variables, then that bucket has O(3^k) clauses,
which is a tight bound on the number of clauses that one can express using ≤ k
variables (each variable occurs positively, occurs negatively, or is absent from a
clause). In such a case, the space complexity of directed resolution is O(n 3^k),
where n is the number of buckets. Moreover, processing a bucket with O(3^k)
clauses takes O((3^k)^2) time. Hence, we can obtain upper bounds on the time
and space complexity of directed resolution once we can establish an upper
bound on the number of variables that can ever appear in the same bucket
during the process of directed resolution.
Figure 2.5: Connectivity graphs GA , GB , GC , GD , GE , corresponding to
different stages of directed resolution.

To establish such a bound on the number of variables in a bucket, we will

appeal to the connectivity graph of a clausal form. We will first consider an
example, and then use it to draw some general principles. Consider the directed
extension:
A : {A, ¬C, E}   {¬A, C}   {¬A, B}
B : {¬B, D}      {B, ¬C, E}
C : {¬C, ¬D}     {¬C, D, E}
D :
E :

Note that the original clausal form is shown on the left, with clauses added
by directed resolution shown on the right. Consider now the sequence of con-
nectivity graphs depicted in Figure 2.5. The first graph, GA , corresponds to
clauses in Buckets A–E, just before processing Bucket A. The second graph,
GB , corresponds to clauses in Buckets B–E, just before processing Bucket B.
The third graph, GC , corresponds to clauses in Buckets C–E, just before pro-
cessing Bucket C. And so on. A number of observations are in order about
these successive connectivity graphs:
• Some of the graphs contain dotted edges. These edges correspond to
clauses that were added while applying directed resolution.
• The neighbors of variable A in graph GA are exactly the variables appearing
in Bucket A, aside from variable A, when that bucket is about to be
processed. This is indeed true for any other bucket and its corresponding
connectivity graph.
• The transition from graph GA to GB can be described as a process of
connecting the neighbors of A in GA , and then removing variable A from
GA . The transition from graph GB to graph GC , and the other remaining
transitions, can be described similarly.
In general, however, only a subset of the neighbors may be connected during a
transition. The particular subset of neighbors that get connected depends on
the clauses that appear in the bucket whose processing triggers the transition.
In the worst case, though, all neighbors may end up being connected.

2.5.1 Bounding the size of buckets


If one has access to the successive connectivity graphs GP1 , . . . , GPn correspond-
ing to directed resolution using variable order P1 , . . . , Pn , then one can easily
identify variables that are present in Bucket Pi , when that bucket is about to be
processed. Specifically, the variables in Bucket Pi , aside from Pi , are exactly the
neighbors of Pi in connectivity graph GPi . Obtaining these successive connec-
tivity graphs, however, requires that we execute directed resolution, which can
be prohibitive. Therefore, instead of generating graphs GP1 , . . . , GPn , we will
generate another set of graphs G′P1 , . . . , G′Pn , where each G′Pi is a supergraph
of GPi . In this case, the number of neighbors of Pi in G′Pi represents an upper
bound on the number of its neighbors in GPi , leading us to an upper bound
on the number of variables in each bucket, instead of an exact count of these
variables.
We can obtain such supergraphs by assuming a worst case scenario. That is,
when processing Bucket Pi , we assume that enough resolvents will be generated
so that every pair of variables in Bucket Pi will end up appearing in some
resolvent. This means that the next connectivity graph G0Pi+1 can be generated
from the current one, G0Pi , by pair-wise connecting the neighbors of Pi in G0Pi ,
and then removing variable Pi from G0Pi . This does not require the execution of
directed resolution. In fact, it does not even require knowledge of the original
clausal form. The whole sequence of graphs G0P1 , . . . , G0Pn can be generated once
we know the initial graph G0P1 = GP1 and the variable order P1 , . . . , Pn .
The above worst-case analysis leads to the following guarantee. If the number
of neighbors that variable Pi has in graph G′Pi is k, then the number of variables
that appear in Bucket Pi must be no more than k + 1.
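This worst-case elimination process is straightforward to code. The Python sketch below (illustrative, not from the notes) eliminates variables in order, recording each variable's neighbor count just before it is removed; that count plus one bounds the size of the corresponding bucket. The example graph is the connectivity graph of the clausal form {A, ¬C, E}, {¬A, C}, {¬A, B}, {¬B, D}, {¬C, ¬D} used above, with order A, B, C, D, E.

```python
def elimination_degrees(edges, order):
    """Eliminate the variables in `order` from the graph given by `edges`,
    pairwise-connecting each variable's neighbors before removing it
    (the worst-case supergraph sequence). Returns the list of neighbor
    counts d_i observed at elimination time; Bucket P_i then contains at
    most d_i + 1 variables."""
    adj = {p: set() for p in order}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    degrees = []
    for p in order:
        nbrs = adj.pop(p)
        degrees.append(len(nbrs))
        for u in nbrs:
            adj[u].discard(p)
            adj[u] |= nbrs - {u}   # worst case: connect all neighbors pairwise
    return degrees

edges = [('A','B'), ('A','C'), ('A','E'), ('C','E'), ('B','D'), ('C','D')]
ds = elimination_degrees(edges, ['A', 'B', 'C', 'D', 'E'])
# ds == [3, 3, 2, 1, 0]: every bucket holds at most 3 + 1 = 4 variables
```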

2.5.2 Treewidth
The above complexity analysis of directed resolution can be made more formal
using the notion of treewidth. Let G be a graph, and let P1 , . . . , Pn be an
ordering of nodes in G. Suppose that we eliminate variables P1 , . . . , Pn from
G in that order, where eliminating variable Pi is accomplished by pair-wise
22 Class Notes for CS264A, UCLA

connecting all neighbors of Pi in the graph, and then removing Pi from the
graph. Let di represent the number of neighbors that variable Pi has before it
is eliminated. The width of variable order P1 , . . . , Pn with respect to graph G is
then defined as max(d1 , . . . , dn ). Moreover, the treewidth of graph G is defined as the
smallest width attained by any variable order with respect to G.
Therefore, given a clausal form ∆ with connectivity graph G, and given a
variable ordering π which has width w with respect to G, the space complexity
of directed resolution using order π is O(n 3^(w+1)), and its time complexity is
O(n (3^(w+1))^2), where n is the number of variables in ∆. Note that the quality
of these upper bounds depends on the width w of order π: they can be useless
when w is large. Moreover, these bounds are smallest when the width of order π
equals the treewidth of graph G, that is, when π is an optimal order with
respect to graph G.
It is important to note that the width of an order can be high, even reaching
n − 1, yet directed resolution can be very efficient. Consider for example the knowl-
edge base: P1 ⇒P2 , P1 ⇒P3 , . . . , P1 ⇒Pn . The width of order π = P1 , P2 , . . . , Pn
is n − 1 in this case. Therefore, our bounds above are O(n 3^n) for space and
O(n (3^n)^2) for time, which are useless. Note, however, that applying directed
resolution using order π leads to adding no resolvents whatsoever. This is
an example where the worst-case analysis underlying the derivation of these
bounds differs greatly from the actual scenario. Note also here that order
π = P2 , P3 , . . . , Pn , P1 has width 1. Therefore, our bounds in this case become
O(n 3^2) for space and O(n (3^2)^2) for time.
The above analysis not only allows us to bound the computational resources
of directed resolution under different variable orders, but also suggests ways
to generate good orders based on the given connectivity graph. Basically, we
want to generate an order whose width is as small as possible, as that allows
us to establish the tightest bound on the time and space complexity of directed
resolution. Generating an optimal variable order (one with minimal width) for
arbitrary graphs is known to be NP-hard. Yet, there are efficient procedures
for a variety of graph classes, including trees and graphs that have a bounded
treewidth. It is also more common to use heuristic methods for generating variable
orderings, instead of using optimal methods. One of the simplest heuristics
is the min-degree heuristic, which generates a variable ordering by repeatedly choosing the
variable that has the fewest neighbors in the connectivity graph. Once
a variable is chosen, it is eliminated from the graph after pair-wise connect-
ing all its neighbors, and the next variable is chosen using a similar procedure.
Consider the connectivity graph in Figure 2.1 for an example. The min-degree
heuristic would generate the variable ordering B, D, A, C, E, assuming that ties
are broken by preferring variables that are earlier in the alphabet.
Another related heuristic, which is generally more effective, is to choose the
variable whose elimination leads to adding the smallest number of new edges to
the connectivity graph. This heuristic is known as the min-fill heuristic.
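Both heuristics are easy to prototype. The sketch below (illustrative Python, not from the notes) implements min-degree with alphabetical tie-breaking, as in the text, together with the min-fill score (the number of new edges that eliminating a variable would add). The edge list is assumed to be the connectivity graph of the running example (Figure 2.1); on it, min-degree reproduces the ordering B, D, A, C, E stated above.

```python
def min_degree_order(edges):
    """Generate an elimination ordering with the min-degree heuristic:
    repeatedly pick the variable with the fewest current neighbors (ties
    broken alphabetically), connect its neighbors pairwise, remove it."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    order = []
    while adj:
        p = min(adj, key=lambda x: (len(adj[x]), x))
        nbrs = adj.pop(p)
        for u in nbrs:
            adj[u].discard(p)
            adj[u] |= nbrs - {u}
        order.append(p)
    return order

def fill_in(adj, p):
    """Min-fill score: number of new edges eliminating p would add."""
    nbrs = list(adj[p])
    return sum(1 for i, u in enumerate(nbrs)
                 for v in nbrs[i + 1:] if v not in adj[u])

# Connectivity graph of the running example (assumed to match Figure 2.1):
edges = [('A','B'), ('A','C'), ('A','E'), ('C','E'), ('B','D'), ('C','D')]
min_degree_order(edges)   # -> ['B', 'D', 'A', 'C', 'E']
```

A min-fill ordering is obtained by replacing the selection key with `fill_in`; on this graph, for instance, eliminating E first adds no edges at all.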
