Knowledge Engineering: Building Cognitive Assistants for Evidence-based Reasoning
GHEORGHE TECUCI
George Mason University
DORIN MARCU
George Mason University
MIHAI BOICU
George Mason University
DAVID A. SCHUM
George Mason University
One Liberty Plaza, New York, NY 10006
www.cambridge.org
Information on this title: www.cambridge.org/9781107122567
© Gheorghe Tecuci, Dorin Marcu, Mihai Boicu, and David A. Schum 2016
A catalog record for this publication is available from the British Library.
Library of Congress Cataloging-in-Publication Data
Names: Tecuci, Gheorghe, author. | Marcu, Dorin, author. | Boicu, Mihai, author. |
Schum, David A., author.
Title: Knowledge engineering: building cognitive assistants for evidence-based reasoning /
Gheorghe Tecuci, George Mason University, Dorin Marcu, George Mason University,
Mihai Boicu, George Mason University, David A. Schum, George Mason University.
Description: New York NY : Cambridge University Press, 2016. | Includes bibliographical references and index.
Identifiers: LCCN 2015042941 | ISBN 9781107122567 (Hardback : alk. paper)
Subjects: LCSH: Expert systems (Computer science) | Intelligent agents (Computer software) | Machine learning |
Artificial intelligence | Knowledge, Theory of–Data processing.
Classification: LCC QA76.76.E95 T435 2016 | DDC 006.3/3–dc23 LC record available at
http://lccn.loc.gov/2015042941
Preface page xv
Acknowledgments xxi
About the Authors xxiii
1 Introduction 1
1.1 Understanding the World through Evidence-based
Reasoning 1
1.1.1 What Is Evidence? 1
1.1.2 Evidence, Data, and Information 1
1.1.3 Evidence and Fact 2
1.1.4 Evidence and Knowledge 2
1.1.5 Ubiquity of Evidence 5
1.2 Abductive Reasoning 5
1.2.1 From Aristotle to Peirce 5
1.2.2 Peirce and Sherlock Holmes on Abductive
Reasoning 6
1.3 Probabilistic Reasoning 9
1.3.1 Enumerative Probabilities: Obtained by Counting 9
1.3.1.1 Aleatory Probability 9
1.3.1.2 Relative Frequency and Statistics 9
1.3.2 Subjective Bayesian View of Probability 11
1.3.3 Belief Functions 13
1.3.4 Baconian Probability 16
1.3.4.1 Variative and Eliminative Inferences 16
1.3.4.2 Importance of Evidential Completeness 17
1.3.4.3 Baconian Probability of Boolean
Expressions 20
1.3.5 Fuzzy Probability 20
1.3.5.1 Fuzzy Force of Evidence 20
1.3.5.2 Fuzzy Probability of Boolean Expressions 21
1.3.5.3 On Verbal Assessments of Probabilities 22
1.3.6 A Summary of Uncertainty Methods and What
They Best Capture 23
1.4 Evidence-based Reasoning 25
1.4.1 Deduction, Induction, and Abduction 25
1.4.2 The Search for Knowledge 26
1.4.3 Evidence-based Reasoning Everywhere 27
5 Ontologies 155
5.1 What Is an Ontology? 155
5.2 Concepts and Instances 156
5.3 Generalization Hierarchies 157
5.4 Object Features 158
5.5 Defining Features 158
5.6 Representation of N-ary Features 160
5.7 Transitivity 161
5.8 Inheritance 162
5.8.1 Default Inheritance 162
5.8.2 Multiple Inheritance 162
5.9 Concepts as Feature Values 163
5.10 Ontology Matching 164
5.11 Hands On: Browsing an Ontology 165
5.12 Project Assignment 4 168
5.13 Review Questions 168
References 433
Appendixes 443
Summary: Knowledge Engineering Guidelines 443
Summary: Operations with Disciple-EBR 444
Summary: Hands-On Exercises 446
Index 447
Preface
BOOK PURPOSE
BOOK CONTENTS
Here is a route or map we will follow in the learning venture you will have
with the assistance of Disciple-EBR. Chapter 1 is a general introduction to
the topics that form the basis of this book. It starts with the problem
of understanding the world through evidence-based reasoning. It then
presents abductive reasoning, five different conceptions of probability
(enumerative, subjective Bayesian, Belief Functions, Baconian, and Fuzzy),
and how deductive, abductive, and inductive (probabilistic) reasoning are
used in evidence-based reasoning. After that, it introduces artificial intelli-
gence and intelligent agents, and the challenges of developing such agents
through conventional knowledge engineering. Afterward, it introduces the
development of agents through teaching and learning, which is the
approach presented in this book.
Chapter 2 is an overview of evidence-based reasoning, which is a focus
of this book. It starts with a discussion of the elements that make evidence-
based reasoning an astonishingly complex task. It then introduces a sys-
tematic approach that integrates abduction, deduction, and induction to
solve a typical evidence-based reasoning task, using intelligence analysis as
an example. Finally, it shows the application of the same approach to other evidence-based reasoning domains.
BACKGROUND
Acknowledgments
About the Authors
1 Introduction
We can try to understand the world in various ways, an obvious one being the employment of
empirical methods for gathering and analyzing various forms of evidence about phenomena,
events, and situations of interest to us. This will include work in all of the sciences, medicine,
law, intelligence analysis, history, political affairs, current events, and a variety of other contexts
too numerous to mention. In the sciences, this empirical work will involve both experimental
and nonexperimental methods. In some of these contexts, notably in the sciences, we are able
to devise mathematical and logical models that allow us to make inferences and predictions
about complex matters of interest to us. But in every case, our understanding rests on our
knowledge of the properties, uses, discovery, and marshaling of evidence. This is why we begin
this book with a careful consideration of reasoning based on evidence.
what we believe it may be, and how we obtain it. Two questions we would normally ask
regarding what Bob just told us are as follows:
Does Bob really know what he just told us, that the Ford car did not stop at the red light
signal?
Do we ourselves then also know, based on Bob’s testimony, that the Ford car did not
stop at the red light signal?
Let’s consider the first question. For more than two millennia, some very learned people
have troubled over the question: What do we mean when we say that person A knows that
event B occurred? To apply this question to our source Bob, let’s make an assumption that
will simplify our answering this question. Let’s assume that Bob is a competent observer in
this matter. Suppose we have evidence that Bob was actually himself at the intersection
when the accident happened. This is a major element of Bob’s competence. Bob’s
credibility depends on different matters, as we will see.
Here is what a standard or conventional account says about whether Bob knows
that the car did not stop at the red light signal. First, here is a general statement of the
standard account of knowledge: Knowledge is justified true belief. Person A knows that event B occurred if:

1. Event B did occur.
2. A obtained nondefective evidence that event B occurred.
3. A believed this evidence and was justified in this belief by the nondefective evidence of B's occurrence.

This standard analysis first says that event B must have occurred for A to have knowledge of its occurrence. This is what makes A's belief true. If B did not occur, then A could not know that it occurred. Second, A's getting nondefective evidence that B occurred is actually where A's competence arises. A could not have gotten any evidence, defective or nondefective, if A was not where B could have occurred. Then, A believed the evidence received about the occurrence of event B, and A was justified in having this belief by obtaining nondefective evidence of B's occurrence.

So, in the case involving Bob's evidence, Bob knows that the Ford car did not stop at the red light signal if:

1. The Ford car did not stop at the red light signal.
2. Bob obtained nondefective (sensory) evidence that the Ford car did not stop at the red light signal.
3. Bob believed this evidence and was justified in this belief by the nondefective evidence he obtained.
If all of these three things are true, we can state on this standard analysis that Bob knows
that the Ford car did not stop at the red light signal.
Before we proceed, we must acknowledge that this standard analysis has been very
controversial in fairly recent years and some philosophers claim to have found alleged
paradoxes and counterexamples associated with it. Other philosophers dispute these claims.
Most of the controversy here concerns the justification condition: What does it mean to say
that A is justified in believing that B occurred? In any case, we have found this standard
analysis very useful as a heuristic in our analyses of the credibility of testimonial evidence.
But now we have several matters to consider in answering the second question: Do we
ourselves also know, based on Bob’s testimony, that the Ford car did not stop at the red
light signal? The first and most obvious fact is that we do not know the extent to which any
of the three events just described in the standard analysis are true. We cannot get inside
Bob’s head to obtain necessary answers about these events. Starting at the bottom, we do
not know for sure that Bob believes what he just told us about the Ford car not stopping at
the red light signal. This is a matter of Bob’s veracity or truthfulness. We would not say that
Bob is being truthful if he told us something he did not believe.
Second, we do not know what sensory evidence Bob obtained on which to base his
belief and whether he based his belief at all on this evidence. Bob might have believed that
the Ford car did not stop at the red light signal either because he expected or desired this
to be true. This involves Bob’s objectivity as an observer. We would not say that Bob was
objective in this observation if he did not base his belief on the sensory evidence he
obtained in his observation.
Finally, even if we believe that Bob was an objective observer who based his belief
about the accident on sensory evidence, we do not know how good this evidence was.
Here we are obliged to consider Bob’s sensory sensitivities or accuracy in the conditions
under which Bob made his observations. Here we consider such obvious things as Bob’s
visual acuity. But there are many other considerations, such as, “Did Bob only get a
fleeting look at the accident when it happened?" "Is Bob color-blind?" "Did he make this
observation during a storm?” and, “What time of day did he make this observation?” For a
variety of such reasons, Bob might simply have been mistaken in his observation: The light
signal was not red when the Ford car entered the intersection.
So, what it comes down to is that the extent of our knowledge about whether the Ford
car did not stop at the red light signal, based on Bob’s evidence, depends on these three
attributes of Bob’s credibility: his veracity, objectivity, and observational sensitivity. We
will have much more to say about assessing the credibility of sources of evidence, and how
Disciple-EBR can assist you in this difficult process, in Section 4.7 of this book.
Now, we return to our role as police investigators. Based on evidence we have about
Bob’s competence and credibility, suppose we believe that the event he reports did occur;
we believe that “E: The Ford car did not stop at the red light signal,” did occur. Now we
face the question: So what? Why is knowledge of event E of importance to us? Stated more
precisely: How is event E relevant in further inferences we must make? In our investigation
so far, we have other evidence besides Bob’s testimony. In particular, we observe a Toyota
car that has smashed into a light pole at this intersection, injuring the driver of the Toyota,
who was immediately taken to a hospital. In our minds, we form the tentative chain of
reasoning from Figure 1.1.
Figure 1.1. The chain of reasoning from Bob's testimony E* to hypothesis H.
E*: Bob's testimony that the Ford car did not stop at the red light signal at this intersection.
E: The Ford car did not stop at the red light signal at this intersection.
F: The driver of the Toyota car, having a green light at this intersection, saw the Ford car running the red light.
G: The driver of the Toyota swerved to avoid the Ford car and smashed into a light pole.
H: The driver of the Ford car bears the responsibility for the accident.

This sequence of events, E ➔ F ➔ G ➔ H, is a relevance argument or chain of reasoning whose links represent sources of doubt interposed between the evidence E* and the hypothesis H. An important thing to note is that some or all of these events may not be true. Reducing our doubts or uncertainties regarding any of these events requires a variety
of additional evidence. The extent of our knowledge about the relative probabilities of our
final hypothesis depends on the believability of our evidence and on the defensibility
and strength of our relevance arguments, as discussed in Section 2.2. The whole point here
is that the relation between evidence and knowledge is not a simple one at all.
The word “evidence” is associated more often with lawyers and judicial trials
than with any other cross-section of society or form of activity. . . . In its
simplest sense, evidence may be defined as any factual datum which in some
manner assists in drawing conclusions, either favorable or unfavorable, to
some hypothesis whose proof or refutation is being attempted.
Murphy notes that this term is appropriate in any field in which conclusions are reached
from any relevant datum. Thus, physicians, scientists of any ilk, historians, and persons of
any other conceivable discipline, as well as ordinary persons, use evidence every day in
order to draw conclusions about matters of interest to them.
We believe there is a very good reason why many persons are so often tempted to
associate the term evidence only with the field of law. It happens that the Anglo-American
system of laws has provided us with by far the richest legacy of experience and scholarship
on evidence of any field known to us. This legacy has arisen as a result of the development
of the adversarial system for settling disputes and the gradual emergence of the jury
system, in which members of the jury deliberate on evidence provided by external
witnesses. This legacy has now been accumulating over at least the past six hundred years
(Anderson et al., 2005).
Evidence-based reasoning involves abductive, deductive, and inductive (probabilistic)
reasoning. The following sections briefly introduce them.
1.2 Abductive Reasoning

Figure 1.2. The arch linking observations to possible hypotheses or explanations: an upward, discovery-related arm (hypothesis generation, marked "?") and a downward arm (hypothesis testing, by deduction).
Aristotle (384 bc–322 bc) was the first to puzzle about the generation or discovery of new
ideas in science. From sensory observations, we generate possible explanations, in the
form of hypotheses, for these observations. It was never clear from Aristotle's work what
label should be placed on the upward, or discovery-related, arm of the arch in Figure 1.2.
By some accounts, Aristotle’s description of this act of generating hypotheses is called
"intuitive induction" (Cohen and Nagel, 1934; Kneale, 1949). The question mark on the
upward arm of the arch in Figure 1.2 simply indicates that there is still argument about
what this discovery-related arm should be called. By most accounts, the downward arm of
the arch concerns the deduction of new observable phenomena, assuming the truth of a
generated hypothesis (Schum, 2001b).
Over the millennia since Aristotle, many people have tried to give an account of the
process of discovering hypotheses and how this process differs from ones in which existing
hypotheses are justified. Galileo Galilei (1564–1642) thought that we “reason backward”
inductively to imagine causes (hypotheses) from observed events, and we reason deduct-
ively to test the hypotheses. A similar view was held by Isaac Newton (1642–1727), John
Locke (1632–1704), and William Whewell (1794–1866). Charles S. Peirce (1839–1914) was
the first to suggest that new ideas or hypotheses are generated through a different form of
reasoning, which he called abduction and associated with imaginative reasoning (Peirce,
1898; 1901). His views are very similar to those of Sherlock Holmes, the famous fictional
character of Conan Doyle (Schum, 1999).
As an illustration, let us assume that we observe E*: “Smoke in the East building” (E* being
evidence that event E occurred).
Based on our prior knowledge of contexts in which things like E: “Smoke in a building”
have occurred, we say: “Whenever something like H: ‘There is fire in a building’ has
occurred, then something like E: ‘Smoke in the building’ has also occurred.” Thus, there is
reason to suspect that H: “There is fire in the East building” may explain the occurrence of
the clue E*: “Smoke in the East building.” In other words, the clue E* points to H as a
possible explanation for its occurrence.
To summarize:

E* (evidence of E: smoke in the East building) is observed.
If H: "There is fire in the East building" were true, then E: "Smoke in the East building," and hence E*, would follow as a matter of course.
Therefore, there is reason to suspect that H is true.
Peirce was unsure about what to call this form of reasoning. At various points in his work,
he called it "abduction," "retroduction," and even just "hypothesis" (Peirce, 1898; 1901).
The essential interpretation Peirce placed on the concept of abduction is illustrated in
Figure 1.3. He often used as a basis for his discussions of abduction the observation of an
anomaly in science. Let us suppose that we already have a collection of prior evidence in
some investigation and an existing collection of hypotheses H1, H2, . . . , Hn. To varying
degrees, these n hypotheses explain the evidence we have so far. But now we make an
observation E* that is embarrassing in the following way: We take E* seriously, but we cannot
explain it by any of the hypotheses we have generated so far. In other words, E* is an
anomaly. Vexed by this anomaly, we try to find an explanation for it. In some cases, often
much later when we are occupied by other things, we experience a “flash of insight” in which
it occurs to us that a new hypothesis Hn+1 could explain this anomaly E*. It is these “flashes of
insight” that Peirce associated with abduction. Asked at this moment to say exactly how Hn+1
explains E*, we may be unable to do so. However, further thought may produce a chain of
reasoning that plausibly connects Hn+1 and E*. The reasoning might go as follows:
It is possible, of course, that the chain of reasoning might have started at the top with Hn+1
and ended at E*. This is why we have shown no direction on the links between E* and Hn+1
in Figure 1.3.
But our discovery-related activities are hardly over just because we have explained
this anomaly. Our new hypothesis Hn+1 would not be very appealing if it explained
only anomaly E*. Figure 1.4 shows the next steps in our use of this new hypothesis.
We first inquire about the extent to which it explains the prior evidence we collected
before we observed E*. An important test of the suitability of the new hypothesis Hn+1
involves asking how well this new hypothesis explains other observations we have
taken seriously. This new hypothesis would be especially valuable if it explains our
prior evidence better than any of our previously generated hypotheses. But there is
one other most important test of the adequacy of a new hypothesis Hn+1: How well does
this new hypothesis suggest new potentially observable evidence that our previous
hypotheses did not suggest? If Hn+1 would be true, then B, I, and K would also be true;
and if B would be true, then C would be true. Now if C would be true, then we would
need to observe D.
In the illustrations Peirce used, which are shown in Figures 1.3 and 1.4, we entered
the process of discovery at an intermediate point when we already had existing hypoth-
eses and evidence. In other contexts, we must of course consider abductive reasoning
from the beginning of an episode of fact investigation when we have no hypotheses and
no evidence bearing on them. Based on our initial observations, by this process of
abductive or insightful reasoning, we may generate initial guesses or hypotheses to
explain even the very first observations we make. Such hypotheses may of course be
vague, imprecise, or undifferentiated. Further observations and evidence we collect may
allow us to make an initial hypothesis more precise and may of course suggest entirely
new hypotheses.
It happens that at the very same time Peirce was writing about abductive reasoning,
insight, and discovery, across the Atlantic, Arthur Conan Doyle was exercising his
fictional character Sherlock Holmes in many mystery stories. At several points in the
Sherlock Holmes stories, Holmes describes to his colleague, Dr. Watson, his inferential
strategies during investigation. These strategies seem almost identical to the concept of
abductive reasoning described by Peirce. Holmes did not, of course, describe his investi-
gative reasoning as abductive. Instead, he said his reasoning was “backward," moving
from his observations to possible explanations for them. A very informative and enjoy-
able collection of papers on the connection between Peirce and Sherlock Holmes
appears in the work of Umberto Eco and Thomas Sebeok (1983). In spite of the similarity
of Peirce's and Holmes's (Conan Doyle's) views of discovery-related reasoning, there is
no evidence that Peirce and Conan Doyle ever shared ideas on the subject.
Figure 1.4. Putting an abduced hypothesis to work: How well does Hn+1 explain the prior evidence taken seriously? How well does Hn+1 suggest new kinds of observable evidence?
1.3 Probabilistic Reasoning
A major trouble we all face in thinking about probability and uncertainty concerns the fact
that the necessity for probability calculations, estimations, or judgments arises in different
situations. In addition, there are many different attributes of our judgments that we would
like to capture in assessments of uncertainty we are obliged to make. There are situations
in which you can estimate probabilities of interest by counting things. But there are many
other situations in which we have uncertainty but will have nothing to count. These
situations involve events that are singular, unique, or one of a kind. In the following, we
will briefly discuss several alternative views of probability, starting with two views of
probability that involve processes in which we can obtain probabilities or estimates of
them by enumerative or counting processes.
For example, in a game involving a pair of fair six-sided dice, where we roll and add the
two numbers showing up, there are thirty-six ways in which the numbers showing up will
have sums between two and twelve, inclusive. So, in this case, n(S) = 36. Suppose you wish
to determine the probability that you will roll a seven on a single throw of these dice. There
are exactly six ways in which this can happen. If E = “the sum of the numbers is seven,”
then n(E) = 6. The probability of E, P(E), is simply determined by dividing n(E) by n(S),
which in this example is P(E) = 6/36 = 1/6. So, aleatory probabilities are always determined
by dividing n(E) by n(S), whatever E and S are, as long as E is a subset of S.
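To make this counting explicit, here is a minimal Python sketch (ours, not from the book) that enumerates the sample space S for two fair dice and computes P(E) = n(E)/n(S) for the event E that the sum is seven:

```python
from fractions import Fraction
from itertools import product

# Sample space S: all ordered outcomes of rolling two fair six-sided dice.
S = list(product(range(1, 7), repeat=2))               # n(S) = 36

# Event E: the sum of the two numbers showing up is seven.
E = [outcome for outcome in S if sum(outcome) == 7]    # n(E) = 6

# Aleatory probability: P(E) = n(E) / n(S).
P_E = Fraction(len(E), len(S))
print(P_E)  # 1/6
```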
estimate of the true probability of E, P(E). The reason, of course, is that the number N of
observations we have made is always less than the total number of observations that could
be made. In some cases, there may be an infinite number of possible observations. If you
have had a course in probability theory, you will remember that there are several formal
statements, called the laws of large numbers, for showing how f(E) approaches P(E) when
N is made larger and larger.
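A small simulation sketch (our illustration, with arbitrary roll counts) shows the relative frequency f(E) = n(E)/N drifting toward P(E) = 1/6 as N grows, which is what the laws of large numbers describe:

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

def relative_frequency(num_rolls: int) -> float:
    """Estimate P(sum of two dice is seven) by the relative frequency f(E) = n(E) / N."""
    hits = sum(1 for _ in range(num_rolls)
               if random.randint(1, 6) + random.randint(1, 6) == 7)
    return hits / num_rolls

# f(E) approaches P(E) = 1/6 ≈ 0.1667 as the number of observations N grows.
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(n))
```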
Probability theory presents an interesting paradox. It has a very long history but a very
short past. There is abundant evidence that people as far back as Paleolithic times used
objects resembling dice either for gambling or, more likely, to foretell the future (David,
1962). But attempts to calculate probabilities date back only to the 1600s, and the first
attempt to develop a theory of mathematical probability dates back only to 1933 in the
work of A. N. Kolmogorov (1933). Kolmogorov was the first to put probability on an
axiomatic basis. The three basic axioms he proposed are the following ones:

Axiom 1. For any event E, P(E) ≥ 0.
Axiom 2. P(S) = 1, where S is the sure event that some outcome in the sample space occurs.
Axiom 3. If events E and F are mutually exclusive (they cannot both occur), then P(E or F) = P(E) + P(F).
All Axiom 1 says is that probabilities are never negative. Axioms 1 and 2, taken together,
mean that probabilities are numbers between 0 and 1. An event having 0 probability is
commonly called an “impossible event.” Axiom 3 is called the additivity axiom, and it
holds for any number of mutually exclusive events.
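As a quick illustration (ours, not the book's), a finite distribution over elementary outcomes can be checked directly against the first two axioms; additivity then holds automatically when event probabilities are computed by summing over elementary outcomes:

```python
def satisfies_axioms_1_and_2(p: dict, tol: float = 1e-9) -> bool:
    """Check a finite distribution (elementary outcome -> probability) against
    Axiom 1 (non-negativity) and Axiom 2 (the probabilities sum to 1, i.e., P(S) = 1)."""
    non_negative = all(v >= 0 for v in p.values())
    sums_to_one = abs(sum(p.values()) - 1.0) <= tol
    return non_negative and sums_to_one

def event_probability(p: dict, event) -> float:
    """P(E) as the sum over the elementary outcomes in E; defined this way,
    Axiom 3 (additivity for mutually exclusive events) holds automatically."""
    return sum(prob for outcome, prob in p.items() if outcome in event)

# Two fair dice: each ordered outcome has probability 1/36.
dice = {(i, j): 1 / 36 for i in range(1, 7) for j in range(1, 7)}
seven = {o for o in dice if sum(o) == 7}
print(satisfies_axioms_1_and_2(dice), round(event_probability(dice, seven), 4))  # True 0.1667
```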
Certain transformations of Kolmogorov’s probabilities are entirely permissible and are
often used. One common form involves odds. The odds of event E occurring to its not
occurring, which we label Odds(E, ¬E), is determined by Odds(E, ¬E) = P(E)/(1 – P(E)). For
any two mutually exclusive events E and F, the odds of E to F, Odds(E, F), are given by Odds
(E, F) = P(E)/P(F). Numerical odds scales range from zero to an unlimited upper value.
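These two transformations are easy to state as code; the following helper functions are our own illustration of the definitions just given:

```python
def odds_vs_complement(p_e: float) -> float:
    """Odds(E, ¬E) = P(E) / (1 - P(E))."""
    return p_e / (1.0 - p_e)

def odds(p_e: float, p_f: float) -> float:
    """Odds(E, F) = P(E) / P(F) for two mutually exclusive events E and F."""
    return p_e / p_f

print(odds_vs_complement(1 / 6))   # 0.2, i.e., odds of 1 to 5 for rolling a seven
print(odds(0.2, 0.8))              # 0.25, i.e., odds of 1 to 4
```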
What is very interesting, but not always recognized, is that Kolmogorov had only
enumerative probability in mind when he settled on the preceding three axioms. He
makes this clear in his 1933 book and in his later writings (Kolmogorov, 1969). It is easily
shown that both aleatory probabilities and relative frequencies obey these three axioms.
But Kolmogorov went an important step further in defining conditional probabilities that
are necessary to show how the probability of an event may change as we learn new
information. He defined the probability of event E, given or conditional upon some other
event F, as P(E given F) = P(E and F)/P(F), assuming that P(F) is not zero. P(E given F) is
also written as P(E|F). He chose this particular definition since conditional probabilities, so
defined, will also obey the three axioms just mentioned. In other words, we do not need
any new axioms for conditional probabilities.
Now comes a very important concept you may have heard about. It is called Bayes’
rule and results directly from applying the definition of the conditional probability. From
P(E* and H) = P(H and E*), you obtain P(E*|H) P(H) = P(H|E*)P(E*). This can then be
written as shown in Figure 1.5.
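The following sketch (with made-up numbers, not the book's) applies the definition of conditional probability and the form of Bayes' rule shown in Figure 1.5, and checks the identity P(E*|H)P(H) = P(H|E*)P(E*):

```python
# Made-up numbers for illustration only (they are not from the book).
P_H = 0.3                 # prior probability of hypothesis H
P_E_given_H = 0.8         # P(E*|H)
P_E_given_notH = 0.2      # P(E*|¬H)

# Prior probability of the evidence, P(E*) -- the normalizer in Figure 1.5.
P_E = P_E_given_H * P_H + P_E_given_notH * (1 - P_H)

# Bayes' rule: P(H|E*) = P(E*|H) P(H) / P(E*).
P_H_given_E = P_E_given_H * P_H / P_E
print(round(P_H_given_E, 3))  # 0.632

# Both sides equal P(E* and H), so the identity behind Bayes' rule holds.
assert abs(P_E_given_H * P_H - P_H_given_E * P_E) < 1e-12
```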
This rule is named after the English clergyman, the Reverend Thomas Bayes (1702–1761),
who first saw the essentials of a rule for revising probabilities of hypotheses, based on new
evidence (Dale, 2003). He had written a paper describing his derivation and use of this rule
but he never published it; this paper was found in his desk after he died in 1761 by Richard
13:51:08,
.002
1.3. Probabilistic Reasoning 11
Probability of H given
(Posterior)
Prior probability of e
(Normalizer)
Figure 1.5. The Bayes’ rule.
Price, the executor of Bayes’ will. Price realized the importance of Bayes’ paper and
recommended it for publication in the Transactions of the Royal Society, in which it
appeared in 1763. He rightly viewed Bayes’ rule as the first canon or rule for inductive or
probabilistic reasoning. Bayes’ rule follows directly from Kolmogorov’s three axioms and his
definition of a conditional probability, and is entirely uncontroversial as far as its derivation
is concerned. But this rule has always been a source of controversy on other grounds. The
reason is that it requires us to say how probable a hypothesis is before we have gathered
evidence that will possibly allow us to revise this probability. In short, we need prior
probabilities on hypotheses in order to revise them, when they become posterior
probabilities. Persons wedded to enumerative conceptions of probability say we can never
have prior probabilities of hypotheses, since in advance of data collection we have nothing
to count. Statisticians are still divided today about whether it makes sense to use Bayes’ rule
in statistical inferences. Some statisticians argue that initial prior probabilities could be
assessed only subjectively and that any subjective assessments have no place in any area
that calls itself scientific. Bayes’ rule says that if we are to talk about probability revisions in
our beliefs, based on evidence, we have to say where these beliefs were before we obtained
the evidence.
The Bayes’ rule is useful in practice because there are many cases where we have good
probability estimates for three of the four probabilities involved, and we can therefore
compute the fourth one (see, e.g., Question 1.9).
It is time for us to consider views of probability in situations where we will have nothing
to count, either a priori or anywhere else: the Subjective Bayesian view, Belief Functions,
Baconian probabilities, and Fuzzy probabilities. We provide a look at only the essentials of
these four views, focusing on what each one has to tell us about what the force or weight of
evidence on some hypothesis means. More extensive comparisons of these four views
appear in (Schum, 1994 [2001a], pp. 200–269).
previously noted. Since Bayes’ rule rests on these axioms and definition, we must adhere to
them in order to say that our assessment process is coherent or consistent.
As we will show, the likelihoods and their ratios are the ingredients of Bayes' rule that concern the inferential force of evidence. Suppose we have two hypotheses, H and ¬H, and a single item of evidence, E*, saying that event E occurred. What we are interested in determining are the posterior probabilities P(H|E*) and P(¬H|E*). Using Bayes' rule from Figure 1.5, we can express these posterior probabilities as:

P(H|E*) = P(E*|H) P(H) / P(E*)
P(¬H|E*) = P(E*|¬H) P(¬H) / P(E*)

The next step is to divide P(H|E*) by P(¬H|E*), which will produce three ratios; in the process, the term P(E*) will drop out. Here are the three ratios that result:

P(H|E*) / P(¬H|E*) = [P(H) / P(¬H)] × [P(E*|H) / P(E*|¬H)]

The left-hand ratio, P(H|E*)/P(¬H|E*), is called the posterior odds of H to ¬H, given evidence E*. In symbols, we can express this ratio as Odds(H : ¬H|E*). The first ratio on the right, P(H)/P(¬H), is called the prior odds of H to ¬H. In symbols, we can express this ratio as Odds(H : ¬H). The remaining ratio on the right, P(E*|H)/P(E*|¬H), is called the likelihood ratio for evidence E*; we give this ratio the symbol L_E*. In terms of these three ratios, Bayes' rule applied to this situation can be written simply as follows:

Odds(H : ¬H|E*) = Odds(H : ¬H) × L_E*

This simple version of Bayes' rule is called the odds-likelihood ratio form. It is also called, somewhat unkindly, "idiot's Bayes." If we divide both sides of this equation by the prior odds, Odds(H : ¬H), we observe that the likelihood ratio L_E* is simply the ratio of posterior odds to prior odds of H to ¬H. This likelihood ratio shows us how much, and in what direction (toward H or ¬H), our evidence E* has caused us to change our beliefs from what they were before we obtained evidence E*. In short, likelihood ratios grade the force of evidence in Bayesian analyses.
Here is an example of how likelihoods and their ratios provide a method for grading the
force of an item of evidence on some hypothesis. This is an example of a situation
involving a singular evidence item where we have nothing to count. Suppose we are
interested in determining whether or not the Green state is supplying parts necessary for
the construction of shaped explosive devices to a certain insurgent militia group in the
neighboring Orange state. Thus we are considering two hypotheses:
H: “The Greens are supplying parts necessary for the construction of shaped explosive
devices.”
¬H: “The Greens are not supplying parts necessary for the construction of shaped
explosive devices.”
Suppose we believe, before we have any evidence, that the prior probability of H is P(H) = 0.20. Because we must obey the rules for enumerative probabilities, we must also say that P(¬H) = 0.80. This follows from the third axiom we discussed in Section 1.3.1. So, our prior odds on H relative to ¬H have a value Odds(H : ¬H) = P(H)/P(¬H) = 0.20/0.80 = 1/4.

Suppose now that we receive the item of evidence E*: A member of the Green's military was captured less than one kilometer away from a location in Orange at which parts necessary for the construction of these shaped explosives were found.

We ask ourselves how likely is this evidence E* if H were true, and how likely is this evidence E* if H were not true. Suppose we say that P(E*|H) = 0.80 and P(E*|¬H) = 0.10. We are saying that this evidence is eight times more likely if H were true than if H were not true. So, our likelihood ratio for evidence E* is L_E* = P(E*|H)/P(E*|¬H) = 0.8/0.1 = 8.

We now have all the ingredients necessary in Bayes' rule to determine the posterior odds and posterior probability of hypothesis H:

Odds(H : ¬H|E*) = Odds(H : ¬H) × L_E* = (1/4) × 8 = 2.

This means that we now believe the posterior odds favoring H over ¬H are two to one. But we started by believing that the prior odds of H to ¬H were one to four, so the evidence E* changed our belief by a factor of 8.

We could just as easily express this inference in terms of probabilities. Our prior probability of H was P(H) = 0.20. But our posterior probability is P(H|E*) = 2/(1 + 2) = 2/3 ≈ 0.67. So, in terms of probabilities, evidence E* caused us to increase the probability of H by 0.47.
So, using this subjective Bayesian approach, we would be entitled to express the extent
of our uncertainty in an analysis using numerical probability assessments provided only
that they conform to Kolmogorov’s axioms.
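The odds-likelihood form of Bayes' rule used in this example is simple to reproduce; the sketch below (ours, not Disciple-EBR's) recomputes the Green-Orange numbers from the text: prior P(H) = 0.20, likelihood ratio L_E* = 8, posterior odds 2, and posterior probability of about 0.67:

```python
def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Odds-likelihood form of Bayes' rule: Odds(H : ¬H|E*) = Odds(H : ¬H) * L_E*."""
    return prior_odds * likelihood_ratio

def odds_to_probability(odds: float) -> float:
    """Convert odds of H to ¬H back into a probability: P = odds / (1 + odds)."""
    return odds / (1.0 + odds)

# Numbers from the Green-Orange example in the text.
prior_P_H = 0.20
prior_odds_H = prior_P_H / (1.0 - prior_P_H)       # 1/4
L_E = 0.80 / 0.10                                  # P(E*|H) / P(E*|¬H) = 8

post_odds = posterior_odds(prior_odds_H, L_E)      # 2.0
print(post_odds, round(odds_to_probability(post_odds), 2))  # 2.0 0.67
```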
In so many instances, we may not be sure what evidence is telling us, and so we wish to be
able to withhold a portion of our beliefs and not commit it to any particular hypothesis or
possible conclusion. A very important element in what Shafer terms Belief Functions is that
the weight of evidence means the degree of support evidence provides to hypotheses we are
considering. Shafer allows that we can grade the degree of support s on a 0 to 1 scale
similar to the scale for Kolmogorov probabilities; but we can do things with support
assignment s that the Kolmogorov additivity Axiom 3 does not allow.
To illustrate, suppose we revisit the issue discussed in the previous section about
whether or not the Green state is supplying parts necessary for the construction of shaped
explosive devices to a certain insurgent militia group in the neighboring Orange state. At
some stage, we are required to state our beliefs about the extent to which the evidence
supports H or ¬H. Here is our assessment:

{H} {¬H} {H, ¬H}
s 0.5 0.3 0.2
What does this support assignment mean? We are saying that we believe the evidence
supports H exactly to degree s = 0.5, and that this evidence also supports ¬H exactly to
degree s = 0.3. But there is something about this evidence that makes us unsure about
whether it supports H or ¬H. So, we have left the balance of our s assignment, s = 0.2,
uncommitted among H or ¬H. In other words, we have withheld a portion of our beliefs
because we are not sure what some element of our evidence is telling us.
If we were required to obey Kolmogorov Axiom 3, we would not be allowed to be
indecisive in any way in stating our beliefs. Here is what our support assignment would
have to look like:
{H} {¬H}
s a 1 – a
In this case, we would be required to say that the evidence supports H to degree s = a,
and supports ¬H to degree s = (1 – a) in agreement with Axiom 3 since H and ¬H are
mutually exclusive and exhaustive. In short, Kolmogorov Axiom 3 does not permit us any
indecision in stating our beliefs; we must commit all of it to H and to ¬H. This, we believe,
would not be a faithful or accurate account of our beliefs.
But Shafer’s Belief Function approach allows us to cope with another difficulty
associated with Kolmogorov’s axioms. For centuries, it has been recognized that a
distinction is necessary between what has been termed mixed evidence and pure evi-
dence. Mixed evidence has some degree of probability under every hypothesis we are
considering. But pure evidence may support one hypothesis but say nothing at all about
other hypotheses. In other words, we may encounter evidence that we believe offers zero
support for some hypothesis. Here is another example involving our Green-Orange
situation. Suppose we encounter an item of evidence we believe supports H to a degree,
but we believe offers no support at all for ¬H. Here is our support assignment s for this
evidence:
{H} {¬H} {H, ¬H}
s 0.5 0 0.5
In this situation, we are saying that the evidence supports H to degree s = 0.5, but offers
no support at all to ¬H. The rest of our support we leave uncommitted between H and ¬H.
But now we have to examine what s = 0 for ¬H means; does it mean that ¬H could not be
supported by further evidence? The answer is no, and the reason why it is no allows us to
compare what ordinary probabilities mean with what support s means. This comparison is shown in Figure 1.6.
The (a) scale in Figure 1.6, for conventional or Kolmogorov probabilities, has a lower
boundary with a meaning quite different from the meaning of this lower boundary on
Shafer’s support scale shown in (b). The value 0 in conventional probability refers to an
event judged to be impossible and one you completely disbelieve. But all 0 means on
Shafer’s s scale is lack of belief, not disbelief. This is very important, since we can go from
lack of belief to some belief as we gather more evidence. But we cannot go from disbelief
to some belief. On a conventional probability scale, a hypothesis once assigned the
probability value 0 can never be resuscitated by further evidence, regardless of how strong
it may be. But some hypothesis, assigned the value s = 0, can be revised upward since we
can go from lack of belief to some belief in this hypothesis when and if we have some
further evidence to support it. Thus, s allows us to account for pure evidence in ways that
ordinary probabilities cannot do. We will refer to this scale again in Section 1.3.4 when
discussing Baconian probability.
Consider the evidence in the dirty bomb example which will be discussed in Section 2.2.
We begin by listing the hypotheses we are considering at this moment:
H1: A dirty bomb will be set off somewhere in the Washington, D.C., area.
¬H1: A dirty bomb will not be set off in the Washington, D.C., area (it might be set off somewhere else or not at all).
In the Belief Functions approach, we have just specified what is called a frame of discernment, in shorthand a frame F. What this frame F = {H1, ¬H1} shows is how we are viewing our hypotheses right now. We might, on further evidence, wish to revise our frame in any one of a variety of ways. For example, we might have evidence suggesting other specific places where a dirty bomb might be set off, such as in Annapolis, Maryland, or in Tysons Corner, Virginia. So our frame F′ in this case might be:

H1: A dirty bomb will be set off somewhere in the Washington, D.C., area.
H2: A dirty bomb will be set off in Annapolis, Maryland.
H3: A dirty bomb will be set off in Tysons Corner, Virginia.
All that is required in the Belief Functions approach is that the hypotheses in a frame be
mutually exclusive; they might or might not be exhaustive. The hypotheses are required to be exhaustive in the Bayesian approach. So this revised frame F′ = {H1, H2, H3}, as stated, is not exhaustive.

Figure 1.6. Different probability scales: (a) conventional probability, ranging from 0 (disbelief or impossible) to 1 (certain or complete belief); (b) Shafer's support scale s, ranging from 0 (lack of belief) to 1; (c) Baconian probability, ranging from 0 (lack of proof) toward proof.

But we are assuming, for the moment at least, that these three
hypotheses are mutually exclusive: The dirty bomb will be set off at exactly one of these
three locations. But, on further evidence, we might come to believe that dirty bombs will
be set off in both Washington, D.C., and in Tysons Corner, Virginia. We know that the
terrorists we are facing have a preference for simultaneous and coordinated attacks. So,
our revised frame F″ might be:

H1: A dirty bomb will be set off in Washington, D.C., and in Tysons Corner, Virginia.
H2: A dirty bomb will be set off in Annapolis, Maryland.
The point of all this so far is that the Belief Functions approach allows for the fact that our
hypotheses may mutate or change as a result of new evidence we obtain. This is a major
virtue of this approach to evidential reasoning.
The next thing we have to consider is the power set of the hypotheses in a frame. This
power set is simply the list of all possible combinations of the hypotheses in this frame.
When we have n hypotheses in F, there are 2^n possible combinations of our hypotheses, including all of them and none of them. For example, when F = {H1, ¬H1}, the power set consists of {H1}, {¬H1}, {H1, ¬H1}, and ∅, where ∅ = the empty set (i.e., none of the hypotheses). For F′ = {H1, H2, H3}, as just defined, there are 2^3 = 8 possible combinations: {H1}, {H2}, {H3}, {H1, H2}, {H1, H3}, {H2, H3}, {H1, H2, H3}, and ∅. Now, here comes an important point about support function s: The assigned values of s for any item or body
of evidence must sum to 1.0 across the power set of hypotheses in a frame. The only
restriction is that we must set s{∅} = 0. We cannot give any support to the set of none of
the hypotheses we are considering.
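A small sketch (our illustration, not Shafer's formalism in full) shows a frame, its power set, and a check of the conditions on a support assignment s: s(∅) = 0, no negative values, and a total of 1.0 over the power set. The mixed-evidence assignment from the Green-Orange example above is used as test data:

```python
from itertools import combinations

def power_set(frame):
    """All 2^n subsets of a frame of discernment, including the empty set."""
    items = list(frame)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_valid_support(s, frame, tol=1e-9):
    """Check a support assignment s (subset -> degree of support): s(∅) = 0,
    no negative values, only subsets of the frame, and a sum of 1.0."""
    subsets = set(power_set(frame))
    return (s.get(frozenset(), 0.0) == 0.0
            and all(v >= 0 for v in s.values())
            and all(subset in subsets for subset in s)
            and abs(sum(s.values()) - 1.0) <= tol)

# The mixed-evidence assignment from the text: s({H}) = 0.5, s({¬H}) = 0.3, s({H, ¬H}) = 0.2.
frame = {"H", "not_H"}
s = {frozenset({"H"}): 0.5,
     frozenset({"not_H"}): 0.3,
     frozenset({"H", "not_H"}): 0.2}
print(len(power_set(frame)), is_valid_support(s, frame))  # 4 True
```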
More details about the Belief Functions approach are provided in Schum (1994 [2001a],
pp. 222–243).
Baconian probabilities have only ordinal properties and cannot be combined algebra-
ically in any way. The Baconian probability scale is shown as (c) in Figure 1.6, to be
compared with the conventional probability scale shown as (a) in Figure 1.6. On the
conventional probability scale, 0 means disproof; but on the Baconian scale, 0 simply
means lack of proof. A hypothesis now having zero Baconian probability can be revised
upward in probability as soon as we have some evidence for it. As noted, we cannot revise
upward in probability any hypothesis disproved, or having zero conventional probability.
You gathered some evidence, fair enough, quite a bit of it, in fact. But, how
many relevant questions you can think of were not answered by the evidence
you had? Depending upon the number of these unanswered questions, you
were out on an inferential limb that was longer and weaker than you
imagined it to be (see Figure 1.7). If you believed that these unanswered questions would supply evidence that also favored H3, you were misleading yourself since you did not obtain any answers to them. The posterior probability you determined by itself is not a good indicator of the weight of evidence. What makes better sense is to say the weight of evidence depends on the amount of favorable evidence you have and how completely it covers matters you said were relevant. In your analysis, you completely overlooked the inferential importance of questions your existing evidence did not answer.

Figure 1.7. Out on an inferential limb: your conclusion that H3 is true, the questions unanswered by your existing evidence, and your existing evidence.
Apart from the Baconian system, no other probability view focuses on evidential com-
pleteness and the importance of taking into account questions recognized as being
relevant that remain unanswered by the evidence we do have. This is why Jonathan
Cohen’s Baconian system is so important (Cohen, 1977; 1989). What we do not take into
account in our analyses can hurt us very badly.
In many instances, such as reasoning in intelligence analysis, we frequently have to
make inferences about matters for which we have scant evidence, or no evidence at all. In
other instances in which there may be available evidence, we may have no time to search
for it or consider it carefully. In such cases, we are forced to make assumptions or
generalizations that license inferential steps. But this amounts to giving an assumption
or a generalization the benefit of the doubt (without supporting it in any way), to believing
as if some conclusion were true (absent any evidential support for it), or to taking
something for granted without testing it in any way. All of these situations involve the
suppression of uncertainties.
It happens that only the Baconian probability system provides any guidance about how
to proceed when we must give benefit of doubt, believe as if, or take things for granted.
The major reason is that it acknowledges what almost every logician says about the
necessity for asserting generalizations and supplying tests of them in evidential reasoning.
Search the Bayesian or Belief Functions literature, and you will find almost no discussion
of generalizations (assumptions) and ancillary tests of them. Suppose we are interested in
inferring F from E, that is, P(F|E). Bayes’ rule grinds to a halt when we have no basis for
assessing the likelihoods P(E|F) and P(E|¬F). Bayesians counter by saying that we will
always have some evidence on which to base these judgments. But they never say what
this evidence is in particular cases and how credible it might or might not be. The Belief
Functions approach comes closer by saying that we can assess the evidential support for a
body of evidence that may include both directly relevant and at least some ancillary
evidence (i.e., evidence about other evidence). Following is an account of the Baconian
license for giving an assumption or generalization benefit of doubt, believing as if it were
true, or taking it for granted, provided that we are willing to mention all of the uncertain-
ties we are suppressing when we do so. Stated another way, we must try to account for all
of the questions we can think of that remain unanswered by the absence, or very scant
amount, of evidence.
Here are the essentials of Cohen’s Baconian approach to reasoning based on little or no
ancillary evidence to either support or undermine a generalization (Cohen 1977; 1989).
The first step, of course, is to make sure the generalization is not a non sequitur, that is,
that it makes logical sense. In the simplest possible case, suppose we are interested in
inferring proposition or event F from proposition or event E. The generalization G in doing
so might read, “If E has occurred, then probably F has occurred.” We recognize this if-then
statement as an inductive generalization since it is hedged. Second, we consider various
tests of this generalization using relevant ancillary evidence. Third, we consider how many
evidential tests of this generalization there might be. Suppose we identify N such tests. The
best case would be when we perform all N of these tests and they all produce results
favorable to generalization G. But we must not overlook generalization G itself; we do so
by assigning it the value 1; so we have N + 1 things to consider. Now we are in a position to
show what happens in any possible case.
First, suppose we perform none of these N evidential tests. We could still proceed by
giving generalization G the benefit of the doubt and detach a belief that F occurred (or will
occur) just by invoking this generalization G regarding the linkage between events
E and F. So, when no evidential tests are performed, we are saying: “Let’s believe as if
F occurred based on E and generalization G.” This would amount to saying that the
Baconian probability of event F is B(F) = 1/(N + 1). This expression is never a ratio; all it
says is that we considered just one thing in our inference about F from E, namely just the
generalization G. We could also say, “Let’s take event F for granted and believe that
F occurred (or will occur) because E occurred, as our generalization G asserts.” However,
note that in doing so, we have left all N ancillary evidential questions unanswered.
This we represent by saying that our inference of F from E has involved only one of the
N + 1 considerations and so we have (N + 1 – 1) = N, the number of questions we have left
unanswered. As far as evidential completeness is concerned, this is when the evidence we
have is totally incomplete. But the Baconian system allows us to proceed anyway based on
giving a generalization the benefit of doubt. But our confidence in this result should
be very low.
Now suppose we have performed some number k of the N possible ancillary evidential
tests of generalization G, as asserted previously, and they were all passed. The Baconian
probability of F in this situation is given by B(F) = (k + 1)/(N +1). The difference between
the numerator and denominator in such an expression will always equal the number
of unanswered questions as far as the testing of G is concerned. In this case, we have
(N + 1) – (k + 1) = N – k questions that were unanswered in a test of generalization G. How
high our confidence is that F is true depends on how high k + 1 is as compared to N + 1.
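The bookkeeping behind B(F) = (k + 1)/(N + 1) and the count of unanswered questions can be written down directly; the following sketch is our illustration, and it deliberately returns the pair (k + 1, N + 1) rather than a numeric ratio, since Baconian probabilities are ordinal:

```python
def baconian_probability(k: int, n: int) -> tuple:
    """Baconian B(F), recorded as the pair (k + 1, N + 1) rather than as a numeric
    ratio, since in Cohen's system these values are ordinal: k of the N recognized
    evidential tests of the licensing generalization G have been performed and passed."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= N")
    return (k + 1, n + 1)

def unanswered_questions(k: int, n: int) -> int:
    """(N + 1) - (k + 1) = N - k relevant questions left unanswered by the evidence."""
    return n - k

# No tests performed: benefit of the doubt only, with all N questions unanswered.
print(baconian_probability(0, 10), unanswered_questions(0, 10))   # (1, 11) 10
# Seven of ten recognized tests performed and passed.
print(baconian_probability(7, 10), unanswered_questions(7, 10))   # (8, 11) 3
```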
But now suppose that not all answers to these k questions are favorable to generaliza-
tion G. Under what conditions are we entitled to detach a belief that event F occurred,
based on evidence E, generalization G, and the k tests of G? The answer requires a
subjective judgment by the analyst about whether the tests, on balance, favor or disfavor
G. When the number of the k tests disfavoring G exceeds the number of tests favoring G,
we might suppose that we would always detach a belief that event F did not occur, since
G has failed more tests than it survived. But this will not always be such an easy judgment
if the number of tests G passed were judged to be more important than the tests it failed to
pass. In any case, there are N – k tests that remain unanswered. Suppose that k is quite
large, but the number of tests favorable to G is only slightly larger than the number of tests
unfavorable to G. In such cases, the analyst might still give event F the benefit of the doubt,
or believe, at least tentatively, as if F occurred pending the possible acquisition of further
favorable tests of G. And in this case, the confidence of the analyst in this conclusion
should also be very low.
Whatever the basis for an assumption or a benefit-of-doubt judgment, one of
the most important things about the Baconian approach is that the analyst must be
prepared to give an account of the questions that remain unanswered in evidential tests
of possible conclusions. This will be especially important when analysts make assump-
tions, or more appropriately, give generalizations the benefit of doubt, draw as if conclu-
sions, or take certain events for granted. These are situations in which analysts are most
vulnerable and in which Baconian ideas are most helpful.
following fragment from the letter sent by Albert Einstein to the United States President
Franklin D. Roosevelt, on the possibility of constructing nuclear bombs (Einstein, 1939):
. . . In the course of the last four months it has been made probable – through
the work of Joliot in France as well as Fermi and Szilárd in America – that it
may become possible to set up a nuclear chain reaction in a large mass of
uranium, by which vast amounts of power and large quantities of new
radium-like elements would be generated. Now it appears almost certain
that this could be achieved in the immediate future.
This new phenomenon would also lead to the construction of bombs, and it
is conceivable – though much less certain – that extremely powerful bombs of
a new type may thus be constructed. . . .
Verbal expressions of uncertainty are common in many areas. In the field of law, for
example, forensic standards of proof are always employed using words instead of
numbers. We all know about standards such as “beyond reasonable doubt” (in criminal
cases); “preponderance of evidence” (in civil cases); “clear and convincing evidence” (in
many Senate and congressional hearings); and “probable cause” (employed by magis-
trates to determine whether a person should be held in custody pending further hearings).
All the verbal examples just cited have a current name: They can be called Fuzzy
probabilities. Words are less precise than numbers. There is now extensive study of fuzzy
inference involving what has been termed approximate reasoning, which involves verbal
statements about things that are imprecisely stated. Here is an example of approximate
reasoning: “Since John believes he is overworked and underpaid, then he is probably not
very satisfied with his job." We are indebted to Professor Lotfi Zadeh (University of
California, Berkeley), and his many colleagues, for developing logics for dealing with fuzzy
statements, including Fuzzy probabilities (Zadeh, 1983; Negoita and Ralescu, 1975). In his
methods for relating verbal assessments of uncertainty with numerical equivalents, Zadeh
employed what he termed a possibility function, μ, to indicate ranges of numerical
probabilities a person might associate with a verbal expression of uncertainty. Zadeh
reasoned that a person might not be able to identify a single precise number he or she
would always associate with a verbal statement or Fuzzy probability. Here is an example of
a possibility function for the Fuzzy probability “very probable.”
Asked to grade what numerical probabilities might be associated with an analyst’s
Fuzzy probability of “very probable,” the analyst might respond as follows:
For me, “very probable” means a numerical probability of at least 0.75 and at
most 0.95. If it were any value above 0.95, I might use a stronger term, such
as “very, very probable.” I would further say that I would not use the term
“very probable” if I thought the probability was less than 0.75. In such cases,
I would weaken my verbal assessment. Finally, I think it is most possible (μ =
1.0) that my use of the verbal assessment “very probable” means something
that has about 0.85 of occurring.

If the analyst decides that "very probable" declines linearly on either side of μ = 1.0, we would have the possibility function shown in Figure 1.8.
Figure 1.8. A possibility function (μ) for the Fuzzy probability "very probable."
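The analyst's description of "very probable" translates into a triangular possibility function; the sketch below is our rendering of it, assuming μ falls to 0 exactly at the stated bounds of 0.75 and 0.95 and peaks (μ = 1.0) at 0.85:

```python
def possibility_very_probable(p: float) -> float:
    """Possibility function μ for the Fuzzy probability "very probable" described
    in the text: zero outside [0.75, 0.95], a peak of μ = 1.0 at p = 0.85, and a
    linear decline on either side of the peak (the zero endpoints are our assumption)."""
    if p < 0.75 or p > 0.95:
        return 0.0
    if p <= 0.85:
        return (p - 0.75) / 0.10
    return (0.95 - p) / 0.10

for p in (0.70, 0.75, 0.80, 0.85, 0.90, 0.95):
    print(p, round(possibility_very_probable(p), 2))
```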
Table 1.1. A Summary of Nonenumerative Uncertainty Methods and What They Best Capture
Major Strength | Subjective Bayes | Belief Functions | Baconian | Fuzzy
analysis, and how completely this evidence covered matters judged relevant to conclusions
that could be reached. A major question this form of analysis allows us to address is the
extent to which questions that have not been answered by existing evidence could have
altered the conclusion being reached. It would be quite inappropriate to assume that
answers to the remaining unanswered questions would, if they were obtained, all favor the
conclusion that was being considered. This, of course, requires us to consider carefully
matters relevant to any conclusion that are not addressed by available evidence.
The second entry in Table 1.1 notes that all four of the probabilistic methods have
very good ways for dealing with the inconclusive nature of most evidence, but they do
so in different ways. The Subjective Bayesian does so by assessing nonzero likelihoods
for the evidence under every hypothesis being considered. Their relative sizes indicate
the force that the evidence is judged to have on each hypothesis. But the Belief
Functions advocate assigns numbers indicating the support evidence provides for
hypotheses or subsets of them. We should be quick to notice that Bayesian likelihoods
do not grade evidential support, since in Belief Functions one can say that an item of
evidence provides no support at all to some hypothesis. But a Bayesian likelihood of
zero under a particular hypothesis would mean that this hypothesis is impossible and
should be eliminated. Offering no support in Belief Functions does not entail that this
hypothesis is impossible, since some support for this hypothesis may be provided by
further evidence. The Baconian acknowledges the inconclusive nature of evidence by
assessing how completely, as well as how strongly, the evidence favors one hypothesis
over others. In Fuzzy probabilities, it would be quite appropriate to use words in
judging how an item or body of evidence bears on several hypotheses. For example,
one might say, “This evidence is indeed consistent with H1 and H2, but I believe it
strongly favors H1 over H2.”
The third entry in the table first acknowledges the Belief Functions and Fuzzy
concerns about ambiguities and imprecision in evidence. In the Belief Functions
approach, one is entitled to withhold belief for some hypotheses in the face of ambiguous
evidence. In such cases, one may not be able to decide upon the extent to which the
evidence may support any hypothesis being considered, or even whether the evidence
supports any of them. Judgmental indecision is not allowed in the Bayesian system since
it assumes one can say precisely how strongly evidence judged relevant favors every
hypothesis being considered. Ambiguities in evidence may be commonly encountered.
The Fuzzy advocate will argue that ambiguities or imprecision in evidence hardly
justifies precise numerical judgments. In the face of fuzzy evidence, we can make only
fuzzy judgments of uncertainty.
The fourth entry in Table 1.1 shows that all four probability systems have very good
mechanisms for coping with dissonant evidence in which there are patterns of contradict-
ory and divergent evidence. Dissonant evidence is directionally inconsistent; some of it
will favor certain hypotheses and some of it will favor others. In resolving such inconsist-
encies, both the Bayesian and Belief Functions approaches will side with the evidence
having the strongest believability. The Bayesian approach to resolving contradictions is
especially interesting since it shows how “counting heads” is not the appropriate method
for resolving contradictions. In times past, “majority rule” was the governing principle.
Bayes’ rule shows that what matters is the aggregate believability on either side of a
contradiction. The Baconian approach also rests on the strength and aggregate believabil-
ity in matters of dissonance, but it also rests on how much evidence is available on either
side and upon the questions that remain unanswered. In Fuzzy terms, evidential disson-
ance, and how it might be resolved, can be indicated in verbal assessments of uncer-
tainty. In such instances, one might say, “We have dissonant evidence favoring both H1
and H2, but I believe the evidence favoring H1 predominates because of its very strong
believability.”
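A small numerical sketch (ours; the likelihood ratios are invented, and conditional independence of the sources is assumed) illustrates why "counting heads" is not how Bayes' rule resolves a contradiction; the aggregate believability on each side is what matters:

    # Three weak sources report "E occurred" (each with likelihood ratio 1.5);
    # one highly believable source reports "E did not occur" (ratio 8 against E).
    # Majority rule would side with the three; Bayes' rule does not.
    for_E = [1.5, 1.5, 1.5]          # ratios favoring E
    against_E = [8.0]                # ratios favoring not-E

    aggregate = 1.0
    for r in for_E:
        aggregate *= r               # 3.375 in favor of E
    for r in against_E:
        aggregate /= r               # divided by 8 -> about 0.42

    print(aggregate)                 # < 1, so the evidence as a whole favors not-E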
Row five in Table 1.1 concerns the vital matter of assessing the believability of
evidence. From considerable experience, we find that the Bayesian and Baconian
systems are especially important when they are combined. In many cases, these two
radically different schemes for assessing uncertainty are not at all antagonistic but are
entirely complementary. Let us consider a body of evidence about a human intelli-
gence (HUMINT) asset or informant. Ideas from the Baconian system allow us to ask,
“How much evidence do we have about this asset, and how many questions about
this asset remain unanswered?” Ideas from the Bayesian system allow us to ask, “How
strong is the evidence we do have about this asset?” (Schum, 1991; Schum and
Morris, 2007)
A ➔ necessarily B
Socrates is a man ➔ necessarily Socrates is mortal
A ➔ probably B
Julia was born in Switzerland ➔ probably Julia speaks German
A ➔ possibly B
There is smoke in the East building ➔ possibly there is fire
in the East building
Inductive Inference:
U(a1) and V(a1) When U(a1) was true, it was observed that V(a1) was also true
U(a2) and V(a2) When U(a2) was true, it was observed that V(a2) was also true
... ...
U(an) and V(an) When U(an) was true, it was observed that V(an) was also true
∀x, U(x) ➔ Probably V(x) Therefore, whenever U(x) is true, V(x) is also probably true
Abductive Inference:
U(a1) ➔ V(a1) If U(a1) were true then V(a1) would follow as a matter of course
V(a1) V(a1) is true
possibly U(a1) Therefore, there is reason to suspect that U(a1) is true
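For readers who like to see these schemas operationally, the following toy Python sketch (ours; the predicates, instances, and rules are invented for illustration) mirrors the deductive, inductive, and abductive patterns above:

    # Deduction: from a general rule and a fact, derive a specific conclusion.
    men = {"Socrates"}
    def deduce_mortal(x):                 # forall x, man(x) -> mortal(x)
        return x in men                   # Socrates is a man -> Socrates is mortal

    # Induction: from observed co-occurrences, generalize a probable rule.
    observations = [("a1", True, True), ("a2", True, True), ("a3", True, True)]
    learned_rule = None
    if all(v for (_, u, v) in observations if u):
        learned_rule = "forall x, U(x) -> probably V(x)"

    # Abduction: from a rule U -> V and an observation V, hypothesize U.
    rules = {"fire": "smoke"}             # if fire then smoke
    observed = "smoke"
    hypotheses = [cause for cause, effect in rules.items() if effect == observed]

    print(deduce_mortal("Socrates"), learned_rule, hypotheses)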
testing of hypotheses also take place in response to one another, as indicated by the
feedback loops from the bottom of Figure 1.9.
hypothesis guides the collection of additional evidence, which is used to assess the
probability of each hypothesis (Meckl et al., 2015).
The following, for instance, are different hypotheses one may be interested in assessing
based on evidence:
Evidence-based reasoning, however, is often highly complex, and the conclusions are
necessarily probabilistic in nature because our evidence is always incomplete, usually
inconclusive, frequently ambiguous, commonly dissonant, and has imperfect believability
(Schum, 1994 [2001a]; Tecuci et al., 2010b). Arguments requiring both imaginative and
critical reasoning, and involving all known types of inference (deduction, induction, and
abduction), are necessary in order to estimate the probability of the considered hypoth-
eses. Therefore, evidence-based reasoning can be best approached through the mixed-
initiative integration of human imagination and computer knowledge-based reasoning
(Tecuci et al., 2007a, 2007b), that is, by using knowledge-based intelligent agents for
evidence-based reasoning. Therefore, in the next section, we briefly review the field of
artificial intelligence.
Artificial intelligence (AI) is the science and engineering domain concerned with the theory
and practice of developing systems that exhibit the characteristics we associate with intelli-
gence in human behavior, such as perception, natural language processing, problem solving
and planning, learning and adaptation, and acting on the environment. Its main scientific
goal is understanding the principles that enable intelligent behavior in humans, animals,
and artificial agents. This scientific goal directly supports several engineering goals, such
as developing intelligent agents, formalizing knowledge and mechanizing reasoning in all
areas of human endeavor, making working with computers as easy as working with
people, and developing human-machine systems that exploit the complementariness of
human and automated reasoning.
Artificial intelligence is a very broad interdisciplinary field that has roots in and
intersects with many domains, not only all the computing disciplines, but also mathemat-
ics, linguistics, psychology, neuroscience, mechanical engineering, statistics, economics,
control theory and cybernetics, philosophy, and many others. The field has adopted many
concepts and methods from these domains, but it has also contributed back.
While some of the developed systems, such as an expert system or a planning system,
can be characterized as pure applications of AI, most of the AI systems are developed as
components of complex applications to which they add intelligence in various ways, for
instance, by enabling them to reason with knowledge, to process natural language, or to
learn and adapt.
Artificial intelligence researchers investigate powerful techniques in their quest for
realizing intelligent behavior. But these techniques are pervasive and are no longer
considered AI when they reach mainstream use. Examples include time-sharing, symbolic
programming languages (e.g., Lisp, Prolog, and Scheme), symbolic mathematics systems
(e.g., Mathematica), graphical user interfaces, computer games, object-oriented program-
ming, the personal computer, email, hypertext, and even software agents. While this
tends to diminish the merits of AI, the field is continuously producing new results and, due
to its current level of maturity and the increased availability of cheap computational
power, it is a key technology in many of today's novel applications.
The knowledge base is a type of long-term memory that contains data structures
representing the objects from the application domain, general laws governing them,
and actions that can be performed with them.
The perceptual processing module implements methods to process natural language,
speech, and visual inputs.
The problem-solving engine implements general problem-solving methods that use the
knowledge from the knowledge base to interpret the input and provide an appropriate
output.
The learning engine implements learning methods for acquiring, extending, and refin-
ing the knowledge in the knowledge base.
The action processing module implements the agent’s actions upon that environment
aimed at realizing the goals or tasks for which it was designed (e.g., generation of
An intelligent agent has an internal representation of its external environment that allows
it to reason about the environment by manipulating the elements of the representation.
For each relevant aspect of the environment, such as an object, a relation between objects,
a class of objects, a law, or an action, there is an expression in the agent’s knowledge base
that represents that aspect. For example, the left side of Figure 1.12 shows one way to
represent the simple world from the right side of Figure 1.12. The upper part is a
hierarchical representation of the objects and their relationships (an ontology). Under it
is a rule to be used for reasoning about these objects. This mapping between real entities
and their representations allows the agent to reason about the environment by manipu-
lating its internal representations and creating new ones. For example, by employing
natural deduction and its modus ponens rule, the agent may infer that cup1 is on table1.
The actual algorithm that implements natural deduction is part of the problem-solving
engine, while the actual reasoning is performed in the reasoning area (see Figure 1.11).
Since such an agent integrates many of the intelligent behaviors that we observe in humans,
it is also called a cognitive agent or system (Langley, 2012).
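A minimal Python sketch of this idea (ours, not Disciple-EBR's internal representation; the facts on(cup1, book1) and on(book1, table1) and the transitivity rule are assumed for illustration) shows how manipulating the internal representation yields the new fact that cup1 is on table1:

    # A tiny internal representation of the agent's world: facts plus one rule.
    facts = {("on", "cup1", "book1"), ("on", "book1", "table1")}

    def apply_transitivity(facts):
        """Modus ponens with the rule: on(x, y) and on(y, z) -> on(x, z)."""
        derived = set(facts)
        for (_, x, y1) in facts:
            for (_, y2, z) in facts:
                if y1 == y2:
                    derived.add(("on", x, z))
        return derived

    print(("on", "cup1", "table1") in apply_transitivity(facts))   # True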
Figure 1.12. An ontology fragment and a reasoning rule representing a simple agent world.
Most of the current AI agents, however, will not have all the components from
Figure 1.11, or some of the components will have very limited functionality. For example,
a user may speak with an automated agent (representing her Internet service provider)
that will guide her in troubleshooting the Internet connection. This agent may have
advanced speech, natural language, and reasoning capabilities, but no visual or learning
capabilities. A natural language interface to a database may have only natural language
processing capabilities, while a face recognition system may have only learning and visual
perception capabilities.
Humans are slow, sloppy, forgetful, implicit, and subjective, but they have common sense,
have intuition, and may find creative solutions in new situations. Computer agents are fast,
rigorous, precise, explicit, and objective, but they lack common sense, lack intuition, and
have poor ability to deal with new situations.
Knowledge engineering is the area of artificial intelligence that is concerned with the design,
development, and maintenance of agents that use knowledge and reasoning to perform
problem solving and decision making tasks.
Knowledge engineering is a central discipline in the Knowledge Society, the society
where knowledge is the primary production resource instead of capital and labor (Drucker,
1993). Currently, human societies are rapidly evolving toward knowledge societies and an
Integrated Global Knowledge Society because of the development of the information
technologies, the Internet, the World Wide Web, and the Semantic Web that no longer
restrict knowledge societies to geographic proximity and that facilitate the sharing, archiv-
ing, retrieving, and processing of knowledge (Schreiber et al., 2000; David and Foray, 2003;
UNESCO, 2005). Moreover, the Semantic Web, an extension of the World Wide Web in
which Web content is expressed both in a natural form for humans, and in a format that can
be understood by software agents, is becoming the main infrastructure for the Knowledge
Society, allowing knowledge-based agents to automatically find, integrate, process, and
share information (Allemang and Hendler, 2011; W3C, 2015).
The knowledge necessary to perform at such a level, plus the inference procedures used, can
be thought of as a model of the expertise of the best practitioners in that field.”
Two early and very influential expert systems were DENDRAL (Buchanan and Feigen-
baum, 1978) and MYCIN (Buchanan and Shortliffe, 1984). DENDRAL, an expert system for
organic chemistry, analyzed mass spectral data and inferred a complete structural hypoth-
esis of a molecule. MYCIN, a medical expert system, produced diagnoses of infectious
diseases and advised the physician on antibiotic therapies for treating them.
The terms expert system and knowledge-based system are often used as synonyms since all expert
systems are knowledge-based systems. However, not all knowledge-based systems
are expert systems; examples include the Watson natural language question-answering system
(Ferrucci et al., 2010) and the Siri personal assistant (2011).
Continuous advances in artificial intelligence, particularly with respect to knowledge
representation and reasoning, learning, and natural language processing, are reflected in
more and more powerful and useful knowledge-based systems that, as discussed in Section
1.5.1, are now more commonly called knowledge-based agents, or simply intelligent agents.
In this book, we are primarily concerned with a very important and newer class of
intelligent agents, namely cognitive assistants, which have the following capabilities:
Expert systems, knowledge-based agents, and cognitive assistants are used in business,
science, engineering, manufacturing, military, intelligence, and many other areas (Durkin,
1994; Giarratano and Riley, 1994; Tecuci, 1998; Tecuci et al., 2001; 2008b). They are everywhere.
The following are examples of such successful systems.
Digital Equipment Corporation’s R1 (McDermott, 1982), which helped configure orders
for new computers, is considered the first successful commercial system. By 1986, it was
saving the company an estimated $40 million a year.
Intuit’s TurboTax, an American tax preparation software package initially developed by
Michael A. Chipman of Chipsoft in the mid-1980s (Forbes, 2013), helps you prepare your
tax return with the maximum deductions allowed by law.
The Defense Advanced Research Projects Agency’s (DARPA) DART logistics planning
system was used during the Persian Gulf crisis of 1991 (Cross and Walker, 1994). It
planned with up to fifty thousand vehicles, cargo, and people, and reportedly more than
paid back DARPA’s thirty-year investment in artificial intelligence.
IBM’s Deep Blue chess playing system defeated Garry Kasparov, the chess world
champion, in 1997 (Goodman and Keene, 1997).
The Disciple-COG agent for center of gravity analysis helped senior military officers from
the U.S. Army War College to learn how to identify the centers of gravity of the opposing
forces in complex war scenarios (Tecuci et al., 2002a; 2002b; 2008b).
NASA’s planning and scheduling systems helped plan and control the operations of
NASA’s spacecraft. For example, MAPGEN, a mixed-initiative planner, was deployed as a
mission-critical component of the ground operations system for the Mars Exploration
Rover mission (Bresina and Morris, 2007).
IBM’s Watson natural language question-answering system defeated the best human
players at the quiz show Jeopardy in 2011 (Ferrucci et al., 2010).
Figure 1.13. A hierarchy of knowledge-intensive tasks, such as intelligence analysis,
interpretation, diagnosis (e.g., medical diagnosis and mechanical diagnosis), design,
configuration (e.g., computer configuration), and scheduling.
A synthetic task is one which takes as input the requirements of an object or system and
produces the corresponding object or system, as, for instance, in designing a car based on
given specifications.
Under each type of analytic or synthetic task, one may consider more specialized
versions of that task. For example, special cases of diagnosis are medical diagnosis and
mechanical diagnosis. Special cases of mechanical diagnosis are car diagnosis, airplane
diagnosis, and so on.
The importance of identifying such types of tasks is that one may create general models
for solving them (e.g., a general model of diagnosis) that could guide the development of
specialized systems (e.g., a system to diagnose Toyota cars).
Schreiber et al. (2000, pp. 123–166) present abstract problem-solving models for many
of the task types in Figure 1.13. These abstract models can provide initial guidance when
developing knowledge-based agents to perform such tasks.
Design means configuring objects under constraints, such as designing an elevator with
a certain capacity and speed, as done by the SALT system (Marcus, 1988), or designing a
computer system with certain memory, speed, and graphical processing characteristics,
based on a set of predefined components.
Planning means finding a set of actions that achieve a certain goal, such as determining
the actions that need to be performed in order to repair a bridge, as done by the Disciple-
WA agent (Tecuci et al., 2000), which will be discussed in Section 12.2. A more complex
example is collaborative emergency response planning, illustrated by Disciple-VPT
(Tecuci et al., 2008c), which is presented in Section 12.5.
Scheduling means allocating sequences of activities or jobs to resources or machines on
which they can be executed, such as scheduling the lectures in the classrooms of a
university, or scheduling the sequence of operations needed to produce an object on
the available machines in a factory.
Debugging means prescribing remedies for malfunctions, such as determining how to
tune a computer system to reduce a particular type of performance problem.
Assignment means creating a partial mapping between two sets of objects, such as
allocating offices to employees in a company or allocating airplanes to gates in an airport.
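As a flavor of such synthetic tasks, the following Python sketch (ours; the employees, offices, and constraint are invented) frames a tiny assignment problem as a search for mappings that satisfy constraints:

    from itertools import permutations

    # Assign employees to offices so that no two employees share an office and
    # a simple constraint is respected (the manager gets the corner office).
    employees = ["Ana", "Bob", "Carol"]
    offices = ["corner", "window", "interior"]

    def satisfies_constraints(assignment):
        return assignment["Ana"] == "corner"       # Ana is the manager

    solutions = []
    for perm in permutations(offices, len(employees)):
        assignment = dict(zip(employees, perm))
        if satisfies_constraints(assignment):
            solutions.append(assignment)

    print(solutions)   # every office assignment that satisfies the constraint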
Figure 1.15. Knowledge and reasoning based on problem reduction and solution synthesis.
if-then structures that indicate the conditions under which a general complex problem
(such as P1g) can be reduced to simpler problems. Other rules indicate how the solutions
of simpler problems can be combined into the solution of the more complex problem.
These rules are applied to generate the reasoning tree from the right part of Figure 1.15.
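A highly simplified Python sketch (ours; the problem names and the reduction and synthesis functions are invented) conveys how such reduction and synthesis rules generate a reasoning tree like the one in Figure 1.15:

    # Reduction rules: a complex problem is reduced to simpler subproblems.
    reductions = {"P1g": ["P2", "P3"], "P2": ["P4", "P5"]}

    # Synthesis rules: combine subproblem solutions into the parent's solution.
    def solve(problem):
        if problem not in reductions:               # elementary problem
            return f"solution({problem})"
        sub_solutions = [solve(p) for p in reductions[problem]]
        return f"combine({', '.join(sub_solutions)})"

    print(solve("P1g"))
    # combine(combine(solution(P4), solution(P5)), solution(P3))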
Figure 1.14 illustrates the conventional approach to building a knowledge-based agent.
A knowledge engineer interviews the subject matter expert to understand how the expert
reasons and solves problems, identifies the knowledge used by the expert, and then
represents it in the agent's knowledge base. For instance, the knowledge engineer may
represent the knowledge acquired from the expert as an ontology of concepts and a set of
reasoning rules expressed with these concepts, like those from Figure 1.15. Then the agent
is used to solve typical problems, and the subject matter expert analyzes the generated
solutions (e.g., the reasoning tree from the right side of Figure 1.15), and often the
knowledge base itself, to identify errors. Referring to the identified errors, the knowledge
engineer corrects the knowledge base.
After more than two decades of work on expert systems, Edward Feigenbaum (1993)
characterized the knowledge-based technology as a tiger in a cage:
The systems offer remarkable cost savings; some dramatically “hot selling”
products; great return-on-investment; speedup of professional work by
factors of ten to several hundred; improved quality of human decision making
(often reducing errors to zero); and the preservation and “publishing” of
knowledge assets of a firm. . . . These stories of successful applications,
repeated a thousand fold around the world, show that knowledge-based
technology is a tiger. Rarely does a technology arise that offers such a wide
range of important benefits of this magnitude. Yet as the technology moved
through the phase of early adoption to general industry adoption, the
response has been cautious, slow, and “linear” (rather than exponential).
The main reason for this less than exponential growth of expert systems lies in the
difficulty of capturing and representing the knowledge of the subject matter expert in
the system’s knowledge base. This long, difficult, and error-prone process is known as
the “knowledge acquisition bottleneck” of the system development process. But why is
The accompanying figure depicts the evolution from software systems developed and used by
computer experts (mainframe computers), to software systems developed by computer experts
and used by persons who are not computer experts (personal computers), to software systems
developed and used by persons who are not computer experts (cognitive assistants and the
Semantic Web), where typical computer users will be able to both develop and use special
types of software agents.
The learning agent technology illustrated by the Disciple approach attempts to
change the way the knowledge-based agents are built, from “being programmed” by a
knowledge engineer to “being taught” by a user who does not have prior knowledge
engineering or computer science experience. This approach will allow typical computer
users, who are not trained knowledge engineers, to build by themselves cognitive
assistants. Thus, non–computer scientists will no longer be only users of generic pro-
grams developed by others (such as word processors or Internet browsers), as they are
today, but also agent developers themselves. They will be able to train their cognitive
assistants to help them with their increasingly complex tasks in the Knowledge Society,
which should have a significant beneficial impact on their work and life. This goal is
consistent with the Semantic Web vision of enabling typical users to author Web content
that can be understood by automated agents (Allemang and Hendler, 2011; W3C, 2015).
Bill Gates has also stressed the great potential and importance of software assistants
(Simonite, 2013).
Because the subject matter expert teaches a Disciple agent similarly to how the expert
would teach a student, through explicit examples and explanations, a trained Disciple
agent can be used as an assistant by a student, learning from the agent’s explicit reasoning.
Alternatively, Disciple may behave as a tutoring system, guiding the student through a
series of lessons and exercises. Educational Disciple agents have been developed for
intelligence analysis (Tecuci et al., 2011a, 2011b) and for center of gravity determination
(Tecuci et al., 2008b). Thus the Disciple agents also contribute to advancing “Personalized
Learning,” which is one of the fourteen Grand Challenges for the Twenty-first Century
identified by the U.S. National Academy of Engineering (NAE, 2008).
Le 2008; Marcu 2009), and several books (e.g., Tecuci, 1998; Tecuci et al., 2008b). Some of
the most representative implementations of this evolving theory and technology are
discussed in Chapter 12. The rest of this book, however, focuses on the most recent
advances of this theory and technology that enables the development of Disciple agents
for evidence-based reasoning (EBR) tasks such as those introduced in Section 1.4.3.
The corresponding agent development environment is called Disciple-EBR, which can
be used by a subject matter expert, with support from a knowledge engineer, to develop a
knowledge-based agent incorporating his or her expertise.
Disciple-EBR (the Disciple learning agent shell for evidence-based reasoning) will be
used throughout this book to explain knowledge engineering concepts, principles, and
methods using a hands-on approach. It will also be the software environment used in the
agent design and development project.
There is also a reduced version of Disciple-EBR, called Disciple-CD (the Disciple
cognitive assistant for “Connecting the Dots”). This version was created for the end-user
who has no knowledge engineering experience and receives no support from a
knowledge engineer (Tecuci et al., 2014). Therefore, when using Disciple-CD, the user
does not have access to any Disciple-EBR module that may require any kind of
knowledge engineering support, such as Ontology Development, Rule Learning, or
Rule Refinement.
We have written a book for intelligence analysis courses, titled Intelligence Analysis as
Discovery of Evidence, Hypotheses, and Arguments: Connecting the Dots (Tecuci et al.,
2016), which uses Disciple-CD. This is because Disciple-CD incorporates a significant
amount of knowledge about evidence and its properties, uses, and discovery to help the
students acquire the knowledge, skills, and abilities involved in discovering and processing
evidence and in drawing defensible and persuasive conclusions from it, by employing
an effective learning-by-doing approach. The students can practice and learn how to link
evidence to hypotheses through abductive, deductive, and inductive reasoning that estab-
lish the basic credentials of evidence: its relevance, believability or credibility, and infer-
ential force or weight. They can experiment with “what-if” scenarios and can study the
influence of various assumptions on the final result of analysis. So, their learning experi-
ence will be a joint venture involving the intelligence analysis book together with their
interaction with Disciple-CD.
Disciple-CD is a significant improvement over an earlier system that we have
developed for intelligence analysis, called TIACRITIS (Teaching Intelligence Analysts
Critical Thinking Skills), and it subsumes all the reasoning and learning capabilities
of TIACRITIS that have been described in several papers (Tecuci et al., 2010b;
2011a; 2011b).
Disciple-EBR (Disciple, for short) is a learning agent shell for evidence-based reasoning. It
is a research prototype implemented in Java and tested on PC. Disciple-EBR is a stand-
alone system that needs to be installed on the user’s computer.
For installation requirements and to download the system, visit http://lac.gmu.edu/
KEBook/Disciple-EBR/. At this address, you will also find instructions on how to install
and uninstall Disciple-EBR, a section with frequently asked questions (FAQs), and a
section that allows users to submit error reports to the developers of the system.
1.1. Consider the following illustrations of the concepts data, information, and
knowledge:
Data: the color red.
Information: red tomato.
Knowledge: If the tomato is red, then it is ripe.
Data: the sequence of dots and lines “. . .–. . .”
Information: the “S O S” emergency alert.
Knowledge: If there is an emergency alert, then start rescue operations.
Provide two other illustrations of these concepts.
1.5. Give an example of a fact F and of evidence about F. In general, what is the
difference between a fact and evidence about that fact?
1.6. Formulate a hypothesis. Indicate an item of evidence that favors this hypothesis, an
item of evidence that disfavors this hypothesis, and an item of information that is
not evidence for this hypothesis.
1.8. What is abduction? Give an example of abductive reasoning. Provide other explan-
ations or hypotheses that are less plausible. Specify a context where one of these
alternative explanatory hypotheses would actually be more plausible.
1.9. A doctor knows that the disease hepatitis causes the patient to have yellow eyes
90 percent of the time. The doctor also knows that the probability that a patient has
hepatitis is one in one hundred thousand, and the probability that any patient has
yellow eyes is one in ten thousand. What is the probability that a patient with
yellow eyes has hepatitis?
1.10. Suppose that in answering a multiple-choice test question with five choices, a
student either knows the answer, with probability p, or she guesses it with prob-
ability 1 – p. Assume that the probability of answering a question correctly is 1 for a
student who knows the answer. If the student guesses the answer, she chooses one
of the options with equal probability. What is the probability that a student knew
the answer, given that she answered it correctly? What is this probability in the case
of a true-false test question?
1.11. Consider a hypothesis H and its negation ¬H. Suppose you believe, before you
have any evidence, that the prior probability of H is P(H) = 0.30. Now you receive
an item of evidence E* and ask yourself how likely is this evidence E* if H were true,
and how likely is this evidence E* if H were not true. Suppose you say that
P(E*|H) = 0.70 and P(E*|¬H) = 0.10. What are your prior odds Odds(H : ¬H),
and how have these odds changed as a result of the evidence E*? What is the
posterior probability P(H|E*)?
1.12. Suppose in Question 1.11 you said that the prior probability of H is P(H) = 0.20
and the posterior probability P(H|E*) = 0.95. What would be the force of evidence
E* (i.e., the likelihood ratio LE*) that is implied by these assessments you
have made?
1.13. Think back to the very first time you were ever tutored about probability, what it
means, and how it is determined. What were you told about these matters? Then
describe your present views about these probability matters.
1.14. As we noted, the subjective Bayesian view of probability lets us assess prob-
abilities for singular, unique, or one-of-a-kind events, provided that our
assessed probabilities obey the three Kolmogorov axioms we discussed
regarding enumerative probabilities. First, is there any way of showing that
these axioms for enumerative probabilities also form the basis for ideal or
optimal probability assessments in the nonenumerative case? Second, can
this really be the rational basis for all probability assessments based on
evidence?
1.15. Show how Bayes’ rule supplies no method for incorporating “pure evidence” as
does the Belief Function system.
(1) Premise One: (a) All the beans from this bag are white.
Premise Two: (b) These beans are from this bag.
Conclusion: (c) These beans are white.
(2) Premise One: (b) These beans are from this bag.
Premise Two: (c) These beans are white.
Conclusion: (a) All the beans from this bag are white.
(3) Premise One: (a) All the beans from this bag are white.
Premise Two: (c) These beans are white.
Conclusion: (b) These beans are from this bag.
Deductive Inference
Premise One:
Premise Two:
Conclusion:
Inductive Inference
Premise One:
Premise Two:
Conclusion:
Abductive Inference
Premise One:
Premise Two:
Conclusion:
1.21. Give an example of an observation and of several hypotheses that would explain it.
1.24. Describe the generic architecture of an intelligent agent and the role of each main
component.
1.25. Which are two main types of knowledge often found in the knowledge base of
an agent?
1.26. What are some of the complementary abilities of humans and computer agents?
1.27. What would be a good mixed-initiative environment for problem solving and
decision making? What are some key requirements for such an environment?
1.28. How do assumptions enable mixed-initiative problem solving? How do they enable
problem solving in the context of incomplete information?
1.30. Which are some other examples of the analytic tasks introduced in Section
1.6.2.1?
1.31. Which are some other examples of the synthetic tasks introduced in Section
1.6.2.2?
2 Evidence-based Reasoning: Connecting the Dots
The “connecting the dots” metaphor seems appropriate for characterizing evidence-based
reasoning. This metaphor may have gained its current popularity following the terrorist
attacks in New York City and Washington, D.C., on September 11, 2001. It was frequently
said that the intelligence services did not connect the dots appropriately in order to have
possibly prevented the catastrophes that occurred. Since then, we have seen and heard
this metaphor applied in the news media to inferences in a very wide array of contexts, in
addition to intelligence, including legal, military, and business contexts. For example,
we have seen it applied to allegedly faulty medical diagnoses; to allegedly faulty conclu-
sions in historical studies; to allegedly faulty or unpopular governmental decisions; and in
discussions involving the conclusions reached by competing politicians. What is also true
is that the commentators on television and radio, or the sources of written accounts of
inferential failures, never tell us what they mean by the phrase “connecting the dots.”
A natural explanation is that they have never even considered what this phrase means
and what it might involve.
But we have made a detailed study of what “connecting the dots” entails. We have
found this metaphor very useful, and quite intuitive, in illustrating the extraordinary
complexity of the evidential and inferential reasoning required in the contexts we have
mentioned. Listening to or seeing some media accounts of this process may lead one to
believe that it resembles the simple tasks we performed as children when, if we connected
some collection of numbered dots correctly, a figure of Santa Claus, or some other familiar
figure, would emerge. Our belief is that critics employing this metaphor in criticizing
intelligence analysts and others have very little awareness of how astonishingly difficult the
process of connecting unnumbered dots can be in so many contexts (Schum, 1987).
A natural place to begin our examination is by trying to define what is meant by the
metaphor "connecting the dots" when it is applied to evidence-based reasoning tasks:
“Connecting the Dots” refers to the task of marshaling thoughts and evidence in the
generation or discovery of productive hypotheses and new evidence, and in the construction
of defensible and persuasive arguments on hypotheses we believe to be most favored by the
evidence we have gathered and evaluated.
The following represents an account of seven complexities in the process of “connect-
ing the dots.”
Thus, the second type of dot concerns ideas we have about how some evidential dots
are connected to matters we are trying to prove or disprove. We commonly refer to the
matters to be proved or disproved as hypotheses. Hypotheses commonly refer to possible
alternative conclusions we could entertain about matters of interest in an analysis. The
other dots, which we call idea dots, come in the form of links in chains of reasoning or
arguments we construct to link evidential dots to hypotheses. Of course, hypotheses are
also ideas. Each of these idea dots refers to sources of uncertainty or doubt we believe to
be interposed between our evidence and our hypotheses. This is precisely where imagina-
tive reasoning is involved. The essential task for the analyst is to imagine what evidential
dots mean as far as hypotheses or possible conclusions are concerned. Careful critical
reasoning is then required to check on the logical coherence of sequences of idea dots in
our arguments or chains of reasoning. In other words, does the meaning we have attached
to sequences of idea dots make logical sense?
So, a basic inference we encounter is whether or not E did occur based on our evidence
E*i. Clearly, this inference rests upon what we know about the believability of the source of E*i.
There are some real challenges here in discussing the believability of this source. Section 4.7
of this book is devoted to the task of assessing the believability of the sources of our
evidence. As we will see, Disciple-EBR (as well as Disciple-CD and TIACRITIS, introduced
in Section 1.6.3.3) already knows much about this crucial task.
But there are even distinctions to be made in what we have called evidential dots. Some
of these dots arise from objects we obtain or from sensors that supply us with records or
images of various sorts. So one major kind of evidential dot involves what we can call
tangible evidence that we can observe for ourselves to see what events it may reveal. In
many other cases, we have no such tangible evidence but must rely upon the reports of
human sources who allegedly have made observations of events of interest to us. Their
reports to us come in the form of testimonial evidence or assertions about what they have
observed. Therefore, an evidential dot E*i can be one of the following types:
Tangible evidence, such as objects of various kinds, or sensor records such as those
obtained by signals intelligence (SIGINT), imagery intelligence (IMINT), measurement
and signature intelligence (MASINT), and other possible sources
Testimonial evidence obtained from human sources (HUMINT)
The origin of one of the greatest challenges in assessing the believability of evidence
is that we must ask different questions about the sources of tangible evidence than
those we ask about the sources of testimonial evidence. Stated another way, the
believability attributes of tangible evidence are different from the believability attri-
butes of testimonial evidence. Consider again the evidential dot concerning the two
men carrying backpacks. This is an example of tangible evidence. We can all examine
this videotape to our heart’s content to see what events it might reveal. The most
important attribute of tangible evidence is its authenticity: Is this evidential dot what
it is claimed to be? The FBI claims that this videotape was recorded on April 15, 2013,
on Boylston Street in Boston, Massachusetts, where the bombings occurred, and
recorded before the bombings occurred. Our imaginations are excited by this claim
and lead to questions such as those that would certainly arise in the minds of defense
attorneys during trial. Was this videotape actually recorded on April 15, 2013? Maybe it
was recorded on a different date. If it was recorded on April 15, 2013, was it recorded
before the bombings occurred? Perhaps it was recorded after the bombings occurred.
And, was this videotape actually recorded on Boylston Street in Boston, Massachu-
setts? It may have been recorded on a different street in Boston, or perhaps on a street
in a different city.
But there is another difficulty that is not always recognized that can cause endless
trouble. While, in the case of tangible evidence, believability and credibility may be
considered as equivalent terms, human sources of evidence have another characteristic
apart from credibility; this characteristic involves their competence. As we discuss in
Section 4.7.2, the credibility and competence characteristics of human sources must not
be confused; to do so invites inferential catastrophes, as we will illustrate. The questions
required to assess human source competence are different from those required to assess
human source credibility. Competence requires answers to questions concerning the
source’s actual access to, and understanding of, the evidence he or she reports. Credibil-
ity assessment for a testimonial source requires answers to questions concerning the
veracity, objectivity, and observational sensitivity or accuracy of the source. Disciple-EBR
knows the credibility-related questions to ask of tangible evidence and the competence-
and credibility-related questions to ask of HUMINT sources.
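These distinctions can be pictured as simple data structures. The following Python sketch is ours (it is not Disciple-EBR's representation); it merely records the believability attributes named above:

    from dataclasses import dataclass

    @dataclass
    class TangibleEvidence:
        description: str
        authenticity: str        # is the item what it is claimed to be?

    @dataclass
    class TestimonialEvidence:
        description: str
        # competence attributes of the human source
        access: str              # did the source actually have access to the event?
        understanding: str       # did the source understand what was observed?
        # credibility attributes of the human source
        veracity: str
        objectivity: str
        observational_sensitivity: str

    video = TangibleEvidence("Videotape of two men carrying backpacks",
                             authenticity="to be assessed")
    report = TestimonialEvidence("Report from a HUMINT source about an event of interest",
                                 access="to be assessed", understanding="to be assessed",
                                 veracity="to be assessed", objectivity="to be assessed",
                                 observational_sensitivity="to be assessed")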
There is no better way of illustrating the importance of evidence believability assess-
ments than to show how such assessments form the very foundation for all arguments we
make from evidence to possible conclusions. In many situations, people will mistakenly
base inferences on the assumption that an event E has occurred just because we have
evidence E*i from some source. This amounts to the suppression of any uncertainty we have
about the believability of that source (whatever this source might be). In Figure 2.2 is a
simple example illustrating this believability foundation; it will also allow us to introduce
the next problem in connecting the dots.
What this figure shows is an argument from evidence E*i as to whether or not hypoth-
esis H is true. As shown, the very first stage in this argument concerns an inference about
whether or not event E actually occurred. This is precisely where we consider whatever
evidence we may have about the believability of the source. We may have considerable
uncertainty about whether or not event E occurred. All subsequent links in this argument
concern the relevance of event E on hypothesis H. As we noted in Figure 2.1, these
relevance links connect the idea dots we discussed. As Figure 2.2 shows, each idea dot is
a source of uncertainty associated with the logical connection between whether or not
event E did occur and whether or not H is true. Consideration of these relevance links is
our next problem in connecting the dots.
which come to nothing. However, from several civilian flying schools in the United States
came word (to the FBI) that persons from the Middle East were taking flying lessons,
paying for them in cash, and wanting only to learn how to steer and navigate heavy aircraft
but not how to make takeoffs and landings in these aircraft. By itself, this information,
though admittedly strange, may not have seemed very important. But, taken together,
these two items of information might have caused even Inspector Lestrade (the rather
incompetent police investigator in Sherlock Holmes stories) to generate the hypothesis
that there would be attacks on the World Trade Center using hijacked airliners. The
hijackers would not need to learn how to make takeoffs; the aircraft’s regular pilots would
do this. There would be no need for the hijackers to know how to land aircraft, since no
landings were intended, only crashes into the World Trade Center and the Pentagon. Why
were these two crucial items of information not considered together? The answer seems to
be that they were not shared among relevant agencies. Information not shared cannot
be considered jointly, with the result that their joint inferential impact could never have
been assessed. For all time, this may become the best (or worst) example of failure to
consider evidence items together. Even Sherlock Holmes would perhaps not have inferred
what happened on September 11, 2001, if he had not been given these two items of
information together.
The problem, however, is that here we encounter a combinatorial explosion, since the
number of possible combinations of two or more evidential dots is exponentially related to
the number of evidential dots we are considering. Suppose we consider having some
number N of evidential dots. We ask the question: How many combinations C of two or
more evidential dots are there when we have N evidential dots? The answer is given by the
following expression: C = 2^N – (N + 1). This expression by itself does not reveal how quickly
this combinatorial explosion takes place. Here are a few examples showing how quickly
C mounts up with increases in N:
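A minimal Python sketch (ours; the particular values of N are chosen only for illustration) computes C = 2^N – (N + 1) and shows how rapidly it grows:

    # Number of combinations of two or more evidential dots among N dots.
    def combinations_of_two_or_more(n: int) -> int:
        return 2**n - (n + 1)

    for n in (5, 10, 25, 50, 100):
        print(n, combinations_of_two_or_more(n))
    # 5 -> 26; 10 -> 1,013; 25 -> 33,554,406; 50 and 100 are astronomically large.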
There are several important messages in this combinatorial analysis for evidence-based
reasoning. The first concerns the size of N, the number of potential evidential dots that
might be connected. Given the array of sensing devices and human observers available,
the number N of potential evidential dots is as large as you wish to make it. In most
analyses, N would certainly be greater than one hundred and would increase as time
passes. Remember that we live in a nonstationary world in which things change and we
find out about new things all the time. So, in most cases, even if we had access to the
world’s fastest computer, we could not possibly examine all possible evidential dot
combinations, even when N is quite small.
Second, trying to examine all possible evidential dot combinations would be the act of
looking through everything with the hope of finding something. This would be a silly thing to
do, even if it were possible. The reason, of course, is that most of the dot combinations
would tell us nothing at all. What we are looking for are combinations of evidential dots that
interact or are dependent in ways that suggest new hypotheses or possible conclusions. If we
would examine these dots separately or independently, we would not perceive these new
possibilities. A tragic real-life example is what happened on September 11, 2001.
Figure 2.3 is an abstract example involving four numbered evidential dots. The
numbers might indicate the order in which we obtained them. In part (a) of the figure,
we show an instance where these four dots have been examined separately or independ-
ently, in which case they tell us nothing interesting. Then someone notices that, taken
together, these four dots combine to suggest a new hypothesis HK that no one has thought
about before, as shown in part (b) of the figure. What we have here is a case of evidential
synergism in which two or more evidence items mean something quite different when they
are examined jointly than they would mean if examined separately or independently. Here
we come to one of the most interesting and crucial evidence subtleties or complexities that
have, quite frankly, led to intelligence failures in the past: failure to identify and exploit
evidential synergisms. We will address this matter in other problems we mention concern-
ing connecting the dots.
It might be said that the act of looking through everything in the hope of finding
something is the equivalent of giving yourself a prefrontal lobotomy, meaning that you
are ignoring any imaginative capability you naturally have concerning which evidential
dot combinations to look for in your analytic problem area. What is absolutely crucial
in selecting dot combinations to examine is an analyst’s experience and imaginative
reasoning capabilities. What we should like to have is a conceptual “magnet” that we
could direct at a base of evidential dots that would “attract” interesting and important
dot combinations.
forming this linkage come in the form of propositions or statements indicating possible
sources of doubt or uncertainty in the imagined linkage between the item of information
and hypotheses being considered. For a simple example, look again at Figure 2.2 (p. 49),
where we show a connection between evidence E*i and hypothesis H. An analyst has an
item of information from a source concerning the occurrence of event E that sounds very
interesting. This analyst attempts to show how event E, if it did occur, would be relevant in
an inference about whether hypothesis H is true or not. So the analyst forms the following
chain of reasoning involving idea dots. The analyst says, “If event E were true, this would
allow us to infer that event F might be true, and if F were true, this would allow us to infer
that event G might be true. Finally, if event G were true, this would make hypothesis
H more probable.” If this chain of reasoning is defensible, the analyst has established the
relevance of evidence E*i on hypothesis H.
In forming this argument, the analyst wisely begins with the believability foundation for
this whole argument: Did event E really occur just because the source says it did? Also notice
in Figure 2.2 that we have indicated the uncertainty associated with each idea dot in this
argument. For example, the analyst only infers from E that F might have occurred and so
we note that we must consider F and ‘not F’ as possibilities. The same is true for the other
idea dot G and for the hypothesis dot H.
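A small Python sketch (ours, purely illustrative) represents such a chain of idea dots, each step carrying its own source of doubt, between the evidential dot E*i and the hypothesis H:

    # The relevance argument of Figure 2.2 as a chain of linked idea dots.
    # Each link records the event inferred and the doubt attached to that step.
    argument_chain = [
        {"from": "E*i", "to": "E", "question": "Did E occur, given the source's report?"},
        {"from": "E",   "to": "F", "question": "If E occurred, did F occur?"},
        {"from": "F",   "to": "G", "question": "If F occurred, did G occur?"},
        {"from": "G",   "to": "H", "question": "If G occurred, is H more probable?"},
    ]

    # A defensible argument has no disconnects: each step starts where the
    # previous one ended.
    links_ok = all(a["to"] == b["from"]
                   for a, b in zip(argument_chain, argument_chain[1:]))
    print(links_ok)   # True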
There are several important things to note about relevance arguments; the first concerns
their defense. Suppose the argument in Figure 2.2 was constructed by one analyst, who shows this
argument to a second analyst, who can have an assortment of quibbles about it. Suppose
the second analyst says, “You cannot infer F directly from E; you need another step here involving event
K. From E you can infer that K occurred, and then if K occurred, you can infer F.” Now
comes a third analyst, who also listens to the argument and says, “I think your whole argument is
wrong. I see a different reasoning route from E to hypothesis H. From E we can infer event R,
and from R we can infer event S, and from S we can infer T, which will show that hypothesis
H is less probable.” Whether or not there is any final agreement about the relevance of
evidence E*i, the first analyst has performed a real service by making the argument open and
available for discourse and criticism by colleagues. There are several important messages here.
First, there is no such thing as a uniquely correct argument from evidence to hypoth-
eses. What we all try to avoid are disconnects or non sequiturs in the arguments we
construct. But even when we have an argument that has no disconnects, someone may be
able to come up with a better argument. Second, we have considered only the simplest
possible situation in which we used just a single item of potential evidence. But intelli-
gence analysis and other evidence-based reasoning tasks are based on masses of evidence
of many different kinds and from an array of different sources. In this case, we are obliged
to consider multiple lines of argument that can be connected in different ways. It is
customary to call these complex arguments inference networks.
From years of experience teaching law students to construct defensible and persuasive
arguments from evidence, we have found that most of them often experience difficulty in
constructing arguments from single items of evidence; they quickly become overwhelmed
when they are confronted with argument construction involving masses of evidence.
But they gain much assistance in such tasks by learning about argument construction
methods devised nearly a hundred years ago by a world-class evidence scholar named
John H. Wigmore (1863–1943). Wigmore (1913, 1937) was the very first person to study
carefully what today we call inference networks. We will encounter Wigmore’s work in
several places in our discussions, and you will see that Disciple-EBR employs elements of
Wigmore’s methods of argument construction.
There is also a message here for critics, such as newswriters and the talking heads on
television. These critics always have an advantage never available to practicing intelligence
analysts. Namely, they know how things turned out or what actually happened in some
previously investigated matter. In the absence of clairvoyance, analysts studying a problem
will never know for sure, or be able to predict with absolute certainty, what will happen in
the future. A natural question to ask these critics is, “What arguments would you have
constructed if all you knew was what the analysts had when they made their assessments?”
This would be a very difficult question for them to answer fairly, even if they were given
access to the classified evidence the analysts may have known at the time.
complex and possibly interrelated arguments. The mind boggles at the enormity of the
task of assessing the force or weight of a mass of evidence commonly encountered in
intelligence analysis when we have some untold numbers of sources of believability and
relevance uncertainties to assess and combine (Schum, 1987). We are certain that critics of
intelligence analysts have never considered how many evidential and idea dots there
would be to connect.
So, the question remains: How do we assess and combine the assorted uncertainties in
complex arguments in intelligence analysis and in any other context in which we have the
task of trying to make sense out of masses of evidence? Here is where controversies arise.
The problem is that there are several quite different views among probabilists about what
the force or weight of evidence means and how it should be assessed and combined across
evidence in either simple or complex arguments: Bayesian, Belief Functions, Baconian,
and Fuzzy (Schum, 1994[2001a]). Each of these views has something interesting to say,
but no one view says it all, as discussed in Section 1.3.6.
Later in this book, we will discuss how Disciple-EBR allows you to assess and combine
probabilistic judgments in situations in which many such judgments are required. There is
further difficulty as far as judgments of the weight or force of evidence are concerned.
Analysts, or teams of analysts, may agree about the construction of an argument but
disagree, often vigorously, about the extent and direction of the force or weight this
argument reveals. There may be strong disagreements about the believability of sources
of evidence or about the strength of relevance linkages. These disagreements can be
resolved only when arguments are made carefully and are openly revealed so that they
can be tested by colleagues. A major mission of Disciple-EBR is to allow you to construct
arguments carefully and critically and encourage you to share them with colleagues so that
they can be critically examined.
There is one final matter of interest in making sense out of masses of evidence and
complex arguments. Careful and detailed argument construction might seem a very
laborious task, no matter how necessary it is. Now consider the task of revealing the
conclusions resulting from an analysis to some policy-making “customer” who has deci-
sions to make that rest in no small part on the results of an intelligence analysis. What this
customer will probably not wish to see is a detailed inference network analysis that
displays all of the dots that have been connected and the uncertainties that have been
assessed and combined in the process. A fair guess is that this customer will wish to have a
narrative account or a story about what the analysis predicts or explains. In some cases,
customers will require only short and not extensive narratives. This person may say, “Just
tell me the conclusions you have reached and briefly why you have reached them.” So the
question may be asked: Why go to all the trouble to construct defensible and persuasive
arguments when our customers may not wish to see their details?
There is a very good answer to the question just raised. Your narrative account of an
analysis must be appropriately anchored on the evidence you have. What you wish to be able
to tell is a story that you believe contains some truth; that is, it is not just a good story. The
virtue of careful and critical argument construction is that it will allow you to anchor your
narrative not only on your imagination, but also on the care you have taken to subject your
analysis to critical examination. There is no telling what questions you might be asked about
your analysis. Rigor in constructing your arguments from your evidence is the best protec-
tion you have in dealing with customers and other critics who might have entirely different
views regarding the conclusions you have reached. Disciple-EBR is designed to allow you
and others to evaluate critically the arguments you have constructed.
Figure: an analyst’s insight connects the evidential dot E* (the article reporting the cesium-137
canister missing) to the hypothesis H (a dirty bomb will be set off in the Washington, D.C., area).
But let us assume that the cesium-137 canister is indeed missing. Then it is possible
that it was stolen. But it is also possible that it was misplaced, or perhaps it was used in a
project at the XYZ Company without being checked out from the warehouse.
However, let us assume that the cesium-137 canister was indeed stolen. It is then
possible that it might have been stolen by a terrorist organization, but it is also possible
that it might have been stolen by a competitor or by an employee, and so on.
This is the process of evidence in search of hypotheses, shown in the left side of Figure 1.9
(p. 27). You cannot conclude that a dirty bomb will be set off in the Washington, D.C., area
(i.e., hypothesis H5) until you consider all the alternative hypotheses and show that those
on the chain from E* to H5 are actually more likely than their alternatives. But to analyze all
these alternative hypotheses and make such an assessment, you need additional evidence.
How can you get it? As represented in the middle of Figure 1.9, you put each hypothesis to
work to guide you in the collection of additional evidence. This process is discussed in the
next section.
Figure: evidence in search of hypotheses — the evidential dot E* (the article on the missing
cesium-137 canister) abductively suggests H1 (the canister is missing), which possibly suggests
H2 (it was stolen), which possibly suggests H3 (it was stolen by a terrorist organization).
[Figure 2.7: On the left, the abductive chain from E* (article on the missing cesium-137 canister) to H3 (stolen by terrorist organization), with the alternatives H’3 (stolen by competitor) and H”3 (stolen by employee). On the right, the deductive decomposition of H1 (missing) into H11 (canister was in the warehouse), H12 (canister is not in the warehouse), and H13 (canister was not checked out from the warehouse), and their inductive assessment (almost certain, very likely, and very likely, respectively) from the relevance, believability, and inferential force of items of evidence such as Ralph’s report that the canister is registered as being in the warehouse, was not checked out by anyone at the XYZ Company, is not located anywhere in the hazardous materials locker, and that the lock on the locker appears to have been forced.]
Table 2.2 Evidence Collection Tasks Obtained from the Analysis in Figure 2.7
Collection Task1: Look for evidence that the cesium-137 canister was in the XYZ warehouse before
being reported as missing.
Collection Task2: Look for evidence that the cesium-137 canister is no longer in the XYZ
warehouse.
Collection Task3: Look for evidence that the cesium-137 canister was not checked out from the
XYZ warehouse.
Table 2.3 Information Obtained through the Collection Tasks in Table 2.2
INFO-002-Ralph: Ralph, the supervisor of the warehouse, reports that the cesium-137 canister is
registered as being in the warehouse and that no one at the XYZ Company had checked it out, but it is
not located anywhere in the hazardous materials locker. He also indicates that the lock on the
hazardous materials locker appears to have been forced.
Table 2.4 Dots or Items of Evidence Obtained from Willard and Ralph
E001-Willard: Willard’s report in the Washington Post that a canister containing cesium-137 was
missing from the XYZ warehouse in Baltimore, MD.
E002-Ralph: Ralph’s testimony that the cesium-137 canister is registered as being in the XYZ
warehouse.
E003-Ralph: Ralph’s testimony that no one at the XYZ Company had checked out the cesium-137
canister.
E004-Ralph: Ralph’s testimony that the canister is not located anywhere in the hazardous materials
locker.
E005-Ralph: Ralph’s testimony that the lock on the hazardous materials locker appears to have
been forced.
at hand. Consider, for example, the information provided by Willard in his Washington
Post article. You parse it to extract the relevant information represented as E001-Willard in
Table 2.4. Similarly, Ralph’s testimony from Table 2.3 provides you with several dots or
items of evidence that are relevant to assessing the hypotheses from Figure 2.7. These
items of evidence are represented in Table 2.4.
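To make the structure of such items of evidence concrete, here is a minimal Python sketch of how a dot like E003-Ralph might be represented as data. The field names are illustrative assumptions, not the actual Disciple-EBR representation.

```python
# Illustrative sketch only; field names are assumptions, not Disciple-EBR's format.
from dataclasses import dataclass

@dataclass
class EvidenceItem:
    item_id: str      # e.g., "E003-Ralph"
    source: str       # who or what provided the item (testimony, article, etc.)
    description: str  # the parsed content relevant to the analysis
    bears_on: str     # the hypothesis to which the item is attached

e003 = EvidenceItem(
    item_id="E003-Ralph",
    source="Ralph",
    description="No one at the XYZ Company had checked out the cesium-137 canister.",
    bears_on="H13: cesium-137 canister was not checked out from the warehouse",
)
```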
This is the process of hypotheses in search of evidence that guides you in collecting new
evidence. The next step now is to assess the probability of hypothesis H1 based on the
collected evidence, as represented in the right-hand side of Figure 1.9 (p. 27), and
discussed in the next section.
The probabilities of the hypotheses will be assessed by using probabilities that are expressed in words rather than in numbers. In particular,
we will use the ordered symbolic probability scale from Table 2.5. This is based on a
combination of ideas from the Baconian and Fuzzy probability systems (Schum, 1994
[2001a], pp. 243–269). As in the Baconian system, “no support” for a hypothesis means that
we have no basis to consider that the hypothesis might be true. However, we may later
find evidence that may make us believe that the hypothesis is “very likely,” for instance.
To assess the hypotheses, you first need to attach each item of evidence to the
hypothesis to which it is relevant, as shown in the right side of Figure 2.7. Then you need
to establish the relevance and the believability of each item of evidence, which will result in
the inferential force of that item of evidence on the corresponding hypothesis, as illustrated
in the right side of Figure 2.7 and explained in the following.
So let us consider the hypothesis “H13: cesium-137 canister was not checked out from
the warehouse” and the item of evidence “E003-Ralph: Ralph’s testimony that no one at
the XYZ Company had checked out the cesium-137 canister.”
Relevance answers the question: So what? How does E003-Ralph bear on the hypothesis
H13 that you are trying to prove or disprove? If you believe what E003-Ralph is telling us,
then H13 is “certain.”
Believability answers the question: To what extent can you believe what E003-Ralph is
telling you? Let us assume this to be “very likely.”
Inferential force or weight answers the question: How strong is E003-Ralph in favoring
H13? Obviously, an item of evidence that is not relevant to the considered hypothesis will
have no inferential force on it and will not convince you that the hypothesis is true. An
item of evidence that is not believable will have no inferential force either. Only an item
of evidence that is both very relevant and very believable will make you believe that
the hypothesis is true. In general, the inferential force of an item of evidence (such as
E003-Ralph) on a hypothesis (such as H13) is the minimum of its relevance and its
believability. You can therefore conclude that, based on E003-Ralph, the probability of
the hypothesis H13 is “very likely” (i.e., the minimum of “certain” and “very likely”), as
shown in Figure 2.7.
Notice in Figure 2.7 that there are two items of evidence that are relevant to the
hypothesis H12. In this case, the probability of H12 is the result of the combined (max-
imum) inferential force of these two items of evidence.
Once you have the assessments of the hypotheses H11, H12, and H13, the assessment of
the hypothesis H1 is obtained as their minimum, because these three subhypotheses are
necessary and sufficient conditions for H1. Therefore, all need to be true in order for H1 to
be true, and H1 is as weak as its weakest component.
Thus, as shown at the top-right side of Figure 2.7, you conclude that it is “very likely”
that the cesium-137 canister is missing from the warehouse.
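The min/max calculus just described can be summarized in a few lines of code. The following Python sketch only illustrates the combination rules stated in the text (inferential force as the minimum of relevance and believability, multiple favoring items combined by maximum, necessary subhypotheses combined by minimum); the function names and the list-based encoding of the scale are assumptions, not Disciple-EBR's implementation.

```python
# Ordered symbolic probability scale from Table 2.5 (weakest to strongest).
SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]

def weakest(*assessments):
    """Combine assessments that must all hold (relevance and believability,
    or necessary subhypotheses): the result is as weak as the weakest one."""
    return min(assessments, key=SCALE.index)

def strongest(*assessments):
    """Combine the inferential force of several favoring items of evidence
    bearing on the same hypothesis: take the strongest one."""
    return max(assessments, key=SCALE.index)

# Inferential force of E003-Ralph on H13 = min(relevance, believability).
h13 = weakest("certain", "very likely")          # -> "very likely"

# H12 is supported by two items of evidence, combined by max (illustrative values).
h12 = strongest("very likely", "likely")         # -> "very likely"

# H1 requires H11, H12, and H13, so it is as weak as its weakest component.
h11 = "almost certain"
h1 = weakest(h11, h12, h13)                      # -> "very likely"
print(h1)
```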
Notice that this is a process of multi-intelligence fusion since, in general, the assessment
of a hypothesis involves fusing different types of evidence.
Figure 2.8 summarizes the preceding analysis, which is an illustration of the general
framework from Figure 1.9 (p. 27).
Table 2.5 Ordered Symbolic Probability Scale: no support < likely < very likely < almost certain < certain
Figure 2.8. An illustration of the general framework from Figure 1.9 (p. 27).
Now that you have concluded “H1: missing,” you repeat this process for the upper
hypotheses (i.e., H2: stolen, H’2: misplaced, and H”2: used in project), as will be discussed
in the next section.
[Figure content: Scenario: the truck entered the company, the canister was stolen from the locker, the canister was loaded into the truck, and the truck left with the canister. The subhypotheses “cesium-137 canister stolen from locker” and “cesium-137 canister loaded into truck” are each assessed as very likely, and the scenario’s assessment is obtained as their minimum.]
Figure 2.9. Another example of hypothesis-driven evidence collection and hypothesis testing.
The second hypothesized action in the scenario (i.e., “cesium-137 canister stolen from
locker”) is further decomposed into two hypotheses. The first one was already analyzed:
“It is very likely that the cesium-137 canister is missing from the warehouse.” The second
subhypothesis (“Warehouse locker was forced”) is supported both by Ralph’s testimony
(i.e., E005-Ralph in Table 2.4) and by the testimony of the professional locksmith Clyde, who was asked to examine it (E006-Clyde: Professional locksmith Clyde’s testimony that the lock has been forced, but it was a clumsy job).
After continuing the process for the remaining hypothesized actions in the scenario and
fusing all the discovered evidence, you and Disciple-EBR conclude that it is “very likely”
that the cesium-137 canister was stolen.
You repeat the same process for the other two competing hypotheses, “H’2: misplaced,”
and “H”2: used in project.” However, you find no evidence that the cesium-137 canister
might have been misplaced. Moreover, you find disfavoring evidence for the second com-
peting hypothesis: Grace, the Vice President for Operations at XYZ, tells us that no one at the
XYZ Company had checked out the canister for work on any project (E014-Grace).
Thus you conclude that the cesium-137 canister was stolen and you continue the analysis by investigating the next level up of competing hypotheses: “H3: stolen by terrorist organiza-
tion”; “H’3: stolen by competitor”; and “H”3: stolen by employee.” Of course, at any point,
the discovery of new evidence may lead you to refine your hypotheses, define new
hypotheses, or eliminate existing hypotheses.
This example is not as simple as this presentation may suggest. It is the
methodology that guides you and makes it look simple. Many things can and will indeed
go wrong. But the computational theory of evidence-based reasoning and Disciple-EBR
provide you with the means to deal with them. For example, based on evidence, you may come up with some hypotheses but then fail to find evidence supporting any of them. You then need to formulate other hypotheses, and you should always consider alternative hypotheses.
The deduction-based decomposition approach guides you on how to look for evidence,
but your knowledge and imagination also play a crucial role. As illustrated here, you
imagined a scenario where the cesium-137 canister was stolen with a truck. But let us now
assume that you did not find supporting evidence for this scenario. Should you conclude
that the cesium-137 canister was not stolen? No, because this was just one scenario. If you
can prove it, you have an assessment of your hypothesis. However, if you cannot prove it,
there still may be another scenario on how the cesium-137 canister might have been
stolen. Maybe the cesium-137 canister was stolen by someone working at the XYZ
Company. Maybe it was stolen by Ralph, the administrator of the warehouse. The import-
ant thing is that each such scenario opens a new line of investigation and a new way to
prove the hypothesis.
Having established that the cesium-137 canister was stolen, you would further like to
determine by whom and for what purpose. If it is for building and setting off a dirty bomb,
you would like to know who will do this; where exactly in the Washington, D.C., area will
the bomb be set off; precisely when this action will happen; what form of dirty bomb will
be used; and how powerful it will be. These are very hard questions that the computational
theory of evidence-based reasoning (as well as its current implementation in Disciple-
EBR) will help you answer.
One major challenge in performing such an analysis is the development of argumenta-
tion structures. An advantage of using an advanced tool, such as Disciple-EBR, is that it
can learn reasoning rules from the user to greatly facilitate and improve the analysis of
similar hypotheses, as will be shown in the next chapters of this book.
In conclusion, the computational theory of evidence-based reasoning presented in
this volume, as well as its current implementation in Disciple-EBR, provides a framework
for integrating the art and science of evidence-based reasoning, to cope with its aston-
ishing complexity.
More details about intelligence analysis are presented in our book Intelligence Analy-
sis as Discovery of Evidence, Hypotheses, and Arguments: Connecting the Dots (Tecuci
et al., 2016). Other examples of applications of evidence-based reasoning are presented
in the next section.
Consider now the detection of cyber insider threats, based on research performed at George Mason University. We will assume that we have a set of monitoring agents
that perform persistent surveillance of a computer network and host systems. They
include login monitors, file system monitors, internal network monitors, port monitors,
outside network monitors, and others. These monitoring agents are constantly looking for
indicators and warnings of insider missions.
Let us further assume that the evidence collection agents have detected a record inside
network logs involving an instance of denied access from the device with the Internet
Protocol address IP1 to the device with the address IP2, at time T. This is evidence E* at the
bottom of Figure 2.10. While this denied service access might be a normal event, such as an accidental access to a shared resource generated by legitimate browsing on a local host, it can also be an indication of an improper attempt to access a shared network resource.
Therefore, the question is: What insider missions might explain this observation?
By means of abductive reasoning, which shows that something is possibly true, the
analysis agent may formulate the chain of explanatory hypotheses from the left side of
Figure 2.10:
It is possible that the observed denied access from IP1 to IP2 is part of a sequence of
attempted network service accesses from IP1 to other IPs (hypothesis H11). It is further
possible that this is part of a network scan for files (hypothesis H21). It is possible that this
network scan is in fact a malicious attempt to discover network shared files (hypothesis
H31), part of malicious covert reconnaissance (hypothesis H41), which may itself be part
of a covert reconnaissance, collection, and exfiltration mission (hypothesis H51).
As one can notice, these hypotheses are very vague at this point. Moreover, for each of
these hypotheses there are alternative hypotheses, as shown in the right-hand side of
Figure 2.10. For example, the denied access may be part of a single isolated attempt
(hypothesis H12). However, even in the case where we have established that the denied
service access is part of a sequence of accesses (hypothesis H11), it is still possible that this
sequence of accesses is due to recent policy changes that affected the user’s access to
specific services or objects (hypothesis H22).
What the agent needs to do is to test each of these alternative hypotheses, starting
from bottom up, to make sure that, if an insider mission is actually being performed, it is
promptly detected. Each of the bottom-level alternative hypotheses (i.e., H11 and H12)
is put to work to guide the collection of relevant evidence (see Figure 2.11). The
discovered evidence may lead to the refinement of the hypotheses, including the
possible formulation of new hypotheses, and these refined hypotheses may lead to
new evidence. Next, the discovered evidence is used to assess the probabilities of the
bottom-level alternative hypotheses.
Assuming that the most likely hypothesis was determined to be H11, this process
continues with the next level up of alternative hypotheses (i.e., H21 and H22), using them
to collect evidence, and assessing which of them is most likely, as illustrated in Figure 2.12.
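The bottom-up testing of alternative hypotheses just described can be viewed, schematically, as a loop over the levels of the hypothesis chain. The sketch below is an assumed, simplified rendering of that process (the function names and data structures are illustrative, not the agent's actual interfaces): at each level, every alternative hypothesis guides evidence collection, the alternatives are assessed, and the most likely one is carried to the next level up.

```python
# Schematic sketch of bottom-up testing of alternative hypotheses.
# All names and structures are illustrative assumptions.
SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]

def assess_bottom_up(levels, collect_evidence, assess):
    """levels: list of lists of alternative hypotheses, bottom level first.
    collect_evidence(h): hypothesis in search of evidence.
    assess(h, evidence): returns a symbolic probability from SCALE."""
    accepted = []
    for alternatives in levels:
        assessed = []
        for h in alternatives:
            evidence = collect_evidence(h)          # put the hypothesis to work
            assessed.append((h, assess(h, evidence)))
        # carry the most likely alternative to the next level up
        accepted.append(max(assessed, key=lambda pair: SCALE.index(pair[1])))
    return accepted  # e.g., [("H11", "very likely"), ("H21", "very likely"), ...]
```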
Now, since H21 was assessed as being “very likely,” it is possible that H31 is true. But it is
also possible that H32 is true. The right-hand side of Figure 2.13 illustrates the process of
using the hypothesis H31 in order to guide the collection of evidence to test it:
Figure 2.12. Evidence collection and assessment for the next level up of alternative hypotheses.
If “H31: Non-account owner on IP1 scanned the network for shared files, between
T1 and T2” were true
Then the following subhypotheses would also be true:
“Non-account owner accessed an account on computer C1 between T1 and T2”
“Network scan for shared files from IP1, between T1 and T2”
To collect evidence for the first subhypothesis, we need to consider possible scenarios for
a non-account owner to access computer C1. The scenario illustrated in Figure 2.13 is a
physical access to C1 in the conference room CR1 where C1 is located. Another possible
scenario is a virtual access to C1.
As discussed in Section 2.2.4, such scenarios have enormous heuristic value in advan-
cing the investigation. In this case, for example, we are guided toward searching for
persons who were present in CR1 between T1 and T2. As indicated at the bottom of
Figure 2.13, there are several possible strategies to look for such persons:
Search for persons who entered CR1 before T2, based on door logs.
Search for persons who entered CR1 before T2, based on scheduled meetings participants.
Search for persons who entered CR1 before T2, based on outside surveillance video
camera VC1.
Notice that these are precise queries that can be answered very fast. Notice also that, in this
particular case, they involve all-source (non-computer) evidence. It is very important to be
able to use both computer and non-computer evidence to discover cyber insider threats.
This process will continue until the top-level hypothesis is assessed, as illustrated in
Figure 2.14.
[Figure 2.14: The fully assessed chain: from E* (log record of denied access to a network service from IP1 to IP2 at time T) through H11 (sequence of accesses to network services for several systems from IP1, between T1 and T2: very likely), H21 (network scan for shared files from IP1, between T1 and T2: very likely), H31 (non-account owner on IP1 scanned the network for shared files between T1 and T2: very likely), and H41 (P1, non-account owner on IP1, performed reconnaissance between T1 and T2: very likely), up to the top-level covert reconnaissance, collection, and exfiltration hypotheses, with the alternative H52 (P1, non-account owner on IP1, performed covert reconnaissance for remote vulnerabilities) receiving no support.]
E*i: There is evidence of road work at 1:17 am at the Al Batha highway junction.
➔ Ei: It is possible that there is indeed road work at the Al Batha highway junction.
➔ Ha: It is possible that the road work is for blocking the road.
➔ Hc: It is possible that there is ambush preparation at the Al Batha highway junction.
➔ Hk: It is possible that there is an ambush threat at the Al Batha highway junction.
Figure 2.15. Wide-area motion imagery (background image reprinted from www.turbophoto.com/
Free-Stock-Images/Images/Aerial%20City%20View.jpg).
If there were indeed an ambush threat, then there should be a good location for the ambush (Hb: Ambush location), and there should also be some observable ambush preparation activities (Hc: Ambush preparation). Further on, to be a good location for ambush requires
the corresponding route to be used by the U.S. forces (Hd: Blue route), and there should also
be cover at that location (He: Cover). This directly guides the analyst to check whether the U.S.
forces are using that route. It also guides the analyst to analyze images of that location for the
existence of cover. Having obtained the corresponding evidence, the analyst assesses its
support of the corresponding subhypotheses, and Disciple-EBR automatically aggregates
these assessments, concluding that it is almost certain that the location is good for ambush.
The rest of the analysis is developed in a similar way. The “Hc: Ambush preparation”
activity is automatically decomposed into three simpler activities: “Hf: Deployment,”
“Ha: Road blocking,” and “Hp: Move to cover.” Further on, “Hf: Deployment” is decom-
posed into “Hg: Vehicle deployment” and “Hm: Insurgent vehicle,” the last one being
further decomposed into two subhypotheses. All these simpler subhypotheses guide the
collection of corresponding relevant evidence which, in this example, is found and
evaluated, leading Disciple-EBR to infer that the top-level hypothesis is very likely.
Let us now consider that the analyst does not perform real-time analysis, but forensic
analysis. The ambush has already taken place (hence a third subhypothesis, “Hq: Ambush
execution,” of the top-level hypothesis), and the goal is to trace back the wide-area motion
[Figure content: the competing hypotheses Hk (ambush threat) and Hk1 (ambush deception) for the Al Batha highway junction, with the conclusion that it is very likely that there is an ambush threat to U.S. forces at the Al Batha highway junction.]
Figure 2.16. Another illustration of the general reasoning framework from Figure 1.9 (p. 27).
imagery in order to identify the participants together with the related locations and events.
The analyst and Disciple-EBR develop a similar analysis tree, as discussed previously,
which leads to the following hypotheses from the bottom of Figure 2.16: “Hn: Vehicle
departed from facility” and “Ho: Insurgent facility.” Thus, forensic analysis leads to the
discovery of the facility from which the insurgents have departed, identifying it as an
insurgent facility.
The same approach can also be used as a basis for the development of collaborative
autonomous agents engaged in persistent surveillance and interpretation of unconstrained
dynamic environments, continuously generating and testing hypotheses about the state of
the world. Consider, for example, the use of such agents in counterinsurgency operations
with the mission to automatically discover threat activities, such as IEDs, suicide bombers,
rocket launches, kidnappings, or ambushes. Discovery by sensor agents of road work at a
location that is often used by the U.S. forces leads to the hypothesis that there is an
ambush threat at that location. This hypothesis is then automatically decomposed into
simpler and simpler hypotheses, as discussed previously, guiding the agents to discover
additional evidence. Then the ambush threat hypothesis is automatically assessed and an
alert is issued if its probability is above a certain threshold.
H1: The left tree has lost its leaves because there is too much water at its root.
H2: The left tree has lost its leaves because it is older than the right tree.
H3: The left tree has lost its leaves because it is ill.
She then invited each student to pick one explanatory hypothesis, which led to several
groups: a “water” group, an “age” group, an “illness” group, and so on. She asked each
group to use the Inquirer assistant in order to plan and conduct a simple investigation to
test their preferred hypothesis.
For the next three weeks, science periods were set aside for each group to carry out
its investigation. Each group used the Inquirer assistant to conduct its investigation,
[Figure content: the alternative explanatory hypotheses H1 (the left tree has lost its leaves because there is too much water at its root), H2 (because it is older than the right tree), and H3 (because it is ill); the decomposition of H1 into H1a (there is too much water at the root of the left tree) and H1b (too much water at a tree’s root causes it to lose its leaves and die); the evidence E* that the left tree has lost its leaves; and the resulting assessments of H1a (very likely), H1b (almost certain), and H1 (very likely).]
Figure 2.17. Systematic inquiry with a cognitive assistant (based on NRC, 2000, pp. 5–11).
H1: The left tree has lost its leaves because there is too much water at its root.
The group’s members reasoned that, if this hypothesis were true, then two simpler
subhypotheses need to be true:
H1a: There is too much water at the root of the left tree.
H1b: Too much water at a tree’s root causes it to lose its leaves and die.
Therefore, they decomposed H1 into H1a and H1b by entering them into Inquirer (see the
middle part of Figure 2.17) and decided to assess them based on evidence. As a result,
Inquirer guided the students to look for both favoring and disfavoring evidence for each of
these two subhypotheses.
To collect relevant evidence for H1a, the students decided to look at the ground around
the two trees every hour that they could. They took turns on making individual observa-
tions, and since some of them lived near the school, their observations continued after
school hours and on weekends. Even though they missed some hourly observations, they
had sufficient data that they introduced into Inquirer as evidence E*1 favoring H1a, because
their observations confirmed the presence of excessive water at the root of the tree. As a
result, Inquirer extended the analysis of H1a from the middle of Figure 2.17 with the blue
tree shown in Figure 2.18, asking the students to assess the relevance of E1 (the event
indicated by the evidence E*1) with respect to H1a, as well as the believability of the
evidence E*1.
Inquirer reminded the students that relevance answers the question: So what? May E1
change my belief in the truthfulness of H1a? The students’ answer was: Assuming that E*1
is believable, it is very likely that there is too much water at the root of the left tree. They
did this by selecting one value from the following list of probabilistic assessments
displayed by Inquirer: {no support, likely, very likely, almost certain, certain}. They also
justified their choice with the fact that during their observations, the tree was standing in
water, which means that it is very likely that there is too much water there.
Inquirer also reminded them that believability answers the question: Can we believe
what E*1 is telling us? Here the students’ answer was: Believability of E*1 is almost certain,
since a few data points were missing and, on rare occasions, the left tree was not standing
in water.
Based on the students’ assessments, Inquirer determined the inferential force of E*1 on
H1a, as shown by the green reasoning tree in Figure 2.18: Based on E*1, it is very likely that
there is too much water at the root of the left tree. Inquirer explained to the students that inferential force answers the question: How strong is E*1 in favoring H1a? An item of evidence, such as E*1, will make us believe that the hypothesis H1a is true if and only if E*1 is both highly relevant and highly believable. Therefore, the inferential force of E*1 on H1a was computed as the minimum of the relevance of E1 (very likely) and the believability of E*1 (almost certain), which is very likely.
The students agreed that E*1 also supports the hypothesis “H1b: Too much water at a tree’s root causes it to lose its leaves and die.” They assessed the relevance of E1 as likely (because E1 is only one instance of this phenomenon), and the believability of E*1 as almost certain, leading Inquirer to assess the inferential force of E*1 on H1b as likely, which is the minimum of the two (see the left part of Figure 2.19).
One of the students recalled that several months ago the leaves on one of his mother’s
geraniums had begun to turn yellow. She told him that the geranium was getting too much
water. This item of information was represented in Inquirer as item of evidence E*2
favoring the hypothesis H1b. The students agreed to assess E2’s relevance as likely (because the geranium is a different type of plant) and E*2’s believability as very likely (because although the mother has experience with plants, she is not a professional), leading Inquirer to
compute E*2’s inferential force on H1b as likely (see the bottom-middle part of Figure 2.19).
Additionally, Mrs. Graham gave the group a pamphlet from a local nursery entitled Growing
Healthy Plants. The water group read the pamphlet and found that when plant roots are
surrounded by water, they cannot take in air from the space around the roots and they
essentially “drown.” This item of information was represented in Inquirer as item of
evidence E*3 favoring the hypothesis H1b. The students agreed to assess E3’s relevance as certain (since it, in fact, asserted the hypothesis) and the believability of E*3 as almost certain (because this information is from a highly credible expert), leading Inquirer to compute E*3’s inferential force on H1b as almost certain (see the bottom-right part of Figure 2.19). Additionally, Inquirer computed the inferential force of all the favoring evidence (i.e., E*1, E*2, and E*3) on H1b as almost certain, by taking their maximum. This is also the probability of H1b, because no disfavoring evidence was found. Had any disfavoring evidence been found, Inquirer would have needed to determine whether, on balance, the totality of evidence favors or disfavors H1b, and to what degree.
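Using the same min/max conventions introduced in Section 2.2, the aggregation that Inquirer performed for the water group can be reproduced as a short worked example. This is only an illustrative recomputation of the assessments described above; the helper names are assumptions.

```python
# Worked recomputation of the water group's analysis (illustrative only).
SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]
weakest = lambda *ps: min(ps, key=SCALE.index)
strongest = lambda *ps: max(ps, key=SCALE.index)

# Inferential force of each favoring item on H1b = min(relevance, believability).
f1 = weakest("likely", "almost certain")   # E*1 -> "likely"
f2 = weakest("likely", "very likely")      # E*2 -> "likely"
f3 = weakest("certain", "almost certain")  # E*3 -> "almost certain"

h1b = strongest(f1, f2, f3)                # all favoring evidence -> "almost certain"
h1a = "very likely"                        # assessed earlier from E*1
h1 = weakest(h1a, h1b)                     # top-level H1 -> "very likely"
print(h1b, h1)
```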
Having assessed the probability of H1a as very likely and that of H1b as almost certain,
the students and Inquirer inferred that the probability of their top-level hypothesis H1 is
the minimum of the two because both are required to infer H1 (see the top-right part of
Figure 2.17). Finally, Inquirer automatically generated a report describing the analysis
logic, citing sources of data used and the manner in which the analysis was performed.
The report was further edited by the water group before being presented to the class,
together with the reports of the other teams.
As different groups presented and compared their analyses, the class learned that some
evidence – such as that from the group investigating whether the trees were different – did
not explain the observations. The results of other investigations, such as the idea that the
trees could have a disease, partly supported the observations. But the explanation that
seemed most reasonable to the students, that fit all the observations and conformed with
what they had learned from other sources, was “too much water.” After their three weeks
of work, the class was satisfied that together the students had found a reasonable answer
to their question.
The Inquirer cognitive assistant illustrates a computational view of the inquiry process
as ceaseless discovery of evidence, hypotheses, and arguments, through evidence in
search of hypotheses, hypotheses in search of evidence, and evidentiary testing of hypoth-
eses. This gives students hands-on experience with this process, as illustrated in the
previous section and described in a more general way in Section 1.4.2. The hypothetical
Inquirer cognitive assistant incorporates this general model of inquiry, together with a
significant amount of knowledge about the properties, uses, discovery, and marshaling of
evidence. This allows it to be used as a general tool supporting the learning of those
inquiry-based practices that all sciences share, as advocated by the National Research
Council (NRC) framework for K–12 science education (NRC, 2011). This can be done
through a sequence of hands-on exercises in biology, chemistry, or physics, allowing the
students to experience the same inquiry-based scientific practices in multiple domains.
For example, in one exercise, Inquirer will teach the students how a complex hypothesis
is decomposed into simpler hypotheses, and how the assessments of the simpler hypoth-
eses are combined into the assessment of the top-level hypothesis, as was illustrated
in Figure 2.17.
In another exercise, Inquirer will teach the students how to assess the relevance and
the believability of evidence. This case study will provide both a decomposition tree (like
the one in the middle of Figure 2.17, with the elementary hypotheses H1a and H1b) and a
set of items of information, some relevant to the considered hypotheses and some
irrelevant. The students will be asked to determine which item of information is
relevant to which elementary hypothesis, and whether it is favoring or disfavoring
evidence. They will also be asked to assess and justify the relevance and the believability
of each item of evidence. After completing their analysis, the teacher and Inquirer
will provide additional information, asking the students to update their analysis in the
light of the new evidence. Finally, the students will present, compare, and debate their
analyses in class.
In yet another exercise, Inquirer will provide an analysis tree like the one from the
middle of Figure 2.17 but no items of information, asking the students to look for relevant
evidence (e.g., by searching the Internet or by performing various experiments) and to
complete the analysis.
In a more complex exercise, Inquirer will present a scenario with some unusual
characteristics, like the two trees previously discussed. The students will be asked to
formulate competing hypotheses that may explain the surprising observations, use the
formulated hypotheses to collect evidence, and use the evidence to assess each hypoth-
esis. Then they will compare their analyses of the competing hypotheses in terms of the
evidence used and assessments made, and will select the most likely hypothesis. Inquirer
will assist the students in this process by guiding them in decomposing hypotheses,
searching for evidence, assessing the elementary hypotheses, combining the assessments
of the simpler hypotheses, comparing the analyses of the competing hypotheses, and
producing an analysis report.
Inquirer can also provide many opportunities for collaborative work, in addition to
those that we have illustrated. For example, a complex hypothesis will be decomposed
into simpler hypotheses, each assessed by a different student. Then the results obtained
by different students will be combined to produce the assessment of the complex
hypothesis. Or different students will analyze the same hypothesis. Then they will
compare and debate their analysis and evidence and work together toward producing
a consensus analysis.
Consistent with the writers of the National Science Education Standards, who “treated
inquiry as both a learning goal and as a teaching method” (NRC 2000, p. 18), Inquirer is
envisioned as both a teaching tool for teachers and as a learning assistant for students.
For example, the teacher will demonstrate some of these exercises in class. Other exercises
will be performed by the students, under the guidance of the teacher and with the
assistance of Inquirer.
The use of the various modules of Disciple-EBR will be introduced with the help of case
studies with associated instructions that will provide detailed guidance. We will illustrate
this process by running the case study stored in the knowledge base called “01-Browse-
Argumentation.” This case study concerns the hypothesis “The cesium-137 canister is
missing from the XYZ warehouse,” which is part of the analysis example discussed in
Section 2.2.1. This case study has three objectives:
Figure 2.20 shows an example of an argumentation or reasoning tree in the interface of the
Reasoner. The left panel shows an abstract view of the entire reasoning tree. It consists of
brief names for the main hypotheses and their assessments, if determined. You can expand
(or show the decomposition of) a hypothesis by clicking on it or on the plus sign (+) on its
left. To collapse a decomposition, click on the minus sign (–). You can expand or
collapse the entire tree under a hypothesis by right-clicking on it and selecting Expand or
Collapse.
When you click on a hypothesis in the left panel, the right panel shows the detailed
description of the reasoning step abstracted in the left panel (see Figure 2.20). If you click
on [HIDE SOLUTIONS] at the top of the window, the agent will no longer display the
solutions/assessments of the hypotheses in the right panel. You may show the solutions
again by clicking on [SHOW SOLUTIONS].
The detailed argumentation from the right panel may be a single decomposition or a
deeper tree. In both cases, the leaves of the tree in the right panel are the detailed
descriptions of the subhypotheses of the hypothesis selected in the left panel (see the
arrows in Figure 2.20 that indicate these correspondences).
A detailed description shows the entire name of a hypothesis. It also shows the
question/answer pair that justifies the decomposition of the hypothesis into subhypoth-
eses. The detailed descriptions with solutions also show the synthesis functions that were
used to obtain those solutions from children solutions in the tree. These functions will be
discussed in Section 4.3.
Notice that, in the actual interface of Disciple-EBR, some of the words appear in bright
blue while others appear in dark blue (this distinction is not clearly shown in this book).
The bright blue words are names of specific entities or instances, such as “cesium-137
canister.” The dark blue words correspond to more general notions or concepts, such as
“evidence.” If you click on such a (blue) word in the right panel, the agent automatically
switches to the Description module and displays its description. To view the reasoning tree
again, just click on Reasoner on the top of the window.
Figure 2.21 shows another part of the reasoning tree from Figure 2.20 to be browsed
in this case study. Notice that some of the solutions in the left panel have a yellow
background. This indicates that they are assessments or assumptions made by the user,
as will be discussed in Sections 4.4 and 4.9.
In this case study, you will practice the aforementioned operations. You will first select
the hypothesis “The cesium-137 canister is missing from the XYZ warehouse.” Then you will
browse its analysis tree to see how it is decomposed into simpler hypotheses and how the
assessments of these simpler hypotheses are composed. You will visualize both detailed and abstract descriptions of these decomposition and synthesis operations, including an abstract view of the entire tree. Then you will visualize the descriptions of the
concepts and instances to which the analysis tree refers. Start by following the instructions
described in Operation 2.1 and illustrated in Figure 2.22.
This case study has also illustrated the following basic operations:
Figure 2.21. Views of another part of the reasoning tree from Figure 2.20.
To browse the entire reasoning tree, step by step, click on the hypotheses in the left
panel, or one of the + and – signs preceding them.
To expand or collapse the entire subtree of a hypothesis, right-click on it and select the
corresponding action.
To view the detailed description of an abstract decomposition of a hypothesis in the left
panel, click on the hypothesis, and the detailed decomposition will be displayed in the
right panel.
To browse a detailed reasoning tree in the right panel click on the + and – signs.
Close all the workspaces open on the current knowledge bases (case studies) by
clicking on the minus sign (–) to the right of the knowledge base icon containing the
plus sign (+).
Close the opened knowledge bases corresponding to the case study by following the
instructions from Operation 3.3.
This is the first of a sequence of assignments in which you will develop a knowledge-based
agent for analyzing hypotheses in a domain of interest to you. Since you will develop the
knowledge base of the agent, you will have to consider a familiar domain. Select a domain
and illustrate the process of evidence-based reasoning with a diagram such as those in
Figures 2.8, 2.14, 2.26, or 2.27. That is, specify an item of evidence, an abductive reasoning
chain from that evidence to a hypothesis of interest, alternative hypotheses to those from
the reasoning chain, and the analysis of one of these hypotheses.
2.1. Consider the intelligence analysis problem from Section 2.2.1. What might constitute
an item of evidence for the hypothesis that the cesium-137 canister was misplaced?
2.2. What might be an alternative hypothesis for “H5: A dirty bomb will be set off in the
Washington, D.C., area”?
2.3. A terrorist incident occurred two weeks ago in an American city involving consider-
able destruction and some loss of life. After an investigation, two foreign terrorist
groups have been identified as possible initiators of this terrorist action: Group
A and Group B. What are some hypotheses we could entertain about this event?
2.4. Consider the hypothesis that Group A from Country W was involved in a recent
terrorist incident in an American city. What evidence might we find concerning this
hypothesis?
2.7. Consider the hypothesis that the leadership of Country A is planning an armed
conflict against Country B. You have just obtained a report that says that there has
2.8. Defendant Dave is accused of shooting a victim, Vic. When Dave was arrested sometime after the shooting, he was carrying a 32 caliber Colt automatic pistol.
Let H be the hypothesis that it was Dave who shot Vic. A witness named Frank
appears and says he saw Dave fire a pistol at the scene of the crime when it
occurred; that’s all Frank can tell us. Construct a simple chain of reasoning that
connects Frank’s report to the hypothesis that it was Dave who shot Vic.
2.9. Consider the situation from Question 2.8. The chain of reasoning that connects
Frank’s report to the hypothesis that it was Dave who shot Vic shows only the
possibility of this hypothesis being true. What are some alternative hypotheses?
2.10. Consider again the situation from Questions 2.8 and 2.9. In order to prove the
hypothesis that it was Dave who shot Vic, we need additional evidence. As dis-
cussed in Section 2.2.2, we need to put this hypothesis to work to guide us in
collecting new evidence. Decompose this hypothesis into simpler hypotheses,
as was illustrated by the blue trees in Figures 2.8 and 2.9, in order to discover
new evidence.
2.11. Our investigation described in Questions 2.8, 2.9, and 2.10 has led to the discovery
of additional evidence. By itself, each evidence item is hardly conclusive that Dave
was the one who shot Vic. Someone else might have been using Dave’s Colt
automatic. But Frank’s testimony, along with the fact that Dave was carrying his
weapon and with the ballistics evidence, puts additional heat on Dave. Analyze the
hypothesis that it was Dave who shot Vic, based on all these items of evidence, as
was illustrated by the green trees in Figures 2.8 and 2.9. In Chapter 4, we will
discuss more rigorous methods for making such probabilistic assessments. In this
exercise, just use your common sense.
2.12. A car bomb was set off in front of a power substation in Washington, D.C., on
November 25. The building was damaged but, fortunately, no one was injured.
From the car’s identification plate, which survived, it was learned that the car
belonged to Budget Car Rental Agency. From information provided by Budget, it
was learned that the car was last rented on November 24 by a man named
M. Construct an argument from this evidence to the hypothesis that Person
M was involved in this car-bombing incident.
2.13. Consider again the situation from Question 2.12, and suppose that we have deter-
mined that the evidence that M rented a car on November 24 is believable. We
want now to assess whether M drove the car on November 25. For this, we need
additional evidence. As discussed in Section 2.2.2, we need to put this hypothesis to
work to guide us in collecting new evidence. Decompose this hypothesis into
simpler hypotheses, as was illustrated by the blue trees in Figures 2.8 and 2.9, in
order to discover new evidence.
3 Methodologies and Tools for Agent Design and Development
Table 3.2 Problem to Be Solved with a Knowledge-based Agent (from Buchanan et al.,
p. 132).
The director of the Oak Ridge National Lab (ORNL) faces a problem. Environmental Protection
Agency (EPA) regulations forbid the discharge of quantities of oil or hazardous chemicals into or
upon waters of the United States when this discharge violates specified quality standards. ORNL
has approximately two thousand buildings on a two-hundred-square-mile government
reservation, with ninety-three discharge sites entering White Oak Creek. Oil and hazardous
chemicals are stored and used extensively at ORNL. The problem is to detect, monitor,
and contain spills of these materials, and this problem may be solved with a
knowledge-based agent.
Table 3.3 Specification of the Actual Problem to Be Solved (from Buchanan et al., p. 133).
When an accidental inland spill of an oil or chemical occurs, an emergency situation may exist,
depending on the properties and the quantity of the substance released, the location of the
substance, and whether or not the substance enters a body of water.
The observer of a spill should:
1. Characterize the spill and the probable hazards.
2. Contain the spill material.
3. Locate the source of the spill and stop any further release.
4. Notify the Department of Environmental Management.
What issues may concern the subject matter expert? First of all, the expert may be concerned that once his or her expertise is represented in the agent, the organization may no longer need him or her because the job can be performed by the agent. Replacing human experts was a bad and generally inaccurate way of promoting expert systems. Usually, knowledge-based agents and even expert systems are used by experts in order to solve problems from their areas of expertise better and more efficiently. They are also used by people who need the expertise but do not have access to a human expert, or for whom a human expert would be too expensive.
What are some examples of knowledge-based agents? Think, for instance, of any tax-
preparation software. Is it a knowledge-based agent? What about the software systems that
help us with various legal problems, such as creating a will? They all are based on large
amounts of subject matter expertise that are represented in their knowledge bases.
Once the subject matter expert is identified and agrees to work on this project, the
knowledge engineer and the expert have a series of meetings to better define the actual
problem to be solved, which is shown in Table 3.3.
The knowledge engineer has many meetings with the subject matter expert to elicit his
or her knowledge on how to solve the specified problem. There are several knowledge
elicitation methods that can be employed, as discussed later in Section 6.3. Table 3.4
illustrates the unstructured interview, where the questions of the knowledge engineer and
the responses of the expert are open-ended.
KE: Suppose you were told that a spill had been detected in White Oak Creek one mile
before it enters White Oak Lake. What would you do to contain the spill?
SME: That depends on a number of factors. I would need to find the source in order to
prevent the possibility of further contamination, probably by checking drains and manholes for
signs of the spill material. And it helps to know what the spilled material is.
KE: How can you tell what it is?
SME: Sometimes you can tell what the substance is by its smell. Sometimes you can
tell by its color, but that's not always reliable since dyes are used a lot nowadays. Oil,
however, floats on the surface and forms a silvery film, while acids dissolve completely in
the water. Once you discover the type of material spilled, you can eliminate any
building that either doesn’t store the material at all or doesn’t store enough of it to account
for the spill.
Table 3.5 Identification of the Basic Concepts and Features Employed by the
Subject Matter Expert
[Table content: the same dialogue as in Table 3.4, with the basic concepts and features employed by the expert highlighted, such as spill, source, location (drain, manhole), material/substance (oil, acid), odor, color, appearance (silvery film), and building.]
[Figure: A fragment of the ontology identified from the dialogue. Concepts such as building, source, spill, location, substance, odor, appearance, and feature are organized under object by subconcept-of relations; for example, drain and manhole are subconcepts of location, and water, acid, and oil are subconcepts of substance (with diesel oil, gasoline, sulfuric acid, and acetic acid below them), while appearance includes silver film and no film. Instances such as building 3023, building 3024, spill-1, d6-1, d6-2, m6-1, and m6-2 are linked to their concepts by instance-of relations.]
The features are also represented hierarchically, as shown in Figure 3.2. Ontologies are
discussed in detail in Chapter 5.
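To make the idea of such an ontology fragment concrete, the following Python sketch encodes a few of the concepts and relations above as simple dictionaries, with a transitive lookup along subconcept-of links. This is an illustration under assumed names, not the representation actually used by any particular knowledge engineering tool.

```python
# Illustrative encoding of a small part of the spill ontology (assumed names).
SUBCONCEPT_OF = {
    "building": "object", "location": "object", "substance": "object",
    "drain": "location", "manhole": "location",
    "water": "substance", "acid": "substance", "oil": "substance",
    "sulfuric acid": "acid", "acetic acid": "acid",
    "diesel oil": "oil", "gasoline": "oil",
}
INSTANCE_OF = {"d6-1": "drain", "m6-1": "manhole", "building 3023": "building"}

def is_kind_of(concept, ancestor):
    """True if concept equals ancestor or is a direct or indirect subconcept of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCONCEPT_OF.get(concept)
    return False

print(is_kind_of("gasoline", "substance"))          # True: gasoline -> oil -> substance
print(is_kind_of(INSTANCE_OF["d6-1"], "location"))  # True: d6-1 is a drain, a location
```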
SME: Sometimes you can tell what the substance is by its smell. Sometimes you can tell by its color,
but that’s not always reliable since dyes are used a lot nowadays. Oil, however, floats on the
surface and forms a silvery film, while acid dissolves completely in the water.
IF the spill . . .
THEN the substance of the spill is oil
IF the spill . . .
THEN the substance of the spill is acid
Table 3.7 Iterative Process of Rules Development and Refinement (Based on Buchanan et al.,
p. 138)
KE: Here are some rules I think capture your explanation about determining the substance of the
spill. What do you think?
IF the spill does not dissolve in water
and the spill forms a silvery film
THEN the substance of the spill is oil
IF the spill dissolves in water
and the spill does not form a film
THEN the substance of the spill is acid
SME: Uh-huh (long pause). Yes, that begins to capture it. Of course, if the substance is silver nitrate,
it will dissolve only partially in water.
KE: I see. Rather than talking about a substance dissolving or not dissolving in water, we should talk
about its solubility, which we may consider as being high, moderate, or low. Let’s add that
information to the knowledge base and see what it looks like.
IF the solubility of the spill is low
and the spill forms a silvery film
THEN the substance of the spill is oil
IF the solubility of the spill is moderate
THEN the substance of the spill is silver-nitrate
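To show how such rules could be operationalized, here is a hedged Python sketch of the refined rules from Table 3.7, with the acid rule completed in the same style (high solubility and no film); the attribute names and the third rule are assumptions made for illustration, not the shell's actual rule language.

```python
# Illustrative encoding of the refined spill-identification rules (assumed format).
def identify_substance(spill):
    """spill: dict with 'solubility' in {'low', 'moderate', 'high'} and
    'film' in {'silvery film', 'no film'}."""
    if spill["solubility"] == "low" and spill["film"] == "silvery film":
        return "oil"
    if spill["solubility"] == "moderate":
        return "silver-nitrate"
    if spill["solubility"] == "high" and spill["film"] == "no film":
        return "acid"          # assumed completion of the acid rule
    return None                # no rule applies; more knowledge elicitation needed

print(identify_substance({"solubility": "low", "film": "silvery film"}))  # oil
```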
Testing of the agent involves three types of activity: verification, validation, and certifi-
cation (O’Keefe et al., 1987; Awad, 1996).
In essence, verification attempts to answer the question: Are we building the agent right?
Its goal is to test the consistency and the completeness of the agent with respect to its initial
specification. For example, in the case of a rule-based agent, one would check the rules to
identify various types of errors, such as the existence of rules that are redundant, conflicting,
subsumed, circular, dead-end, missing, unreachable, or with unnecessary IF conditions.
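As a small illustration of one such verification check, the sketch below detects rules that are subsumed by a more general rule reaching the same conclusion (a rule whose conditions are a superset of another rule's conditions adds nothing). The rule representation is an assumption made for this example, not the format of any particular shell.

```python
# Illustrative subsumption check over IF-THEN rules (assumed representation).
def subsumed_rules(rules):
    """rules: list of (conditions, conclusion) with conditions a frozenset of strings.
    Returns pairs (i, j) where rule i is subsumed by the more general rule j."""
    found = []
    for i, (cond_i, concl_i) in enumerate(rules):
        for j, (cond_j, concl_j) in enumerate(rules):
            if i != j and concl_i == concl_j and cond_j < cond_i:
                found.append((i, j))   # cond_j is a proper subset of cond_i
    return found

rules = [
    (frozenset({"solubility is low", "forms silvery film"}), "substance is oil"),
    (frozenset({"solubility is low"}), "substance is oil"),
]
print(subsumed_rules(rules))  # [(0, 1)]: the first rule is redundant
```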
Validation, on the other hand, attempts to answer the question: Are we building the
right agent? In essence, this activity checks whether the agent meets the user’s needs and
requirements.
Finally, certification is a written guarantee that the agent complies with its specified
requirements and is acceptable for operational use.
Various types of tools can be used to develop a knowledge-based agent. We will briefly
discuss three different types: expert system shells, learning agent shells, and learning agent
shells for evidence-based reasoning. We will also discuss the reuse of knowledge in the
development of a knowledge-based agent.
OPS (Cooper and Wogrin, 1988), which has a general rule engine
CLIPS (Giarratano and Riley, 1994), which also has a general rule engine
CYC (Lenat, 1995; CYC, 2008; 2016), a very large knowledge base with ontologies
covering many domains, and with several rule engines
EXPECT (Gil and Paris, 1995; EXPECT, 2015), a shell that enables the acquisition of
problem-solving knowledge both from knowledge engineers and from end-users
JESS (Friedman-Hill, 2003; JESS, 2016), which is a version of CLIPS with a Java-based
rule engine
CommonKADS (Schreiber et al., 2000), which is a general methodology with support-
ing tools for the development of knowledge-based systems
Jena, which is a toolkit for developing applications for the Semantic Web (Jena, 2012)
Pellet, an ontology (OWL2) reasoner that can be used to develop knowledge-based
applications for the Semantic Web (Pellet, 2012)
Protégé (Musen, 1989; Protégé, 2015), an ontology editor and knowledge base frame-
work, also used to develop knowledge-based applications for the Semantic Web
TopBraid Composer (Allemang and Hendler, 2011; TopBraid Composer, 2012), an ontology development tool for Semantic Web applications
At the power end of the spectrum are shells that employ much more specific problem-
solving methods, such as the propose-and-revise design method used in SALT (Marcus, 1988)
to design elevators. The knowledge for such a system can be elicited by simply filling in
forms, which are then automatically converted into rules. Thus the shell provides signifi-
cant assistance in building the system, but the type of systems for which it can be used is
much more limited.
In between these two types of shells are the shells applicable to a certain type of
problems (such as planning, or diagnosis, or design). A representative example is EMYCIN
(van Melle et al., 1981), a general rule-based shell for diagnosis problems, derived from the MYCIN medical diagnosis system.
A learning agent shell includes a general problem-solving engine, a learning engine, and a knowledge base structured into an ontology and a set of rules (see Figure 3.3). Building a knowledge-
based agent for a specific application consists of customizing the shell for that application
and developing the knowledge base. The learning engine facilitates the building of the
knowledge base by subject matter experts and knowledge engineers.
Examples of learning agent shells are Disciple-LAS (Tecuci et al., 1999), Disciple-COA
(Tecuci et al., 2001), and Disciple-COG/RKF (Tecuci et al., 2005b), the last two being
presented in Section 12.4.
[Figure 3.3: The architecture of a learning agent shell: an interface, a problem-solving engine, and a learning engine, all operating on a knowledge base consisting of an ontology plus rules.]
[Figure content: an interface, a problem-solving engine, and a learning engine operating on a knowledge base organized as an EBR KB with several domain KBs and associated scenario KBs; each KB consists of an ontology plus rules.]
Figure 3.4. The overall architecture of a learning agent shell for evidence-based reasoning.
[Figure: The life cycle of a Disciple agent: (1) shell customization by the Disciple developer and the knowledge engineer; agent teaching, supported by the Disciple learning agent shell’s multistrategy learning, ontology development, mixed-initiative interaction, tutoring, and knowledge base management modules operating on the EBR KB, the domain KBs, and the scenario KBs; use of the resulting specialized agent, with its problem-solving and evidence-specific modules, by end-users; and (4) field use.]
The development of a Disciple agent starts with the shell customization stage, where, based on the specification of the type of problems to be solved and the agent to be
built, the developer and the knowledge engineer may decide that some extensions of the
Disciple shell may be necessary or useful. It is through such successive extensions during
the development of Disciple agents for various applications that the current version of the
Disciple shell for evidence-based reasoning problems (which includes the EBR knowledge
base) has emerged.
The next stage is agent teaching by the subject matter expert and the knowledge
engineer, supported by the agent itself, which simplifies and speeds up the knowledge
base development process (Tecuci et al., 2001; 2002b; 2005b). Once an operational agent is
developed, it is used for the education and training of the end-users, possibly in a
classroom environment.
The fourth stage is field use, where copies of the developed agent support users in their
operational environments. During this stage, an agent assists its user both in solving
problems and in collaborating with other users and their cognitive assistants. At the same
time, it continuously learns from this problem-solving experience by employing a form of
nondisruptive learning. In essence, it learns new rules from examples. However, because
there is no learning assistance from the user, the learned rules will not include a formal
applicability condition. It is during the next stage of after action review, when the user and
the agent analyze past problem-solving episodes, that the formal applicability conditions
are learned based on the accumulated examples.
In time, each cognitive assistant extends its knowledge with expertise acquired from its
user. This results in different agents and creates the opportunity to develop a more
competent agent by integrating the knowledge of all these agents. This can be accom-
plished by a knowledge engineer, with assistance from a subject matter expert, during the
next stage of knowledge integration. The result is an improved agent that may be used in a
new iteration of a spiral process of development and use.
Table 3.8 Some Relevant Questions to Consider When Assessing a Potential PhD Advisor
(1) What is the reputation of the PhD advisor within the professional community at large?
(2) Does the advisor have many publications?
(3) Is his or her work cited?
(4) What is the opinion of the peers of this PhD advisor?
(5) What do the students think about this PhD advisor?
(6) Is the PhD advisor likely to remain on the faculty for the duration of your degree program?
(7) What is the placement record of the students of this PhD advisor? Where do they get jobs?
(8) Is the PhD advisor expert in your areas of interest?
(9) Does the PhD advisor publish with students?
(10) Does the PhD advisor have a research group or merely a string of individual students?
(11) Is the PhD advisor’s research work funded?
An analysis of the questions in Table 3.8 shows that some of them point to necessary
conditions that need to be satisfied by the PhD advisor, while others refer to various
desirable qualities. Which questions from Table 3.8 point to necessary conditions? The
answers to questions (6) and (8) need to be “yes” in order to further consider a potential
PhD advisor.
Now let us consider the desirable qualities of a PhD advisor revealed by the other
questions in Table 3.8. Some of these qualities seem to be more closely related than
others. It would be useful to organize them in classes of quality criteria. Could you identify
a class of related criteria? Questions (2), (3), (4), and (11) all characterize aspects of the
professional reputation of the advisor.
What might be other classes of related criteria suggested by the questions in Table 3.8?
Questions (7) and (9) characterize the results of the students of the PhD advisor, while
questions (5) and (10) characterize their learning experience.
[Figure: an ontology fragment of quality criteria, in which specific criteria are instances of the concept criterion and are related through the has as criterion feature.]
They express the hypothesis in natural language and select the phrases that may be different for other similar hypotheses, such as the names of the advisor and student. The selected phrases will appear in blue, guiding the agent to learn a general hypothesis pattern in which they are replaced with variables.
This top-level hypothesis will be successively reduced to simpler and simpler hypotheses, guided by questions and answers, as shown in Figure 3.8 and discussed in this section.
[Figure 3.8: the top-level hypothesis is reduced, guided by questions and answers, to three subhypotheses: "Bob Sharp is interested in an area of expertise of John Doe" (assessed as certain because Bob Sharp is interested in Artificial Intelligence), "John Doe will stay on the faculty of George Mason University for the duration of the PhD dissertation of Bob Sharp," and "John Doe would be a good PhD advisor with respect to the PhD advisor quality criterion," the last being further reduced to the professional reputation, quality of student results, and student learning experience criteria.]
The reductions of the subhypotheses continue in the same way, until solutions are obtained for them:
John Doe would be a good PhD advisor with respect to the PhD advisor quality criterion.
Which are the necessary quality criteria for a good PhD advisor?
professional reputation criterion, quality of student results criterion, and student learning
experience criterion.
John Doe would be a good PhD advisor with respect to the professional reputation
criterion.
John Doe would be a good PhD advisor with respect to the quality of student results
criterion.
John Doe would be a good PhD advisor with respect to the student learning experience
criterion.
Each of these subhypotheses can now be reduced to simpler hypotheses, each corres-
ponding to one of the elementary criteria from the right side of Figure 3.7 (e.g.,
research funding criterion). Since each of these reductions reduces a criterion to a
subcriterion, the agent could be asked to learn a general reduction pattern, as shown
in Figure 3.9.
Why is pattern learning useful? One reason is that the pattern can be applied
to reduce a criterion to its subcriteria, as shown in Figure 3.10. Additionally, as will
be illustrated later, the pattern will evolve into a rule that will automatically generate all
the reductions of criteria to their subcriteria.
[Figures 3.9 and 3.10: the learned reduction pattern "?O1 would be a good PhD advisor with respect to the ?O2," guided by the question "Which is a ?O2?" and its answer "?O3," reduces to "?O1 would be a good PhD advisor with respect to the ?O3"; its instantiations reduce, for example, the quality of student results criterion to the publications with advisor criterion and to the employers of graduates criterion.]
If, instead of learning a pattern and applying it, the user would manually define these reductions, then any syntactic differences between these reductions would lead to the learning of different rules. These rules would only be superficially different, leading to an inefficient and difficult-to-maintain agent.
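To make this concrete, the following Python sketch (illustrative only; it is not Disciple-EBR code, and the pattern text and bindings are taken from the preceding example) shows how a single learned reduction pattern, instantiated with user-selected variable bindings, generates syntactically uniform reductions:

```python
# Minimal sketch of pattern instantiation (illustrative only, not Disciple-EBR code).
# A learned reduction pattern reduces a criterion hypothesis to a subcriterion hypothesis.
PATTERN = {
    "hypothesis": "?O1 would be a good PhD advisor with respect to the ?O2.",
    "question": "Which is a ?O2?",
    "answer": "?O3",
    "subhypothesis": "?O1 would be a good PhD advisor with respect to the ?O3.",
}

def instantiate(template: str, bindings: dict) -> str:
    """Replace each pattern variable (?O1, ?O2, ...) with its bound value."""
    for variable, value in bindings.items():
        template = template.replace(variable, value)
    return template

bindings = {
    "?O1": "John Doe",
    "?O2": "quality of student results criterion",
    "?O3": "publications with advisor criterion",
}

for part in ("hypothesis", "question", "answer", "subhypothesis"):
    print(instantiate(PATTERN[part], bindings))
```

Because every instantiation is generated from the same template, the resulting reductions are syntactically identical up to the selected instances, which is what later allows a single rule to be learned from them.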
After the top-level criterion (i.e., PhD advisor quality criterion) is reduced to a set of
elementary criteria, specific knowledge and evidence about the advisor need to be used
to evaluate John Doe with respect to each such elementary criterion. For example, the
following hypothesis will be evaluated based on favoring and disfavoring evidence from
John Doe’s peers:
John Doe would be a good PhD advisor with respect to the peer opinion criterion.
A learning agent shell for evidence-based reasoning already knows how to assess such
hypotheses based on evidence.
Through this process, the initial hypothesis is reduced to elementary hypotheses for
which assessments are made. Then these assessments are successively combined, from the bottom up, until the assessment of the initial hypothesis is obtained, as illustrated in Figure 3.11.
Notice at the bottom-right side of Figure 3.11 the assessments corresponding to the
subcriteria of the quality of student results criterion:
It is likely that John Doe would be a good PhD advisor with respect to the publications
with advisor criterion.
It is very likely that John Doe would be a good PhD advisor with respect to the employers
of graduates criterion.
It is very likely that John Doe would be a good PhD advisor with respect to the quality of
student results criterion.
[Figure 3.11: the reduction and synthesis tree for the specific hypothesis. The three top-level subhypotheses are assessed as certain, almost certain, and very likely, respectively, and are combined through a minimum function because they correspond to necessary conditions. Similarly, the three quality criteria are combined through a minimum function, while the assessments of the publications with advisor criterion (likely) and of the employers of graduates criterion (very likely) are combined through a maximum function into the assessment of the quality of student results criterion.]
Figure 3.11. Reduction and synthesis tree for assessing a specific hypothesis.
Then this assessment is combined with the assessments corresponding to the other major
criteria (very likely for the professional reputation criterion, and almost certain for the student
learning experience criterion), through a minimum function (because they are necessary
conditions), to obtain the assessment very likely for the PhD advisor quality criterion.
Finally, consider the assessments of the three subhypotheses of the top-level hypothesis: certain, almost certain, and very likely, respectively. These assessments are combined by taking their minimum, leading to the following assessment of the initial hypothesis:
It is very likely that John Doe would be a good PhD advisor for Bob Sharp.
Could you justify the preceding solution synthesis function? We used minimum because
each of the three subhypotheses of the initial hypothesis corresponds to a necessary
condition. If any of them has a low probability, we would like this to be reflected in the
overall evaluation.
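The following Python sketch (illustrative only; it simply mirrors the minimum and maximum synthesis functions described in this section, over an ordered symbolic probability scale consistent with the values used here) reproduces the preceding computation:

```python
# Ordered symbolic probability scale, consistent with the values used in this chapter.
SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]
RANK = {value: index for index, value in enumerate(SCALE)}

def sym_min(*assessments):
    """Synthesis for necessary conditions: the overall assessment is the weakest one."""
    return min(assessments, key=RANK.__getitem__)

def sym_max(*assessments):
    """Synthesis for alternative solutions: the overall assessment is the strongest one."""
    return max(assessments, key=RANK.__getitem__)

# Quality of student results criterion: max over its two subcriteria.
student_results = sym_max("likely", "very likely")                            # very likely
# PhD advisor quality criterion: min over the three necessary quality criteria.
advisor_quality = sym_min("very likely", student_results, "almost certain")   # very likely
# Top-level hypothesis: min over its three necessary subhypotheses.
print(sym_min("certain", "almost certain", advisor_quality))                  # very likely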
Notice that, at this point, the knowledge engineer and the subject matter expert have
completely modeled the assessment of the specific hypothesis considered. This is the most
creative and the most challenging part of developing the agent. Once such a model for
assessing hypotheses (or solving problems, in general) is clarified, the agent can be rapidly
prototyped by modeling a set of typical hypotheses. The rest of the agent development
process consists of developing its knowledge base so that the agent can automatically
assess other hypotheses. The knowledge base will consist of an ontology of domain
concepts and relationships and of problem/hypothesis reduction and solution synthesis
rules, as was discussed in Section 1.6.3.1 and illustrated in Figure 1.15 (p. 38). As will be
discussed in the following, the way the preceding assessments were modeled will greatly
facilitate this process.
The goal of this phase is to develop an ontology that is as complete as possible. This will
enable the agent to learn reasoning rules based on the concepts and the features from the
ontology, as will be briefly illustrated in the following.
[Figure 3.13: Artificial Intelligence and Software Engineering are instances of the concept area of expertise; Bob Sharp is interested in Artificial Intelligence, and John Doe is expert in Artificial Intelligence.]
Figure 3.13. Expanded ontology based on the specification from Figure 3.12.
[Figure: a fragment of the modeled reasoning tree in which each of the three quality criteria is further reduced to elementary criteria; for example, the professional reputation criterion is reduced, guided by the question "Which is a professional reputation criterion?" and the answer "research funding criterion."]
[Figure 3.15: the reduction rule learned from the example in which Bob Sharp is interested in an area of expertise of John Doe, containing an IF hypothesis with the variables ?O1, ?O2, and ?O3 and a partially learned applicability condition.]
The applicability condition of the rule indicates the conditions that need to be satisfied by these variables so that the IF
hypothesis can be assessed as indicated in the example. For example, ?O1 should be a
PhD student or possibly a person (the agent does not yet know precisely what concept to
apply because the rule is only partially learned), ?O1 should be interested in ?O3, and ?O3
should be Artificial Intelligence or possibly any area of expertise.
The way the question and its answer from the reduction step are formulated is very
important for learning. What could the agent learn if the answer were simply “yes”? The
agent would only be able to learn the fact that “Bob Sharp is interested in an area of
expertise of John Doe.” By providing an explanation of why this fact is true (“Yes, Artificial
Intelligence” meaning: “Yes, because Bob Sharp is interested in Artificial Intelligence which is
an area of expertise of John Doe”), we help the agent to learn a general rule where it will
check that the student ?O1 is interested in some area ?O3, which is an area of expertise of
the advisor ?O2. This is precisely the condition of the rule that can be easily verified
because this type of knowledge was represented in the ontology, as discussed previously
and shown in Figure 3.12.
What is the difference between the pattern learning illustrated in Figure 3.9 and the rule
learning illustrated in Figure 3.15? The difference is in the formal applicability condition of
the rule, which restricts the possible values of the rule variables and allows the automatic
application of the rule in situations where the condition is satisfied. A learned pattern,
such as that from Figure 3.9, cannot be automatically applied because the agent does not
know how to instantiate its variables correctly. Therefore, its application, during the
modeling phase, is controlled by the user, who selects the instances of the variables.
A remarkable capability of the agent is that it learns a general rule, like the one in
Figure 3.15, from a single example rather than requiring the rule to be manually developed
by the knowledge engineer and the subject matter expert. Rule learning will be discussed
in detail in Chapter 9.
As indicated in the preceding, the rule in Figure 3.15 is only partially learned, because
instead of an exact applicability condition, it contains an upper and a lower bound for this
condition. The upper bound condition (represented as the larger ellipse from Figure 3.15)
corresponds to the most general generalization of the example (represented as the point
from the center of the two ellipses) in the context of the agent’s ontology, which is used as
a generalization hierarchy for learning. The lower bound condition (represented as the
smaller ellipse) corresponds to the least general generalization of the example.
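As a rough illustration, a partially learned rule of this kind might be represented as sketched below in Python. The class and slot names are hypothetical and do not reflect Disciple-EBR's internal representation; the bounds for ?O1 and ?O3 follow the example discussed earlier, while the THEN part and the bounds for ?O2 are assumptions made for illustration:

```python
# Illustrative sketch of a partially learned reduction rule with a plausible version space
# condition (hypothetical structure, not Disciple-EBR's internal representation).
from dataclasses import dataclass, field

@dataclass
class PlausibleVersionSpace:
    upper_bound: dict          # most general generalization of the example, per variable
    lower_bound: dict          # least general generalization of the example, per variable
    relations: list = field(default_factory=list)  # relations the variables must satisfy

@dataclass
class ReductionRule:
    if_hypothesis: str
    question: str
    answer: str
    then_hypotheses: list
    condition: PlausibleVersionSpace

rule = ReductionRule(
    if_hypothesis="?O1 is interested in an area of expertise of ?O2.",
    question="Is ?O1 interested in an area of expertise of ?O2?",
    answer="Yes, ?O3.",
    then_hypotheses=["It is certain that ?O1 is interested in an area of expertise of ?O2."],
    condition=PlausibleVersionSpace(
        upper_bound={"?O1": "person", "?O2": "person", "?O3": "area of expertise"},
        lower_bound={"?O1": "PhD student", "?O2": "PhD advisor", "?O3": "Artificial Intelligence"},
        relations=["?O1 is interested in ?O3", "?O2 is expert in ?O3"],
    ),
)
```

During refinement, positive examples generalize the lower bound and negative examples (with their explanations) specialize the upper bound, so the two bounds converge toward the exact applicability condition.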
The next phase is to refine the learned rules and, at the same time, test the agent with new hypotheses. Therefore, the subject matter expert will formulate new hypotheses, for example:
Dan Smith would be a good PhD advisor for Bob Sharp.
Using the learned rules, the agent will automatically generate the reasoning tree from
Figure 3.16. Notice that, in this case, the area of common interest/expertise of Dan Smith
and Bob Sharp is Information Security. The expert will have to check each reasoning step.
Those that are correct represent new positive examples that are used to generalize the
lower bound conditions of the corresponding rules. Those that are incorrect are used as
negative examples. The expert will interact with the agent to explain to it why a reasoning
step is incorrect, and the agent will correspondingly specialize the upper bound condition
of the rule, or both conditions. During the rule refinement process, the rule’s conditions
will converge toward one another and toward the exact applicability condition. During this
process, the ontology may also be extended, for instance to include the new concepts used
to explain the agent’s error. Rule refinement will be discussed in detail in Chapter 10.
[Figure 3.16: the reasoning tree automatically generated for the new hypothesis, reducing it to the subhypotheses that Bob Sharp is interested in an area of expertise of Dan Smith (the answer being "Yes, Information Security"), that Dan Smith will stay on the faculty of George Mason University for the duration of the PhD dissertation of Bob Sharp, and that Dan Smith would be a good PhD advisor with respect to the PhD advisor quality criterion, itself reduced to the professional reputation, quality of student results, and student learning experience criteria.]
[Figure 3.18: the main phases are rapid prototyping (modeling reduction trees in which problems are reduced to subproblems, guided by questions and answers, and their solutions are composed); rule learning (IF-THEN problem reduction rules generalized from examples E1, …, Ek); ontology development (e.g., a hierarchy of actors comprising persons and organizations, with students, faculty members, staff members, companies, and educational organizations such as colleges and universities); rule refinement (adding plausible conditions and Except-When conditions); and agent optimization.]
Figure 3.18. Main phases of agent development when using learning technology.
A hypothesis is thus assessed by:
Successively reducing it, from the top down, to simpler and simpler hypotheses
Assessing the simplest hypotheses
Successively combining, from the bottom up, the assessments of the subhypotheses, until the assessment of the top-level hypothesis is obtained
Periodically, the agent can undergo an optimization phase, which is the last phase in
Figure 3.18. During this phase, the knowledge engineer and the subject matter expert will
review the patterns learned from the end-user, will learn corresponding rules from them,
and will correspondingly refine the ontology. The current version of the Disciple-EBR shell
reapplies its rule and ontology learning methods to do this. However, improved methods
can be used when the pattern has more than one set of instances, because each represents
a different example of the rule to be learned. This is part of future research.
Figure 3.19 compares the conventional knowledge engineering process of developing a
knowledge-based agent (which was discussed in Section 3.1) with the process discussed in
this section, which is based on the learning agent technology.
The top part of Figure 3.19 shows the complex knowledge engineering activities that are
required to build the knowledge base. The knowledge engineer and the subject matter
expert have to develop a model of the application domain that will make explicit the way
the subject matter expert assesses hypotheses. Then the knowledge engineer has to
develop the ontology. He or she also needs to define general hypotheses decomposition
rules and to debug them, with the help of the subject matter expert.
As shown at the bottom of Figure 3.19, each such activity is replaced with an equivalent
activity that is performed by the subject matter expert and the agent, with limited assistance
from the knowledge engineer. The knowledge engineer still needs to help the subject matter
expert to define a formal model of how to assess hypotheses and to develop the ontology.
After that, the subject matter expert will teach the agent how to assess hypotheses, through
examples and explanations, and the agent will learn and refine the rules by itself.
The next chapters discuss each phase of this process in much more detail.
[Figure 3.19: conventional knowledge engineering (top) requires the knowledge engineer, in dialogue with the subject matter expert, to understand the domain, model the problem solving, develop the ontology, define the reasoning rules, and verify and update the rules and the ontology through programming. With learning-based knowledge engineering (bottom), the subject matter expert, in a mixed-initiative dialogue with the cognitive assistant and with the knowledge engineer in a support role, develops the reasoning model, defines concepts with hierarchical organization, provides and explains examples, analyzes the agent's solutions, and explains its errors, while the agent assists in solution development, extends the ontology based on the modeled solutions, learns and refines the reasoning rules, and learns ontology elements.]
The knowledge bases developed with Disciple-EBR are located in the repository folder, which is inside the installation folder. As shown in Figure 3.17, the knowledge bases are
organized hierarchically, with the knowledge base for evidence-based reasoning at the top
of the hierarchy. The user cannot change this knowledge base, whose knowledge elements
are inherited in the domain and scenario knowledge bases. From the user’s point of view,
each knowledge base consists of a top-level domain part (which contains knowledge
common to several applications or scenarios in a domain) and one scenario part (con-
taining knowledge specific to a particular application or scenario). As illustrated in
Figure 3.17, there can be more than one scenario under a domain. In such a case, the
domain and each of the scenarios correspond to a different knowledge base. Loading,
saving, or closing a scenario will automatically load, save, or close both the scenario part
and the corresponding domain part of the knowledge base.
Loading and selecting a knowledge base are described in Operation 3.1 and illustrated
in Figure 3.20.
Once a knowledge base is selected, you can invoke different modules of Disciple-EBR to use it. Each module is accessible in a specific workspace. As illustrated in Figure 3.21, there are three workspaces: the Scenario workspace, the Domain workspace, and the Evidence workspace.
The user can switch between the workspaces to use the corresponding modules. For example, you must switch to the Evidence workspace to work with evidence items. Then, to save the knowledge base, you must switch to the Scenario workspace.
The user should work with only one set of the three workspaces at a time (corresponding to the same KB). Therefore, you should close the workspaces corresponding to a knowledge base (by clicking on the icon with the minus sign [–]) before opening the workspaces corresponding to another knowledge base.
The steps to save all the knowledge bases loaded in memory are described in Operation 3.2 and illustrated in Figure 3.22: switch to the Scenario workspace and select System, then Save All.
The following are knowledge engineering guidelines for knowledge base development.
Guideline 3.1. Work with only one knowledge base loaded in memory
To maintain the performance of the Disciple-EBR modules, work with only one knowledge
base loaded in memory. Therefore, close all the knowledge bases before loading a new one.
If several knowledge bases are loaded, work only with the set of three workspaces
corresponding to the knowledge base you are currently using. Close all the other
workspaces.
For example, suppose that the knowledge base “WA” has errors, and the most recently saved version is “WA-4o.”
Delete “WA” in Windows Explorer, copy “WA-4o,” and rename this copy as “WA.”
Continue with the development of “WA.”
Finalize the project team, specify the type of hypotheses to be analyzed by the agent to
be developed, and study the application domain. Prepare a short presentation of the
following:
The application domain of your agent and why your agent is important.
A bibliography for the expertise domain and a description of your current familiarity with it, keeping in mind that you should choose a domain in which you already are, or could become, an expert without investing a significant amount of time.
Three examples of hypotheses and a probability for each.
3.1. Which are the main phases in the development of a knowledge-based agent?
3.5. Consider the fact that a knowledge-based agent may need to have hundreds or
thousands of rules. What can be said about the difficulty of defining and refining
these rules through the conventional process discussed in Section 3.1.4?
3.6. Use the scenario from Section 3.1 to illustrate the different difficulties of building a
knowledge-based agent discussed in Section 1.6.3.1.
3.11. What is the organization of the knowledge repository of a learning agent shell for
evidence-based reasoning?
3.12. What is the difference between a specific instance and a generic instance? Provide
an example of each.
3.13. Are there any mistakes in the reasoning step from Figure 3.24 with respect to the
goal of teaching the agent? If the answer is yes, explain and indicate corrections.
3.14. Which are the main stages of developing a knowledge-based agent using learning
agent technology?
4 Modeling the Problem-Solving Process
Analysis and synthesis, introduced in Section 1.6.2, form the basis of a general divide-and-
conquer problem-solving strategy that can be applied to a wide variety of problems. The
general idea, illustrated in Figure 4.1, is to decompose or reduce a complex problem P1 to
n simpler problems P11, P12, . . . , P1n, which represent its components. If we can then find
the solutions S11, S12, . . . , S1n of these subproblems, then these solutions can be combined
into the solution S1 of the problem P1.
If any of the subproblems is still too complex, it can be approached in a similar way, by
successively decomposing or reducing it to simpler problems, until one obtains problems
whose solutions are known, as illustrated in Figure 4.2.
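As a generic illustration (a sketch under the assumption that the reduction, elementary-solution, and synthesis operations are supplied as domain-specific functions), the strategy can be expressed as a recursive procedure in Python:

```python
# Generic sketch of problem solving through reduction (analysis) and synthesis.
def solve(problem, is_elementary, solve_elementary, reduce_problem, synthesize):
    """Solve `problem` by successively reducing it and synthesizing the subsolutions."""
    if is_elementary(problem):
        return solve_elementary(problem)              # a problem whose solution is known
    subproblems = reduce_problem(problem)             # P -> P1, ..., Pn
    subsolutions = [solve(p, is_elementary, solve_elementary, reduce_problem, synthesize)
                    for p in subproblems]
    return synthesize(problem, subsolutions)          # S1, ..., Sn -> S
```

In the inquiry-driven paradigm discussed next, the reduction step is guided by a question and its answer, and the synthesis step by a composition function such as minimum or maximum.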
Figures 4.3 and 4.4 illustrate the application of this divide-and-conquer approach to
solve a symbolic integration problem. Notice that each reduction and synthesis operation
is justified by a specific symbolic integration operator.
Cognitive assistants require the use of problem-solving paradigms that are both natural
enough for their human users and formal enough to be automatically executed by the
agents. Inquiry-driven analysis and synthesis comprise such a problem-solving paradigm
where the reduction and synthesis operations are guided by corresponding questions and
answers. The typical questions are those from Rudyard Kipling’s well-known poem “I Keep
Six Honest . . .”
[Figure 4.2: a complex problem P1 is solved by successively reducing it to simpler and simpler problems (P11, …, P1n, and so on, recursively), finding the solutions of the simplest problems, and successively combining these solutions, from the bottom up, until the solution S1 of the initial problem is obtained (synthesized).]
We have already illustrated this paradigm in Sections 2.2 and 3.3.2. To better understand
it, let us consider the simple abstract example from Figure 4.5. To solve Problem 1, one
asks Question Q related to some aspect of Problem 1. Let us assume that there are two
answers to Q: Answer A and Answer B. For example, the question, “Which is a sub-
criterion of the quality of student results criterion?” has two answers, “publications with
advisor criterion” and “employers of graduates criterion.”
Let us further assume that Answer A leads to the reduction of Problem 1 to two simpler
problems, Problem 2 and Problem 3. Similarly, Answer B leads to the reduction of
Problem 1 to the other simpler problems, Problem 4 and Problem 5.
Let us now assume that we have obtained the solutions of these four subproblems. How do we combine them to obtain the solution of Problem 1? As shown in Figure 4.6, first the solutions of Problem 2 and Problem 3 are combined into Solution A of Problem 1, and the solutions of Problem 4 and Problem 5 are combined into Solution B of Problem 1 (reduction-level syntheses). Then Solution A and Solution B are combined into the solution of Problem 1 (a problem-level synthesis).
Figure 4.6. A more detailed view of the analysis and synthesis process.
[Figure: the problem "Test whether President Roosevelt has the critical capability to maintain support" is reduced, guided by the question "Which are the critical requirements for President Roosevelt to maintain support?" and its answer (means to secure support from the government, from the military, and from the people), to the subproblems of testing whether President Roosevelt has each of these means.]
[Figure 4.8: the solutions of the three subproblems (President Roosevelt has means to secure support from the government, from the military, and from the people) are synthesized into the solution "President Roosevelt has the critical capability to maintain support because President Roosevelt has means to secure support from the government, has means to secure support from the military, and has means to secure support from the people." The critical capability is an emerging property.]
Figure 4.8. Illustration of reduction and synthesis operations in the COG domain.
Moreover, it is assumed that the questions guiding the synthesis operations may have
only one answer, which typically indicates how to combine the solutions. Allowing more
questions and more answers in the synthesis tree would lead to a combinatorial explo-
sion of solutions.
Another interesting aspect is that the three leaf solutions in Figure 4.8 are about the
means of President Roosevelt, while their composition is about a capability. Thus this
illustrates how synthesis operations may lead to emerging properties.
A third aspect to notice is how the reduction-level composition is actually performed. In the example from Figure 4.8, the solutions to combine are the three leaf solutions stating that President Roosevelt has means to secure support from the government, from the military, and from the people, and their composition is:
President Roosevelt has the critical capability to maintain support because President Roosevelt has means to secure support from the government, has means to secure support from the military, and has means to secure support from the people.
Figure 4.9 shows another example of reduction and synthesis in the COG domain. In this case, the solutions to combine state that PM Mussolini has means to secure support from the government and from the military, but does not have means to secure support from the people, and their composition is:
PM Mussolini does not have the critical capability to maintain support because PM Mussolini does not have means to secure support from the people.
Additional examples of solution synthesis from the COG domain are presented in
Figures 12.27 (p. 372), 12.28 (p. 373), 12.29 (p. 373), and 12.30 (p. 374) from Section 12.4.2.
As suggested by the preceding examples, there are many ways in which solutions may
be combined.
One last important aspect related to problem solving through analysis and synthesis is
that the solutions of the elementary problems may be obtained by applying any other type of
reasoning strategy. This enables the solving of problems through a multistrategy approach.
Chapter 12 presents Disciple cognitive assistants for different types of tasks, illustrating
the use of the inquiry-driven analysis and synthesis in different domains. Section 12.2
discusses this problem-solving paradigm in the context of military engineering planning.
Section 12.3 discusses it in the context of course of action critiquing. Section 12.4 discusses
it in the context of center of gravity analysis, and Section 12.5 discusses it in the context of
collaborative emergency response planning.
Figure 4.9. Another illustration of reduction and synthesis operations in the COG domain.
The assessment of a hypothesis based on evidence involves:
Successively reducing it, from the top down, to simpler and simpler hypotheses (guided by introspective questions and answers).
Assessing the simplest hypotheses based on evidence.
Successively combining, from the bottom up, the assessments of the simpler hypotheses, until the assessment of the top-level hypothesis is obtained.
Figure 4.10 shows a possible analysis of the hypothesis that Country X has nuclear
weapons.
[Figure 4.10: the hypothesis that Country X has nuclear weapons is analyzed through different types of reductions whose assessments are combined with max synthesis functions; at the bottom, the question "Which is an indicator?" is answered "Capability to produce enriched uranium," a very likely indicator whose assessment is likely.]
Figure 4.10. An example of different types of reductions and corresponding synthesis functions.
[Figure 4.11: Hypothesis 1 is reduced to two alternative scenarios (sufficient conditions); the assessments within each scenario are combined with min, and the assessments of the scenarios are combined with max to obtain the assessment of Hypothesis 1.]
Figure 4.11. Reductions and syntheses corresponding to two sufficient conditions (scenarios).
4.3.4 Indicators
Many times when we are assessing a hypothesis, we have only indicators. For example, as
shown at the bottom part of Figure 4.10, having the capability to produce enriched uranium
is an indicator that a country can build nuclear weapons. An indicator is, however, weaker
than a sufficient condition. If we determine that a sufficient condition is satisfied (e.g., a
scenario has actually happened), we may conclude that the hypothesis is true. But we
cannot draw such a conclusion just because we have discovered an indicator. However, we
may be more or less inclined to conclude that the hypothesis is true, based on the relevance
(strength) of the indicator. Therefore, given the symbolic probabilities from Table 2.5, we
distinguish between three types of indicators of different relevance (strength): “likely indica-
tor,” “very likely indicator,” and “almost certain indicator.”
A “likely indicator” is one that, if discovered to be true, would lead to the conclusion
that the considered hypothesis is likely. Similarly, a “very likely indicator” would lead to the
conclusion that the hypothesis is very likely, and an “almost certain indicator” would lead to
the conclusion that the hypothesis is almost certain.
In the example from the bottom part of Figure 4.10 it is likely that Country X can
produce enriched uranium, and this is a very likely indicator that Country X can build
nuclear weapons. Therefore, we can conclude that the probability of the hypothesis
that Country X can build nuclear weapons is likely, the minimum between likely (the
probability of the indicator) and very likely (the strength of the indicator).
In general, the probability of a hypothesis H based on an indicator I is the minimum
between the probability of the indicator and the relevance (strength) of the indicator (which
could be likely, very likely, or almost certain).
It makes no sense to consider the type “certain indicator,” because this would be a
sufficient condition. Similarly, it makes no sense to consider the type “no support indica-
tor,” because this would not be an indicator.
As an abstract example, Figure 4.12 shows a hypothesis that has two likely indicators,
A and B, if only one of them is observed. However, if both of them are observed, they
synergize to become an almost certain indicator.
As a concrete example, consider a person who has been under surveillance in connection with terrorist activities. We suspect that this person will attempt to leave the country in a short while. Three days ago, we received information that he sold his car. Today, we received information that he closed his account at his bank. Each of these is only a likely indicator of the hypothesis that he plans to leave the country. He could be planning to buy a new car, or he could be dissatisfied with his bank. But, taken together, these two indicators suggest that it is almost certain that he is planning to leave the country.
Coming back to the abstract example in Figure 4.12, let us assume that indicator A is
almost certain and indicator B is very likely. In such a case, the assessment of Hypothesis
1, based only on indicator A, is minimum(almost certain, likely) = likely. Similarly, the
assessment of Hypothesis 1, based only on indicator B, is minimum(very likely, likely) =
likely. But the assessment of Hypothesis 1, based on both indicators A and B, is
minimum(minimum(almost certain, very likely), almost certain) = very likely. Also, the
assessment of Hypothesis 1 based on all the indicators is the maximum of all the
individual assessments (i.e., very likely), because these are three alternative solutions
for Hypothesis 1.
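These indicator computations can be summarized in the following Python sketch (illustrative only; the helper names are ours, and the symbolic scale is the one used throughout this chapter):

```python
# Illustrative computation of the likeliness of a hypothesis from indicators.
SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]
RANK = {value: index for index, value in enumerate(SCALE)}

def sym_min(*xs):
    return min(xs, key=RANK.__getitem__)

def sym_max(*xs):
    return max(xs, key=RANK.__getitem__)

def from_indicator(indicator_probability, indicator_strength):
    """Likeliness of H based on one indicator: min(probability, relevance/strength)."""
    return sym_min(indicator_probability, indicator_strength)

h_from_a = from_indicator("almost certain", "likely")                 # indicator A alone -> likely
h_from_b = from_indicator("very likely", "likely")                    # indicator B alone -> likely
h_from_ab = from_indicator(sym_min("almost certain", "very likely"),  # A and B together
                           "almost certain")                          # -> very likely
print(sym_max(h_from_a, h_from_b, h_from_ab))                         # alternatives -> very likely
```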
Now we discuss the assessment of the leaf hypotheses of the argumentation structure,
based on the identified relevant evidence. Let us consider an abstract example where the
leaf hypothesis to be directly assessed based on evidence is Q (see Figure 4.13).
We begin by discussing how to assess the probability of hypothesis Q based only on one
item of favoring evidence Ek* (see the bottom of Figure 4.13). First notice that we call
this likeliness of Q, and not likelihood, because in classic probability theory, likelihood is P(Ek*|Q), while here we are interested in P(Q|Ek*), the posterior probability of Q given Ek*.
[Figure 4.12: Hypothesis 1 has two likely indicators, A and B; their individual assessments are combined with max, and together they constitute an almost certain indicator, leading to the overall assessment very likely.]
[Figure 4.13: the assessment of hypothesis Q (very likely) is obtained through an on-balance judgment that combines the inferential force of the favoring evidence on Q with that of the disfavoring evidence.]
To assess Q based only on Ek*, there are three judgments to be made by answering
three questions:
The relevance question is: How likely is Q, based only on Ek* and assuming that Ek* is
true? If Ek* tends to favor Q, then our answer should be one of the values from likely
to certain. If Ek* is not relevant to Q, then our answer should be no support, because
Ek* provides no support for the truthfulness of Q. Finally, if Ek* tends to disfavor Q,
then it tends to favor the complement of Q, that is, Qc. Therefore, it should be used
as favoring evidence for Qc, as discussed later in this section.
The believability question is: How likely is it that Ek* is true? Here the answer should be
one of the values from no support to certain. The maximal value, certain, means
that we are sure that the event Ek reported in Ek* did indeed happen. The minimal
value, no support, means that Ek* provides us no reason to believe that the event
Ek reported in Ek* did happen. For example, we believe that the source of Ek* has
lied to us.
The inferential force or weight question is: How likely is Q based only on Ek*? The agent
automatically computes this answer as the minimum of the relevance and believ-
ability answers. What is the justification for this? Because to believe that Q is true
based only on Ek*, Ek* should be both relevant to Q and believable.
When we assess a hypothesis Q, we may have several items of evidence, some favoring Q
and some disfavoring Q. The agent uses the favoring evidence to assess the probability of
Q and the disfavoring evidence to assess the probability of Qc. As mentioned previously,
because the disfavoring evidence for Q is favoring evidence for Qc, the assessment process
for Qc is similar to the assessment for Q.
When we have several items of favoring evidence, we evaluate Q based on each of them
(as was explained previously), and then we compose the obtained results. This is illus-
trated in Figure 4.13, where the assessment of Q based only on Ei* (almost certain) is
13:52:06,
.005
124 Chapter 4. Modeling the Problem-Solving Process
composed with the assessment of Q based only on Ek* (likely), through the maximum
function, to obtain the assessment of Q based only on favoring evidence (almost certain). In
this case, the use of the maximum function is justified because it is enough to have one
item of evidence that is both very relevant and very believable to persuade us that the
hypothesis Q is true.
Let us assume that Qc based only on disfavoring evidence is likely. How should we
combine this with the assessment of Q based only on favoring evidence? As illustrated at
the top of Figure 4.13, the agent uses an on-balance judgment: Because Q is almost certain
and Qc is likely, it concludes that, based on all available evidence, Q is very likely.
In general, as indicated in the right and upper side of Table 4.1, if the assessment of Qc
(based on disfavoring evidence for Q) is higher than or equal to the assessment of Q
(based on favoring evidence), then we conclude that, based on all the available evidence,
there is no support for Q. If, on the other hand, the assessment of Q is strictly greater than
the assessment of Qc, then the assessment of Q is decreased, depending on the actual
assessment of Qc (see the left and lower side of Table 4.1).
One important aspect to notice is that the direct assessment of hypotheses based on
favoring and disfavoring evidence is done automatically by the agent, once the user
assesses the relevance and the believability of evidence.
Another important aspect to notice is that the evaluation of upper-level hypotheses (such
as those from Figure 4.10) requires the user to indicate what function to use when
composing the assessments of their direct subhypotheses. This was discussed in Section 4.3.
Table 4.1. The on-balance synthesis function: the assessment of Q based on all available evidence, given the assessment of Q based on the favoring evidence (rows) and the assessment of Qc based on the disfavoring evidence (columns)

Q \ Qc           | no support     | likely         | very likely | almost certain | certain
no support       | no support     | no support     | no support  | no support     | no support
likely           | likely         | no support     | no support  | no support     | no support
very likely      | very likely    | likely         | no support  | no support     | no support
almost certain   | almost certain | very likely    | likely      | no support     | no support
certain          | certain        | almost certain | very likely | likely         | no support
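The following Python sketch (illustrative only) puts together the computations described in this section: the inferential force of each item of evidence as the minimum of its relevance and believability, the maximum over the favoring items, and the on-balance function from Table 4.1. The individual relevance and believability values are assumptions, chosen so that the per-item results match Figure 4.13:

```python
# Illustrative assessment of a leaf hypothesis Q from favoring and disfavoring evidence.
SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]
RANK = {value: index for index, value in enumerate(SCALE)}

def inferential_force(relevance, believability):
    """Likeliness of Q based on one item of evidence: min(relevance, believability)."""
    return SCALE[min(RANK[relevance], RANK[believability])]

def combine_favoring(assessments):
    """Likeliness of Q based on all favoring evidence: max over the individual items."""
    return SCALE[max(RANK[a] for a in assessments)]

def on_balance(q_favoring, qc_disfavoring):
    """Table 4.1: no support if Qc is at least as likely as Q; otherwise Q decreased by Qc's level."""
    return SCALE[max(RANK[q_favoring] - RANK[qc_disfavoring], 0)]

# Assumed relevance/believability values for Ei* and Ek* (illustration only).
q_favoring = combine_favoring([
    inferential_force("almost certain", "certain"),  # Ei*: almost certain
    inferential_force("certain", "likely"),          # Ek*: likely
])                                                    # -> almost certain
qc_disfavoring = "likely"                             # assessment of Qc from disfavoring evidence
print(on_balance(q_favoring, qc_disfavoring))         # -> very likely
```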
We have established that the cesium-137 canister is missing (see Figure 2.8 on p. 62). The next step is to consider the competing hypotheses, among them the hypothesis that the cesium-137 canister was stolen.
We have to put each of these hypotheses to work, to guide the collection of relevant
evidence. In Section 2.2.4, we have already discussed, at a conceptual level, the collec-
tion of evidence for hypothesis H2. Table 4.2 shows the result of our information
collection efforts.
The collected information from Table 4.2 suggests that the cesium-137 canister was
stolen with the panel truck having Maryland license MDC-578. This has led to the
development of the analysis tree in Figure 2.9 (p. 63). In this case study, you are going
to actually perform this analysis. You have to identify the “dots” in the information from
Table 4.2, which are fragments representing relevant items of evidence for the leaf
hypotheses in Figure 2.9. These dots are presented in Table 4.3.
This case study has several objectives, including learning how to associate items of evidence with the leaf hypotheses and how to assess their relevance and believability.
When you associate an item of evidence with a hypothesis, the agent automatically
generates a decomposition tree like the one in Figure 4.14. The bottom part of Figure 4.14
shows the abstraction of the tree that is automatically generated by the agent when you
indicate that the item of evidence E005-Ralph favors the leaf hypothesis “The XYZ hazardous
material locker was forced.”
The agent also automatically generates the reduction from the top of Figure 4.14, where
the leaf hypothesis, “The XYZ hazardous material locker was forced,” is reduced to the
elementary hypothesis with the name, “The XYZ hazardous material locker was forced,” to be
directly assessed based on evidence. Although these two hypotheses are composed of the
same words, internally they are different, the latter being an instance introduced in the
agent’s ontology. This elementary hypothesis corresponds to the hypothesis Q in
Figure 4.13. The agent decomposes this hypothesis as shown in the bottom part of
Figure 4.14, which corresponds to the tree in Figure 4.13 except that there is only one
item of favoring evidence, namely E005-Ralph. After that, you have to assess the relevance
of this item of evidence to the considered hypothesis (e.g., likely), as well as its believability
(e.g., very likely), and the agent automatically composes them, from the bottom up, to
obtain the assessment of the leaf hypothesis. When you add additional items of evidence
as either favoring or disfavoring evidence, the agent extends the reasoning tree from
Figure 4.14 as indicated in Figure 4.13.
Figure 4.15 illustrates the selection of a synthesis function indicating how to evaluate
the probability of a node based on the probability of its children. You have to right-click on
the node (but not on any word in blue), select New Solution with. . ., and then select the
function from the displayed list.
Now you can perform the case study. Start Disciple-EBR, select the case study know-
ledge base “02-Evidence-based-Analysis/Scen,” and proceed as indicated in the instruc-
tions from the bottom of the opened window.
This case study illustrates several basic hypothesis analysis operations described in the
following.
Clicking on [REMOVE] will restore the leaf hypothesis under the Irrelevant to label.
To associate another evidence item to a hypothesis, click on it in the left panel and
repeat the preceding operations.
To return to the Reasoner module, click on [REASONING] following the hypothesis.
The objective of this case study is to learn how to use Disciple-EBR to analyze hypotheses based
on evidence retrieved from the Internet, by associating search criteria with elementary hypoth-
eses, invoking various search engines (such as Google, Yahoo!, or Bing), identifying relevant
information, extracting evidence from it, and using the evidence to evaluate the hypotheses.
This case study concerns the hypothesis that the United States will be a global leader in
wind power within the next decade.
To search for evidence that is relevant to a leaf hypothesis, the agent guides you to
associate search criteria with it and to invoke various search engines on the Internet.
Figure 4.19 shows the corresponding interface of the Evidence module. Because the
[COLLECTION GUIDANCE] mode is selected in the left panel, it shows all the leaf hypotheses
and their current evidential support. If you click on one of these hypotheses, such as
“United States imports huge quantities of oil,” it displays this hypothesis in the right panel,
enabling you to define search criteria for it. You just need to click on the [NEW] button
following the Search criterion label, and the agent will open an editor in which you can
enter the search criterion.
Figure 4.20 shows two defined search criteria: “oil import by United States” and “top oil
importing countries.” You can now invoke Bing, Google, or Yahoo! with any one of these
criteria to search for relevant evidence on the Internet. This will open a new window with
the results of the search, as shown in Figure 4.21.
You have to browse the retrieved documents shown in Figure 4.21 and determine
whether any of them contains information that is relevant to the hypothesis that the
United States imports huge quantities of oil. Such a document is the second one, whose
content is shown in Figure 4.22.
You can now define one or several items of evidence with information copied from the
retrieved document, as illustrated in Figure 4.23. In the left panel of the Evidence module,
you switch the selection mode to [AVAILABLE EVIDENCE] and then click on [NEW]. As a
result, the right panel displays a partial name for the evidence E001- to be completed by
you. You then have to click on the [EDIT] button, which opens an editor where you can
copy the description of this item of evidence from the retrieved document. The result is
shown in the right panel of Figure 4.23.
You can define additional characteristics of this item of evidence, such as its type (as
will be discussed in Section 4.7), and you should indicate whether this item of evidence
favors or disfavors the hypothesis that the United States imports huge quantities of oil, as
explained previously.
In this case study, you will first select the hypothesis, ”United States will be a global
leader in wind power within the next decade,” and then you will browse its analysis tree to
see how it is reduced to simpler hypotheses that you have to assess by searching evidence
on the Internet. You will associate specific search criteria with the leaf hypotheses, invoke
specific search engines with those criteria, identify relevant Web information, define
evidence from this information, associate evidence with the corresponding hypotheses,
and evaluate its relevance and believability, with the goal of assessing the probability of the
top-level hypothesis.
Start Disciple-EBR, select the case study knowledge base “03-Evidence-Search/Scen,”
and proceed as indicated in the instructions from the bottom of the opened window.
This case study illustrates the following hypothesis analysis operation:
In the previous sections, we have discussed and illustrated how you may directly assess
the believability of an item of evidence. However, the Disciple-EBR agent has a significant
amount of knowledge about the various types of evidence and its believability credentials,
enabling you to perform a much deeper believability analysis, as will be discussed in this
section. You may wish to perform such a detailed believability analysis for those items of
evidence that are critical to the final result of the analysis. We will start with presenting a
classification or ontology of evidence.
Attempts to categorize evidence in terms of its substance or content would be a fruitless
task, the essential reason being that the substance or content of evidence is virtually
unlimited. What we have termed a substance-blind classification of evidence refers to a
classification of recurrent forms and combinations of evidence, based not on substance or
content, but on the inferential properties of evidence (Schum, 1994 [2001a], pp. 114–130;
Schum, 2011). In what follows, we identify specific attributes of the believability of various
recurrent types of evidence without regard to their substance or content.
Here is an important question you are asked to answer regarding the individual
kinds of evidence you have: How do you stand in relation to this item of evidence? Can
you examine it for yourself to see what events it might reveal? If you can, we say that the
evidence is tangible in nature. But suppose instead you must rely upon other persons
to tell you about events of interest. Their reports to you about these events are
examples of testimonial evidence. Figure 4.24 shows a substance-blind classification
of evidence based on its believability credentials. This classification is discussed in
the following sections.
There are two different kinds of tangible evidence: real tangible evidence and demon-
strative tangible evidence (Lempert et al., 2000, pp. 1146–1148). Real tangible evidence is
an actual thing and has only one major believability attribute: authenticity. Is this object
what it is represented as being or is claimed to be? There are as many ways of generating
deceptive and inauthentic evidence as there are persons wishing to generate it. Docu-
ments or written communications may be faked, captured weapons may have been
tampered with, and photographs may have been altered in various ways. One problem
is that it usually requires considerable expertise to detect inauthentic evidence.
Demonstrative tangible evidence does not concern things themselves but only repre-
sentations or illustrations of these things. Examples include diagrams, maps, scale models,
statistical or other tabled measurements, and sensor images or records of various sorts
such as IMINT, SIGINT, and COMINT. Demonstrative tangible evidence has three believ-
ability attributes. The first concerns its authenticity. For example, suppose we obtain a
hand-drawn map from a captured insurgent showing the locations of various groups in his
insurgency organization. Has this map been deliberately contrived to mislead our military
forces, or is it a genuine representation of the location of these insurgency groups?
The second believability attribute is accuracy of the representation provided by the
demonstrative tangible item. The accuracy question concerns the extent to which the
device that produced the representation of the real tangible item had a degree of sensitiv-
ity (resolving power or accuracy) that allows us to tell what events were observed. We
would be as concerned about the accuracy of the hand-drawn map allegedly showing
insurgent groups’ locations as we would about the accuracy of a sensor in detecting traces
of some physical occurrence. Different sensors have different resolving power that also
depends on various settings of their physical parameters (e.g., the settings of a camera).
The third major attribute, reliability, is especially relevant to various forms of sensors
that provide us with many forms of demonstrative tangible evidence. A system, sensor,
or test of any kind is reliable to the extent that the results it provides are repeatable or
consistent. You say that a sensing device is reliable if it provides the same image or report
on successive occasions on which this device is used.
The left side of Figure 4.25 shows how the agent assesses the believability of an item of
demonstrative tangible evidence Ei* as the minimum of its authenticity, accuracy, and
reliability.
Here are additional examples involving evidence that is tangible and that you can
examine personally to see what events it reveals.
Have a look at evidence item E009-MDDOTRecord in Table 4.3 (p. 126). The Maryland
DOT record, in the form of a tangible document, could be given to the analyst to verify
that the vehicle carrying MD license plate number MDC-578 is registered in the name of
the TRUXINC Company in Silver Spring, Maryland.
Now consider evidence item E008-GuardReport in Table 4.3. Here we have a document
in the form of a log showing that the truck bearing license plate number MDC-578 exited
the XYZ parking lot at 8:30 pm on the day in question. This tangible item could also be
made available to analysts investigating this matter.
Turning to the believability of testimonial evidence, the main credentials of a human source are veracity, objectivity, and observational sensitivity.
An objective observer is one who bases a belief on the sensory evidence instead of desires
or expectations. Finally, if the source did base a belief on sensory evidence, how good was
this evidence? This involves information about the source's relevant sensory capabilities
and the conditions under which a relevant observation was made.
As indicated in Figure 4.24, there are several types of testimonial evidence. If the source
does not hedge or equivocate about what the source observed (i.e., the source reports that
he or she is certain that the event did occur), then we have unequivocal testimonial
evidence. If, however, the source hedges or equivocates in any way (e.g., "I'm fairly sure
that E occurred"), then we have equivocal testimonial evidence. The first question we
would ask a source of unequivocal testimonial evidence is: How did you obtain information
about what you have just reported? It seems that this source has three possible answers to
this question. The first answer is, "I made a direct observation myself.” In this case, we have
unequivocal testimonial evidence based upon direct observation. The second possible
answer is, "I did not observe this event myself but heard about its occurrence (or
nonoccurrence) from another person." Here we have a case of second hand or hearsay
evidence, called unequivocal testimonial evidence obtained at second hand. A third answer
is possible: "I did not observe event E myself nor did I hear about it from another source.
But I did observe events C and D and inferred from them that event E definitely occurred."
This is called testimonial evidence based on opinion, and it raises some very difficult
questions. The first concerns the source's credibility as far as his or her observation of
events C and D; the second involves our examination of whether we ourselves would infer
E based on events C and D. This matter involves our assessment of the source's reasoning
ability. It might well be the case that we do not question this source's credibility in
observing events C and D, but we question the conclusion that the source has drawn
from his or her observations that event E occurred. We would also question the certainty
with which the source has reported the opinion that E occurred. Despite the source’s
conclusion that “event E definitely occurred," and because of many sources of uncertainty,
we should consider that testimonial evidence based on opinion is a type of equivocal
testimonial evidence.
There are two other types of equivocal testimonial evidence. The first we call completely
equivocal testimonial evidence. Asked whether event E occurred or did not, our source
says, "I don't know," or, "I can't remember."
But there is another way a source of HUMINT can equivocate: The source can provide
probabilistically equivocal testimonial evidence in various ways: "I'm 60 percent sure that
event E happened”; or "I'm fairly sure that E occurred”; or, "It is very likely that
E occurred." We could look upon this particular probabilistic equivocation as an assess-
ment by the source of the source’s own observational sensitivity.
The right side of Figure 4.25 shows how a Disciple-EBR agent assesses the believability
of an item of testimonial evidence based upon direct observation Ek* by a source, as the
minimum of the source’s competence and credibility. The source’s competence is
assessed as the minimum of the source’s access and understandability, while the source’s
credibility is assessed as the minimum of the source’s veracity, objectivity, and observa-
tional sensitivity.
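To make the min-based combinations of Figure 4.25 concrete, here is a minimal sketch in Python; the ordered symbolic scale, the function name, and all credential values below are illustrative assumptions rather than Disciple-EBR output.

    # Ordered symbolic probability scale (an assumed ordering, for illustration only).
    SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]

    def min_assessment(*assessments):
        """Combine assessments with the min function: the weakest credential dominates."""
        return min(assessments, key=SCALE.index)

    # Left side of Figure 4.25: demonstrative tangible evidence Ei*.
    believability_Ei = min_assessment(
        "almost certain",   # authenticity of Ei*
        "very likely",      # accuracy of Ei*
        "certain",          # reliability of Ei*
    )

    # Right side of Figure 4.25: testimonial evidence Ek* based on direct observation.
    competence = min_assessment("certain", "almost certain")                 # access, understandability
    credibility = min_assessment("very likely", "almost certain", "likely")  # veracity, objectivity, observational sensitivity
    believability_Ek = min_assessment(competence, credibility)

    print(believability_Ei)   # very likely
    print(believability_Ek)   # likely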
Here are some examples involving testimonial evidence from human sources that is not
hedged or qualified in any way.
Evidence item E014-Grace in Table 4.3 (p. 126) is Grace’s testimony that no one at the
XYZ Company had checked out the canister for work on any project. Grace states this
unequivocally. You should also note that she has given negative evidence saying the
cesium-137 was not being used by the XYZ Company. This negative evidence is very
important, because it strengthens our inference that the cesium-137 canister was stolen.
E006-Clyde in Table 4.3 is unequivocal testimonial evidence. It represents positive evidence.
Here are some examples involving testimonial evidence given by human sources who
equivocate or hedge in what they tell us.
Consider the evidence item E005-Ralph in Table 2.4 (p. 60). Here Ralph hedges a bit by
saying that the lock on the hazardous materials storage area appears to have been forced.
He cannot say for sure that the lock had been forced, so he hedges in what he tells us.
In new evidence regarding the dirty bomb example, suppose we have a source code-
named “Yasmin.” She tells us that she knew a man in Saudi Arabia named Omar al-
Massari. Yasmin says she is “quite sure” that Omar spent two years “somewhere” in
Afghanistan “sometime” in the years 1998 to 2000.
[Figure 4.25 (not reproduced): the min-based combination of the believability credentials of tangible evidence (authenticity, accuracy, reliability) and of testimonial evidence (the source’s competence and credibility).]
Evidence often reaches the analyst through a chain of custody, an important concept from the field of law, where a chain of custody refers to the persons or
devices having access to the original evidence, the time at which they had such access, and
what they did to the original evidence when they had access to it. These chains of custody
add three major sources of uncertainty for intelligence analysts to consider, all of which
are associated with the persons in the chains of custody, whose competence and credibil-
ity need to be considered. The first and most important question involves authenticity:
Is the evidence received by the analyst exactly what the initial evidence said, and is it
complete? The other questions involve assessing the reliability and accuracy of the
processes used to produce the evidence if it is tangible in nature or also used to take
various actions on the evidence in a chain of custody, whether the evidence is tangible or
testimonial. As an illustration, consider the situation from Figure 4.27. We have an item of
testimonial HUMINT coming from a foreign national whose code name is “Wallflower,”
who does not speak English. Wallflower gives his report to the case officer Bob. This
report is recorded by Bob and then translated by Husam. Then Wallflower’s translated
report is transmitted to the report’s officer Marsha, who edits it and transmits it to the
analyst Clyde, who evaluates it.
Figure 4.28 shows how a Disciple-EBR agent may determine the believability of the
evidence received by the analyst. A more detailed discussion is provided in Schum
et al. (2009).
The case officer might have intentionally overlooked details in his recording of Wall-
flower’s report. Thus, as shown at the bottom of Figure 4.28, the believability of the
recorded testimony of Wallflower is the minimum between the believability of Wallflower
and the believability of the recording. Then Husam, the translator, may have intentionally
altered or deleted parts of this report. Thus, the believability of the translated recording is
the minimum between the believability of the recorded testimony and the believability of
the translation by Husam. Then Marsha, the report’s officer, might have altered or deleted
parts of the translated report of Wallflower’s testimony in her editing of it, and so on.
[Figures 4.27 and 4.28 (not reproduced): the chain of custody of Wallflower’s testimony about Emir Z. — E001-Wallflower-testimony (in Farsi), E002-Bob-recording, E003-Husam-translation, E004-Marsha-report, and E005-Emir-Iran, the item received by the analyst Clyde — together with the competence, credibility, veracity, fidelity, reliability, and security credentials considered at each step.]
The result of these actions is that the analyst receiving this evidence almost certainly
did not receive an authentic and complete account of it, nor did he receive a good account
of its reliability and accuracy. What Clyde received was the transmitted, edited, translated,
and recorded testimony of Wallflower. Although the information to make such an analysis
may not be available, the analyst should adjust the confidence in his conclusion in
recognition of these uncertainties.
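A minimal sketch of how such chained uncertainties could be propagated, assuming the same kind of illustrative symbolic scale and invented believability values for each link in the chain:

    SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]
    weakest = lambda *xs: min(xs, key=SCALE.index)

    # Believability of each link in the chain of custody (illustrative values only).
    wallflower_testimony = "very likely"     # E001-Wallflower-testimony
    recording_by_bob     = "almost certain"  # E002-Bob-recording
    translation_by_husam = "likely"          # E003-Husam-translation
    editing_by_marsha    = "almost certain"  # E004-Marsha-report
    transmission         = "certain"

    # Each step can only preserve or degrade believability, never improve it.
    recorded   = weakest(wallflower_testimony, recording_by_bob)
    translated = weakest(recorded, translation_by_husam)
    edited     = weakest(translated, editing_by_marsha)
    received   = weakest(edited, transmission)

    print(received)   # likely: bounded by the weakest link, here the translation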
4.8 HANDS ON: BELIEVABILITY ANALYSIS
This case study, which continues the analysis from Section 4.5 with the analysis of the
hypothesis, “The cesium-137 canister is used in a project without being checked out from
the XYZ warehouse,” has two main objectives: representing new items of evidence with their
types and sources, and performing a detailed believability analysis.
In Section 4.6, we have presented how you can define an item of evidence, and Figure 4.23
(p. 132) shows the definition of E001-US-top-oil-importer with type evidence. You can
specify the type by clicking on the [CHANGE] button. Figure 4.29, for instance, shows the
definition of E014-Grace. After you click on the [CHANGE] button, the agent displays the
various evidence types from the right panel. You just need to click on the [SELECT] button
following the correct type, which in this case is unequivocal testimonial evidence based upon
direct observation.
Once you have selected the type of E014-Grace, the agent displays it after the label Type
and asks for its source, which is Grace (see Figure 4.30).
As shown in Figure 4.30, we have also indicated that this item of evidence disfavors the
hypothesis “The missing cesium-137 canister is used in a project at the XYZ company.” As a
result, the agent introduced it into the analysis tree and generated a more detailed analysis
of its believability, which is shown in Figure 4.31.
You can now perform a more detailed believability analysis, as illustrated in Figure 4.32,
where we have assessed the competence, veracity, objectivity, and observational sensitiv-
ity of Grace, and the agent has automatically determined her believability.
In this case study, you will practice the preceding operations. You will first select the
hypothesis, “The cesium-137 canister is used in a project without being checked out from
the XYZ warehouse.” Then you will browse its analysis to see how it is reduced to simpler
hypotheses that need to be assessed based on the evidence. After that, you will represent a
new item of evidence, associate it with the hypothesis to which it is relevant, assess its
relevance, evaluate its believability by assessing its credentials, and browse the resulting
analysis tree.
Figure 4.31. Decomposition of the believability assessment for an item of testimonial evidence.
4.9 DRILL-DOWN ANALYSIS
An important feature of the Disciple-EBR agent is that it allows you to perform analyses at
different levels of detail. What this means is that a hypothesis may be reduced to many levels
of subhypotheses or just a few levels that are then assessed based on relevant evidence. The
same applies to assessing the believability of evidence. You may directly assess it, as was
illustrated in Figure 4.14 (p. 127), where the believability of E005-Ralph was assessed as very
likely. But if an item of evidence has an important influence on the analysis, then you may
wish to perform a deeper believability analysis, as was illustrated in Figure 4.32, where the
user assessed lower-level believability credentials. The user could have drilled even deeper
to assess the source’s access and understandability instead of his or her competence.
It may also happen that you do not have the time or the evidence to assess a
subhypothesis, in which case you may make various assumptions with respect to its
probability. Consider, for example, the analysis from the case study in Section 4.5, partially
shown in Figure 4.15 (p. 128) and the four subhypotheses of the top-level hypothesis. The
first three of these subhypotheses have been analyzed as discussed in the previous
sections. However, for the last subhypothesis, you have made the following assumption:
It is certain that the MDC-578 truck left with the cesium-137 canister.
Assumptions are distinguished from system-computed assessments by the fact that the
assumed probabilities have a yellow background.
You may provide justifications for the assumptions made. You may also experiment
with various what-if scenarios, where you make different assumptions to determine their
influence on the final result of the analysis.
Thus the agent gives you the flexibility of performing the analysis that makes the best
use of your time constraints and available evidence.
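A rough sketch of the what-if idea, assuming an illustrative symbolic scale, a min synthesis function, and invented assessments (none of these values are computed by Disciple-EBR):

    SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]

    def synthesize(assessments):
        """Combine subhypothesis assessments; only the min function is sketched here."""
        return min(assessments, key=SCALE.index)

    def assess_top(assumption_for_last_subhypothesis):
        assessed = ["very likely", "almost certain", "very likely"]  # first three subhypotheses, assessed from evidence
        assessed.append(assumption_for_last_subhypothesis)           # fourth subhypothesis, assumed by the user
        return synthesize(assessed)

    # Baseline assumption: "It is certain that the MDC-578 truck left with the cesium-137 canister."
    print(assess_top("certain"))   # very likely
    # What-if scenario: weaken the assumption to see its influence on the final result.
    print(assess_top("likely"))    # likely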
The Disciple-EBR shell includes a customized modeling assistant to model the hypoth-
esis analysis process. The following two case studies demonstrate its use.
The objective of this case study is to learn how to use Disciple-EBR to model the analysis of
a hypothesis. More specifically, you will learn how to:
This case study will guide you through the process of defining and analyzing a hypothesis
by using, as an example, the following hypothesis: “CS580 is a potential course for Mike
Rice.” You will first define the reduction tree shown in Figure 4.33. Then you will formalize
it and specify the synthesis functions.
Start Disciple-EBR, select the case study knowledge base “05-Modeling-Learning/Scen,”
and proceed as indicated in the instructions from the bottom of the opened window.
This case study illustrates several important operations, which are described in the
following.
The objective of this case study is to learn how to use Disciple-EBR to model the analysis of
a hypothesis by reusing learned patterns. More specifically, you will learn how to:
You will first define the hypothesis by selecting an existing pattern and instantiating it to:
“CS681 is a potential course for Dan Bolt.” Then you will successively reduce it to simpler
hypotheses by reusing learned patterns. This will include the instantiation of variables
from the learned patterns.
Start Disciple-EBR, select the case study knowledge base “06-Analysis-Reuse/Scen” and
proceed as indicated in the instructions from the bottom of the opened window.
This case study illustrates several important operations described in the following.
4.12 MODELING GUIDELINES
The following are several knowledge engineering guidelines for modeling the reasoning
process. In general, we will refer to hypotheses in these guidelines, although the guidelines
are applicable to problems as well. To make this clearer, Guideline 4.1 uses the form “problem/
hypothesis,” and we will illustrate it with planning problems. However, the rest of the
guidelines refer only to “hypothesis,” although “hypothesis” may be replaced with “problem.”
[Figure (not reproduced): the workaround planning domain — estimating the best plan for a military unit to work around damage to a transportation infrastructure, such as a damaged bridge or road, for example by fording, fixed bridges, floating bridges, or rafts.]
Figure 4.35. Sample top-level structuring of the reasoning tree.
1. Identify the hypothesis to be assessed and express it with a clear natural language sentence.
2. Select the instances and constants in the hypothesis.
3. Follow each hypothesis or subhypothesis with a single, concise question relevant to
decomposing it. Ask small, incremental questions that are likely to have a single category of
answer (but not necessarily a single answer). This usually means asking who, what, where, what
kind of, whether it is this or that, and so on, not complex questions such as “Who and what?” or
“What and where?”
4. Follow each question with one or more answers to that question. Express answers as complete
sentences, restating key elements of the question in the answer. Even well-formed, simple
questions are likely to generate multiple answers. Select the answer that corresponds to the
example solution being modeled and continue down that branch.
5. Select instances and constants in the question/answer pair.
6. Evaluate the complexity of each question and its answers. When a question leads to apparently
overly complex answers, especially answers that contain an “and” condition, rephrase the
question in a simpler, more incremental manner leading to simpler answers.
7. For each answer, form a new subhypothesis, several subhypotheses, or an assessment
corresponding to that answer by writing a clear, natural language sentence describing the new
subhypotheses or assessment. To the extent that it is practical, incorporate key relevant phrases
and elements of preceding hypothesis names in subhypotheses’ names to portray the expert’s
chain-of-reasoning thought and the accumulation of relevant knowledge. If the answer has led
to several subhypotheses, then model their solutions in a depth-first order.
8. Select instances and constants in each subhypothesis.
9. Utilize the formalization and reuse capabilities of Disciple to minimize the amount of new
modeling required, both for the current hypothesis and for other hypotheses.
Figure 4.38. Reduction used when new relevant factors may be added in the future.
To avoid defining semantically redundant rules, you should learn and reuse reduction patterns,
as illustrated in Figure 4.39.
4.13 PROJECT ASSIGNMENT 3
Prototype a preliminary version of the agent that you will develop as part of your project by
working as a team to:
[Figure 4.39 (not reproduced): pattern learning from the reduction of the hypothesis “John Doe would be a good PhD advisor with respect to the student learning experience criterion,” and pattern instantiation for the research group status criterion and the student opinion criterion.]
4.14 REVIEW QUESTIONS
4.1. Review again Figures 4.3 and 4.4. Then illustrate the application of problem
reduction and solution synthesis with another symbolic integration problem.
4.2. Consider the reductions of Problem1 from Figure 4.40. Indicate the corresponding
solution syntheses.
4.3. Illustrate the reasoning in Figure 4.5 with the problem “Travel from Boston to New
York.” Hint: Consider the question, “Which is a transportation means I can use?”
4.4. How could you use the problem-level synthesis from Question 4.3 to obtain an
optimal solution? What might be some possible optimization criteria?
4.5. Illustrate the reasoning in Figure 4.6 with an example of your own.
4.6. You are considering whether a statement S is true. You search the Internet and find
two items of favoring evidence, E1* and E2*. You estimate that the relevance and the
believability of E1* are “almost certain” and “very likely,” respectively. You also
estimate that the relevance and the believability of E2* are “certain” and “likely,”
respectively. Based on this evidence, what is the probability that S is true? Draw a
reasoning tree that justifies your answer.
4.7. Define the concepts of relevance, believability, and inferential force of evidence.
Then indicate the appropriate synthesis functions and the corresponding solutions
in the reasoning tree from Figure 4.41.
4.8. What are the different types of tangible evidence? Provide an example of each type.
[Figure 4.40 (not reproduced): the reductions of Problem1 through Question/Answer1 and Question/Answer2.]
[Figure 4.41 (not reproduced): Hypothesis H1 is assessed based on two favoring items of evidence, E1 and E2, each assessed through its relevance and believability (with values such as very likely, likely, certain, and almost certain); the synthesis functions are left to be specified.]
Figure 4.41. Sample reasoning tree for assessing the inferential force of evidence.
4.10. Provide some examples of tangible evidence for the hypothesis that John Doe
would be a good PhD advisor for Bob Sharp.
4.13. What are the different types of testimonial evidence? Provide an example of
each type.
4.14. Give some examples from your own experience when you have heard people
providing information about which they hedge or equivocate.
4.15. Provide some examples of testimonial evidence for the hypothesis that John Doe
would be a good PhD advisor for Bob Sharp.
4.17. Consider our discussion on the cesium-137 canister. Upon further investigation, we
identify the person who rented the truck as Omar al-Massari, alias Omer Riley. We
tell him that we wish to see his laptop computer. We are, of course, interested in
what it might reveal about the terrorists with whom he may be associating. He
refuses to tell us where his laptop is. What inferences might we draw from Omar al-
Massari’s refusal to provide us with his laptop computer?
4.18. What other items of evidence are missing so far in our discussion of the cesium-
137 case?
4.19. Provide some examples of missing evidence for the hypothesis that John Doe
would be a good PhD advisor for Bob Sharp.
4.20. Define the term authoritative record. Provide an example of an authoritative record.
4.22. What are some types of mixed evidence? Provide an example. Do you see any
example of mixed evidence in Table 4.3?
4.23. Provide some examples of mixed evidence for the hypothesis that John Doe would
be a good PhD advisor for Bob Sharp.
4.24. Can you provide other examples of mixed evidence from your own experience?
4.25. Which is the general reduction and synthesis logic for assessing a PhD advisor?
Indicate another type of problem that can be modeled in a similar way.
4.26. Use the knowledge engineering guidelines to develop a problem reduction tree for
assessing the following hypothesis based on knowledge from the ontology (not
evidence): “John Doe would be a good PhD advisor with respect to the employers
of graduates criterion.” You do not need to develop the ontology, but the questions
and answers from your reasoning tree should make clear what knowledge would
need to be represented in the ontology. The logic should be clear, all the statements
should be carefully defined, and the question/answer pairs should facilitate learn-
ing. Mark all the instances in the reasoning tree.
4.27. Rapidly prototype an agent that can assess the following hypothesis and others with
a similar pattern: “John Doe would be a good PhD advisor with respect to the
research publication criterion.” Hint: You may consider that a certain number of
publications corresponds to a certain probability for the research publications
criterion. For example, if someone has between 41 and 60 publications, you may
consider that it is very likely that he or she would be a good PhD advisor with
respect to that criterion.
4.28. Rapidly prototype an agent that can assess the following hypothesis and others with
a similar pattern: “John Doe would be a good PhD advisor with respect to the
research funding criterion.” Hint: You may consider that a certain average amount
of annual funding corresponds to a certain probability for the research funding
criterion. For example, if someone has between $100,000 and $200,000 in average annual funding, you
may consider that it is very likely that he or she would be a good PhD advisor with
respect to that criterion.
4.29. Rapidly prototype an agent that can assess the following hypothesis and others with
a similar pattern: “John Doe would be a good PhD advisor with respect to the
publications with advisor criterion.” Hint: You may consider that a certain number
of publications of PhD students with the advisor corresponds to a certain probabil-
ity for the publications with advisor criterion.
4.30. Rapidly prototype an agent that can assess the following hypothesis and others with
a similar pattern: “John Doe would be a good PhD advisor with respect to the
research group status criterion.”
5 Ontologies
An ontology is an explicit formal specification of the terms that are used to represent
an agent’s world (Gruber, 1993).
In an ontology, definitions associate names of entities in the agent’s world (e.g.,
classes of objects, individual objects, relations, hypotheses, problems) with human-
readable text and formal axioms. The text describes what a name means. The axioms
constrain the interpretation and use of a term. Examples of terms from the ontology of
the PhD advisor assessment agent include student, PhD student, professor, course, and
publication. The PhD advisor assessment agent is a Disciple agent that helps a PhD
student in selecting a PhD advisor based on a detailed analysis of several factors,
including professional reputation, learning experience of an advisor’s students, respon-
siveness to students, support offered to students, and quality of the results of previous
students (see Section 3.3). This agent will be used to illustrate the various ontology issues
discussed in this chapter.
The ontology is a hierarchical representation of the objects from the application
domain. It includes both descriptions of the different types of objects (called concepts or
classes, such as professor or course) and descriptions of individual objects (called instances
or individuals, such as CS580), together with the properties of each object and the
relationships between objects.
The underlying idea of the ontological representation is to represent knowledge in the
form of a graph (similar to a concept map) in which the nodes represent objects, situations,
or events, and the arcs represent the relationships between them, as illustrated in Figure 5.1.
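A minimal sketch of this graph view, using the fragment from Figure 5.1 as plain (object, relation, value) triples; the Python representation is ours, for illustration, and is not Disciple-EBR's internal format.

    # Nodes are objects (concepts and instances); arcs are the relationships between them.
    ontology = [
        ("associate professor", "subconcept of", "professor"),
        ("journal paper",        "subconcept of", "article"),
        ("John Doe",             "instance of",   "associate professor"),
        ("Doe 2000",             "instance of",   "journal paper"),
        ("Mason-CS480",          "instance of",   "university course"),
        ("Doe 2000",             "has as author",  "John Doe"),
        ("Mason-CS480",          "has as reading", "Doe 2000"),
    ]

    def values(node, relation):
        """Follow an arc from a node, e.g., values('Doe 2000', 'has as author')."""
        return [v for (n, r, v) in ontology if n == node and r == relation]

    print(values("Doe 2000", "has as author"))      # ['John Doe']
    print(values("Mason-CS480", "has as reading"))  # ['Doe 2000']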
The ontology plays a crucial role in cognitive assistants, being at the basis of knowledge
representation, user–agent communication, problem solving, knowledge acquisition, and
learning.
First, the ontology provides the basic representational constituents for all the elements
of the knowledge base, such as the hypotheses, the hypothesis reduction rules, and the
solution synthesis rules. It also allows the representation of partially learned knowledge,
based on the plausible version space concept (Tecuci, 1998), as discussed in Section 7.6.
Second, the agent’s ontology enables the agent to communicate with the user and with
other agents by declaring the terms that the agent understands. Consequently, the ontol-
ogy enables knowledge sharing and reuse among agents that share a common vocabulary
that they understand. An agreement among several agents to use a shared vocabulary in a
coherent and consistent manner is called ontological commitment.
Third, the problem-solving methods or rules of the agent are applied by matching them
against the current state of the agent’s world, which is represented in the ontology. The
use of partially learned knowledge (with plausible version spaces) in reasoning allows
assessing hypotheses (or solving problems) with different degrees of confidence.
Fourth, a main focus of knowledge acquisition is the elicitation of the domain concepts
and of their hierarchical organization, as will be discussed in Section 6.3.
And fifth, the ontology represents the generalization hierarchy for learning, in which
specific problem-solving episodes are generalized into rules by replacing instances with
concepts from the ontology.
[Figure 5.1 (not reproduced): a fragment of an ontology represented as a graph, with concepts such as object, course, university course, professor, associate professor, publication, article, and journal paper, instances such as Mason-CS480, U Montreal-CS780, Doe 2000, and John Doe, and features such as has as reading and has as author.]
5.3 GENERALIZATION HIERARCHIES
Other names used for expressing the subconcept of relation are subclass of, type, and isa.
A concept Q is a direct subconcept of a concept P if and only if Q is a subconcept of P and
there is no other concept R such that Q is a subconcept of R and R is a subconcept of P.
One may represent the generality relations between the concepts in the form of a
partially ordered graph that is usually called a generalization hierarchy (see Figure 5.4).
The leaves of the hierarchy in Figure 5.4 are instances of the concepts that are represented
by the upper-level nodes. Notice that (the instance) John Doe is both an associate professor
and a PhD advisor. Similarly, a concept may be a direct subconcept of several concepts.
The objects in an application domain may be described in terms of their properties and
their relationships with each other. For example, Figure 5.5 represents Mark White as an
associate professor employed by George Mason University. In general, the value of a feature
may be a number, a string, an instance, a symbolic probability, an interval, or a concept
(see Section 5.9).
A feature is itself characterized by several features that have to be specified when defining
a new feature. They include its domain, range, superfeatures, subfeatures, and
documentation.
The domain of a feature is the concept that represents the set of objects that could have
that feature. The range is the set of possible values of the feature.
[Figure 5.4 (not reproduced): a generalization hierarchy rooted at person, with subconcepts such as faculty member, staff member, graduate student, undergraduate student, instructor, professor, PhD advisor, and BS student, and with instances such as John Smith, John Doe, Jane Austin, Joan Dean, and Bob Sharp as its leaves.]
[Figure 5.5 (not reproduced): Mark White, an instance of associate professor, with the feature has as employer pointing to George Mason University.]
For example, Figure 5.6 shows the representation of the has as employer feature. Its
domain is person, which means that only entities who are persons may have an employer.
Its range is organization, meaning that any value of such a feature should be an
organization.
There are several types of ranges that could be defined with Disciple-EBR: Concept,
Number, Symbolic interval, Text, and Any element.
We have already illustrated a range of type “Concept” (see Figure 5.6). A range of type
“Number” could be either a set or an interval of numbers, and the numbers could be
either integer or real. A range of type “Symbolic interval” is an ordered set of symbolic
intervals. A range of type “Text” could be any string, a set of strings, or a natural language
text. Finally, a range of type “Any element” could be any of the aforementioned entities.
As will be discussed in more detail in Chapter 7, the knowledge elements from the
agent’s knowledge base, including features, may be partially learned. Figure 5.7 shows an
example of the partially learned feature has as employer. The exact domain is not yet
known, but its upper and lower bounds have been learned as person and professor,
respectively. This means that the domain is a concept that is less general than or as
general as person. Similarly, the domain is more general than or as general as professor.
Figure 5.6. The representation of a feature.
[Figure 5.7 (not reproduced): the partially learned feature has as employer, whose domain has the plausible upper bound person and the plausible lower bound professor.]
Through further learning, the agent will learn that the actual domain is person, as
indicated in Figure 5.6.
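A sketch of how such a feature definition, including a partially learned domain, might be represented; the class and field names below are illustrative assumptions, not Disciple-EBR data structures.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Feature:
        name: str
        documentation: str
        range: str                                   # the set of possible values (here, a concept name)
        domain: Optional[str] = None                 # the exact domain, once fully learned
        plausible_upper_bound: Optional[str] = None  # the domain is less general than or as general as this concept
        plausible_lower_bound: Optional[str] = None  # the domain is more general than or as general as this concept

    # Partially learned feature (Figure 5.7): the exact domain is not yet known.
    has_as_employer = Feature(
        name="has as employer",
        documentation="indicates the employer of a person",
        range="organization",
        plausible_upper_bound="person",
        plausible_lower_bound="professor",
    )

    # After further learning (Figure 5.6), the actual domain becomes known.
    has_as_employer.domain = "person"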
Features are also organized in a generalization hierarchy, as illustrated in Figure 5.8.
Let us suppose that we want to represent the following information in the ontology: “John
Doe has written Windows of Opportunities.” This can be easily represented by using a
binary feature:
John Doe has as writing Windows of Opportunities
But let us now suppose that we want to represent “John Doe has written Windows of
Opportunities from 2005 until 2007.” This information can no longer be represented with a
binary feature, such as has as writing, that can link only two entities in the ontology. We
need to represent this information as an instance (e.g., Writing 1) of a concept (e.g.,
writing), because an instance may have any number of features, as illustrated in Figure 5.9.
Disciple-EBR can, in fact, represent n-ary features, and it generates them during
learning, but the ontology tools can display only binary features, and the user can define
only binary features.
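A sketch of this reification as illustrative triples; the instance Writing1 and the feature names follow Figure 5.9, while the triple representation itself is our simplification.

    # "John Doe has written Windows of Opportunities from 2005 until 2007" cannot be a single
    # binary feature, so an intermediate instance Writing1 is introduced to carry the extra arguments.
    writing1 = [
        ("Writing1", "instance of",       "writing"),
        ("Writing1", "has as title",      "Windows of Opportunities"),
        ("Writing1", "has as author",     "John Doe"),
        ("Writing1", "has as start time", 2005),
        ("Writing1", "has as end time",   2007),
    ]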
Figure 5.8. A generalization hierarchy of features from the COG domain (see Section 12.4).
Figure 5.9. Representation of an n-ary relation as binary features.
[Figure 5.10 (not reproduced): a hierarchy fragment in which MS student is a subconcept of graduate student, which is a subconcept of student, together with instances of MS student.]
5.7 TRANSITIVITY
As one can see, subconcept of is a transitive relation and, in combination with instance of,
allows inferring new instance of relations. Let us consider, for example, the hierarchy
fragment from the middle of Figure 5.10. By applying these properties of instance of and
subconcept of, one may infer, for instance, that an instance of MS student is also an
instance of graduate student and an instance of student.
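A minimal sketch of this inference over the Figure 5.10 fragment; the helper functions, the single-parent simplification, and the instance Joe are our own illustrative assumptions.

    subconcept_of = {
        "MS student": "graduate student",
        "graduate student": "student",
    }
    instance_of = {"Joe": "MS student"}   # an assumed instance, for illustration

    def all_superconcepts(concept):
        """Transitive closure of the subconcept of relation (single parent assumed)."""
        result = []
        while concept in subconcept_of:
            concept = subconcept_of[concept]
            result.append(concept)
        return result

    def all_concepts_of(instance):
        """instance of combined with the transitivity of subconcept of."""
        direct = instance_of[instance]
        return [direct] + all_superconcepts(direct)

    print(all_concepts_of("Joe"))   # ['MS student', 'graduate student', 'student']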
5.8 INHERITANCE
The features associated with a concept are inherited by its subconcepts and instances. For
example, in the case of the ontology in Figure 5.11, where the retirement age associated with
faculty member is 66 and the one associated with full professor is 70, one can infer that the
retirement age of an instance of associate professor is 66, while that of an instance of
full professor is 70.
The inheritance of properties is one of the most important strengths of an ontology, allowing a
compact and economical representation of knowledge. Indeed, if all the instances of a concept
C have the property P with the same value V, then it is enough to associate the property P with
the concept C because it will be inherited by each of the concept’s instances. There are,
however, two special cases of inheritance to which one should pay special attention: default
inheritance and multiple inheritance. They are discussed in the following subsections.
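A minimal sketch of default inheritance over the Figure 5.11 fragment; the lookup procedure, the single-parent simplification, and the placement of John Smith and Jane Austin under specific concepts are our own assumptions.

    parent = {
        "full professor": "faculty member",
        "associate professor": "faculty member",
        "John Smith": "associate professor",   # assumed placement, for illustration
        "Jane Austin": "full professor",       # assumed placement, for illustration
    }
    # The more specific value (full professor: 70) overrides the default (faculty member: 66).
    retirement_age = {"faculty member": 66, "full professor": 70}

    def inherited(entity, feature_values):
        """Return the value defined on the entity or on its closest ancestor (default inheritance)."""
        while entity is not None:
            if entity in feature_values:
                return feature_values[entity]
            entity = parent.get(entity)
        return None

    print(inherited("John Smith", retirement_age))   # 66, inherited from faculty member
    print(inherited("Jane Austin", retirement_age))  # 70, inherited from full professor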
[Figure 5.11 (not reproduced): an ontology fragment in which faculty member has retirement age 66, while its subconcept full professor has retirement age 70, with specific professors represented as instances of these concepts.]
[Figure 5.12 (not reproduced): a concept whose instances have feature-1 with another concept as value (top), and the corresponding feature-1 links between their instances (bottom).]
As noted in Section 5.3, John Doe is both an associate professor and a PhD advisor. Therefore, John Doe will inherit features from both of these concepts, and there is a
potential for inheriting conflicting values. In such a case, the agent should use some
strategy in selecting one of the values. A better solution, however, is to detect such
conflicts when the ontology is built or updated, and to associate the correct feature value
directly with each element that would otherwise inherit conflicting values.
5.9 CONCEPTS AS FEATURE VALUES
The previous section has discussed how the features of the concepts are inherited.
However, in all the examples given, the value of the feature was a number. The
same procedure will work if the value is an instance or a string. But what happens if
the value is a concept, which has itself a set of instances, as shown in the top part of
Figure 5.12?
13:52:21,
.006
164 Chapter 5. Ontologies
In this case, each instance of concept-1 inherits feature-1, the value of which is concept-
2, which is the set of all the instances of concept-2, as shown in the bottom part of
Figure 5.12.
One has to exercise care when defining features between concepts. For example, the
correct way to express the fact that a parent has a child is to define the following feature:
has as child
domain parent
range child
On the contrary, the expression, “parent has as child child” means that each parent is the
parent of each child.
5.10 ONTOLOGY MATCHING
Ontology matching allows one to ask questions about the objects in the ontology, such as:
“Is there a course that has as reading a publication by John Doe?”
We first need to express the question as a network fragment with variables, as illus-
trated in the top part of Figure 5.13. The variables represent the entities we are looking for.
We then need to match the network fragment with the ontology to find the values of the
variables, which represent the answer to our question.
For example, John Doe in the pattern is matched with John Doe in the ontology, as
shown in the right-hand side of Figure 5.13. Then, following the has as author feature (in
reverse), ?O2 is successfully matched with Doe 2000 because each of them is a publication.
Finally, following the has as reading feature (also in reverse), ?O1 is successfully matched
with Mason-CS480 and with U Montreal-CS780, because each of them is an instance of a
course. Therefore, one obtains two answers to the question:
Yes, Mason-CS480 has as reading Doe 2000, and Doe 2000 has as author John Doe.
Yes, U Montreal-CS780 has as reading Doe 2000, and Doe 2000 has as author John Doe.
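A sketch of this matching over illustrative triples; the code is a simplification of what an ontology-based agent would actually do (for instance, the journal paper, article, publication path of Figure 5.13 is collapsed into a single link).

    ontology = [
        ("Mason-CS480",      "instance of",    "university course"),
        ("U Montreal-CS780", "instance of",    "university course"),
        ("university course","subconcept of",  "course"),
        ("Doe 2000",         "instance of",    "journal paper"),
        ("journal paper",    "subconcept of",  "publication"),   # collapsed: journal paper -> article -> publication
        ("Mason-CS480",      "has as reading", "Doe 2000"),
        ("U Montreal-CS780", "has as reading", "Doe 2000"),
        ("Doe 2000",         "has as author",  "John Doe"),
    ]

    def is_a(x, concept):
        """True if x is an instance of concept, directly or through subconcept of links."""
        for (a, r, b) in ontology:
            if a == x and r in ("instance of", "subconcept of"):
                if b == concept or is_a(b, concept):
                    return True
        return False

    # Question: "Is there a course ?O1 that has as reading a publication ?O2 authored by John Doe?"
    answers = [(o1, o2)
               for (o1, r1, o2) in ontology if r1 == "has as reading"
               for (a, r2, b) in ontology if a == o2 and r2 == "has as author" and b == "John Doe"
               if is_a(o1, "course") and is_a(o2, "publication")]

    print(answers)   # [('Mason-CS480', 'Doe 2000'), ('U Montreal-CS780', 'Doe 2000')]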
[Figure 5.13 (not reproduced): the question expressed as a network fragment with the variables ?O1 and ?O2, and its matching against the ontology fragment containing Mason-CS480, U Montreal-CS780, Doe 2000, and John Doe.]
One important aspect to notice is that the structure of the ontology is also a guide in
searching it. This significantly speeds up the matching process as compared, for example,
to a representation of the same information as a set of predicates.
5.11 HANDS ON: BROWSING AN ONTOLOGY
The objective of this case study is to learn how to use the various ontology browsers of
Disciple-EBR: the Hierarchical Browser, the Feature Browser and the Feature Viewer, the
Association Browser, the Object Browser, and the Object Viewer. These tools are very
similar to the ontology browsers from many other knowledge engineering tools, such as
Protégé (2015) and TopBraid Composer (2012).
Figure 5.14 shows the interface and the main functions of the Hierarchical Browser of
Disciple-EBR, an ontology tool that may be used to browse a hierarchy of concepts and
instances. The hierarchy in Figure 5.14 is rotated, with the most general concept (object)
on the left-hand side and its subconcepts on its right-hand side. The hierarchy can be
rotated by clicking on the Rotate View button. Clicking on the Expand View button leads
to showing additional levels of the hierarchy, while clicking on the Reduce View button
leads to showing fewer levels.
Figure 5.15 shows the interface and the main functions of the Association Browser of
Disciple-EBR, which may be used to browse the objects and their features. This browser is
centered on a given object (e.g., John Doe), showing its features (e.g., John Doe has as
employer George Mason University), the features for which it is a value (e.g., Adam Pearce
has as PhD advisor John Doe), its direct concepts (e.g., PhD advisor and associate professor)
and, in the case of a concept, its direct subconcepts or its direct instances. Double-clicking
on any entity in the interface will center the Association Browser on that entity.
One may also browse the objects and their features by using the Object Browser and the
Object Viewer, as illustrated in Figure 5.16 and described in Operation 5.1.
Figure 5.14. The interface and the main functions of the Hierarchical Browser.
Figure 5.16. The interface of the Object Browser (left) and the Object Viewer (right).
You can browse the feature generalization hierarchy by using the Feature Browser, which
is illustrated in the left-hand side of Figure 5.17. To view the definition of a specific feature,
you may follow the steps in Operation 5.2.
Figure 5.17. The Feature Browser (left) and the Feature Viewer (right).
5.12 PROJECT ASSIGNMENT 4
Extend the preliminary version of the agent that you will develop as part of your project by
analyzing one leaf hypothesis based on several items of evidence, as discussed in Section
4.4 and practiced in the case study from Section 4.5.
5.13 REVIEW QUESTIONS
5.5. What does it mean for a concept P to be more general than a concept Q?
5.6. What are the possible relationships between two concepts A and B, from a gener-
alization point of view? Provide examples of concepts A and B in each of the
possible relationships.
5.7. How could one prove that a concept A is more general than a concept B? Is the
proposed procedure likely to be practical?
5.8. How can one prove that a concept A is not more general than a concept B? Is the
proposed procedure likely to be practical?
5.9. Consider the feature hierarchy from Figure 5.18. Indicate the necessary relationship
between: (a) Domain B and Domain 1; (b) Range B and Range 1; (c) Domain A2
and Domain 1; (d) Domain A and Domain B; (e) Domain 1 and Range 1.
5.10. Consider the knowledge represented in Figure 5.11 (p. 163). What is the retirement
age of John Smith? What is the retirement age of Jane Austin?
5.11. Insert the additional knowledge that platypus lays eggs into the object ontology
from Figure 5.19. Explain the result.
5.14. Represent the following information as an ontology fragment: “Bob Sharp enrolled
at George Mason University in fall 2014.”
5.15. Consider the generalization hierarchy from Figure 5.4 (p. 158). Consider the
following information: In general, the retirement age of a faculty member is 66,
but a full professor may retire at 70, although Jane Austin opted to retire at 66. How
could you represent this information?
[Figure 5.18 (not reproduced): a generalization hierarchy of features, including feature 1 (domain Domain 1, range Range 1), feature A (domain Domain A, range Range A), and feature B (domain Domain B, range Range B).]
[Figure 5.19 (not reproduced): an object ontology fragment including cow and platypus, used in Question 5.11.]
5.16. How can we deal with the inheritance of contradictory properties? Provide an
example.
5.17. Define the main features of a feature and illustrate each of them with an example.
5.20. What is the meaning of the ontology fragment from Figure 5.22?
5.21. How could one represent the fact that a bird has a nest?
5.22. Explain how the following questions are answered based on the ontology fragment
from Figure 5.23, specifying the types of inference used in each case, and providing
the corresponding answers:
What is the color of membrane?
What does contact adhesive1 glue?
Which are the loudspeaker components made of metal?
5.23. Consider the background knowledge consisting of the object hierarchy from
Figure 5.23.
(a) Which are all the answers to the following question: “Is there a part of a
loudspeaker that is made of metal?”
(b) Which are the reasoning operations that need to be performed in order to
answer this question?
(c) Consider one of the answers that require all these operations and show how
the answer is found.
[Figures 5.20 through 5.22 (not reproduced): small ontology fragments used in Questions 5.20 and 5.21, including bird, robin, Clyde, nest, and nest1 with the feature owns, and mother and child with the feature has as child.]
Figure 5.24. Ontology fragment from the loudspeaker domain (Tecuci, 1998). Dotted links indicate instance of relationships while unnamed continuous links indicate subconcept of relationships.
[Among the recoverable elements of Figures 5.23 and 5.24: concepts such as adhesive, toxic substance, material, fragile object, and loudspeaker component, and instances such as mowicoll, contact adhesive1, chassis membrane assembly1, mechanical chassis1, surplus adhesive1, air press1, air sucker1, acetone1, alcohol1, dust1, entrefer1, and surplus paint1.]
5.24. Consider the ontology fragment from Figure 5.24. Notice that each of the most
specific concepts, such as dust or air press, has an instance, such as dust1 and air
press1, respectively.
(a) Represent the question “Is there a cleaner X that removes dust?” as a network
fragment.
(b) Find all the possible answers to this question based on the information in the
ontology fragment.
(c) In order to answer this question, the agent would need to use several
reasoning operations. Which are these operations?
5.25. Consider the following description in the context of the ontology fragment from
Figure 5.24:
?z is cleaner
removes surplus-paint1
Find all the possible values of the variable ?z.
5.26. Consider the following action description in the context of the ontology fragment
from Figure 5.24:
clean object ?x
of ?y
with ?z
condition
?x is entrefer
may have ?y
?y is object
?z is cleaner
removes ?y
Find all the possible values for the variables ?x, ?y, and ?z. Indicate some of the
corresponding actions.
6 Ontology Design and Development
Ontology design is a creative process whose first step is determining the scope of the
ontology by specifying its main concepts, features, and instances. One approach is to elicit
them from a subject matter expert or some other sources, as will be discussed in Section 6.3.
Another approach is to extract a specification of the ontology from the reasoning trees
developed as part of the rapid prototyping of the agent. During this phase, the subject
matter expert and the knowledge engineer define a set of typical hypotheses (or problems)
that the envisioned agent should be able to assess (or solve). Then they actually assess
these hypotheses the way they would like Disciple-EBR to assess them. This process
identifies very clearly what concepts and features should be present in the ontology to
enable the agent to assess those types of hypotheses. This modeling-based ontology
specification strategy will be discussed in Section 6.4. Once a specification of the ontology
has been developed, one has to complete its design.
Because ontology design and development is a complex process, it makes sense to
import relevant concepts and features from previously developed ontologies (including
those from the Semantic Web) rather than defining them from scratch. In particular, one
may wish to look for general-purpose ontologies, such as an ontology of time, space, or
units of measures, if they are necessary to the agent under development. Significant
foundational and utility ontologies have been developed and can be reused (Obrst et al.,
2012), as discussed in Section 3.2.2.
The actual development of the ontology is performed by using ontology tools such as
Protégé (Noy and McGuinness, 2001) or those that will be presented in this section. As will
be discussed next, ontology development is an iterative process during which additional
concepts, features, and instances are added while teaching the agent to assess hypotheses
(or solve problems).
An important aspect to emphasize is that the ontology will always be incomplete.
Moreover, one should not attempt to represent all of the agent’s knowledge in the
ontology. On the contrary, the ontology is intended to represent only the terms of the
representation language that are used in the definitions of hypotheses and rules. The more
complex knowledge will be represented as rules.
Table 6.1 presents the ontology development steps, which are also illustrated in Figure 6.1.
Table 6.1 Ontology Development Steps
1. Define basic concepts (types of objects) and their organization into a hierarchical structure (the
generalization hierarchy).
2. Define object features by using the previously defined concepts to specify their domains and
ranges.
3. Define instances (specific objects) by using the previously defined concepts and features.
4. Extend the ontology with new concepts, features, and instances.
5. Repeat the preceding steps until the ontology is judged to be complete enough.
[Figure 6.1 (not reproduced): illustration of the ontology development steps — a generalization hierarchy including person, organization, university employee, student, educational organization, staff member, faculty member, graduate student, undergraduate student, associate professor, and university, and the feature has as employer defined with domain person and range organization.]
First one needs to define basic concepts and to organize them into a hierarchical
structure. This may be performed by using the Object Browser of Disciple-EBR, as will
be discussed in Section 6.5.
Once a set of basic concepts have been defined, one can define features that use these
concepts as their domains and ranges. For example, one may define the feature has as
employer with the domain person and range organization, which are previously defined
concepts. The features are defined and organized in a hierarchy by using the Feature
Browser and the Feature Editor, as discussed in Section 6.7.
With some of the concepts and features defined, one can define instances of these
concepts and associate features with them, as discussed in Section 6.8. For example, one
can define Mark White as an instance of associate professor and specify its feature has as
employer with the value George Mason University.
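A sketch of these steps on the running example; the data structures below are illustrative only and are not the Disciple-EBR ontology format.

    # Step 1: basic concepts organized into a generalization hierarchy.
    subconcept_of = {
        "university employee": "person",
        "faculty member": "university employee",
        "associate professor": "faculty member",
        "educational organization": "organization",
        "university": "educational organization",
    }

    # Step 2: features defined using the previously defined concepts as domains and ranges.
    features = {
        "has as employer": {"domain": "person", "range": "organization"},
    }

    # Step 3: instances defined using the previously defined concepts and features.
    instance_of = {"Mark White": "associate professor",
                   "George Mason University": "university"}
    facts = [("Mark White", "has as employer", "George Mason University")]

    # Steps 4 and 5: the ontology is then iteratively extended with new concepts,
    # features, and instances until it is judged complete enough.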
Ontology development is an iterative process, as indicated by the last step in Table 6.1.
In the case of Disciple-EBR, one does not develop an ontology from scratch. Rather,
one extends the shared ontology for evidence-based reasoning. Moreover, as part of the
rapid prototyping phase, the user has defined the specific instances and the generic
instances used in the sample reasoning trees. All these entities are represented as
instances of the “user instance” concept. Thus, as part of ontology development, one
needs to move all these instances under their proper concepts, as illustrated in
Figure 6.2.
6.3 DOMAIN UNDERSTANDING AND CONCEPT ELICITATION
Concept elicitation consists of determining which concepts apply in the domain, what they
mean, what their relative place in the domain is, what differentiating criteria distinguish
similar concepts, and what organizational structure gives these concepts coherence for the
expert (Gammack, 1987).
What are some natural ways of eliciting the basic concepts of a domain? Table 6.2 lists
the most common concept elicitation methods. The methods are briefly described in the
following subsections.
Figure 6.2. Moving the instances of “user instance” under their corresponding concepts.
Table 6.2 Concept Elicitation Methods
  Preliminary methods
    Tutorial session delivered by expert
    Ad-hoc list created by expert
    Book index
  Interviews with expert
    Unstructured interview
    Structured interview
      Multiple-choice questions
      Dichotomous questions
      Ranking scale questions
  Protocol analysis
  Concept hierarchy elicitation
Table 6.3 Multiple-Choice Question for the Diabetic Foot Advisor (Awad, 1996)
If a diabetic patient complains of foot problems, who should he or she see first (check one):
□ Podiatrist
□ General practitioner
□ Orthopedic surgeon
□ Physical therapist
[Table 6.4, a dichotomous (yes/no) question for the Diabetic Foot Advisor (Awad, 1996), is not reproduced. The following is an example of a ranking scale question:]
Please rank the following professional reputation criteria for a PhD advisor in the order of their
importance from your point of view. Give a rank of 1 to the most important criterion, a rank of 2 to
the second most important one, and so on:
___ research funding criterion
___ publications criterion
___ citations criterion
___ peer opinion criterion
What are the main characteristics of the structured interview method? It offers specific
choices, enables faster tabulation, and has less bias due to the way the questions are
formulated. However, this method is restricted by the requirement to specify choices.
Unstructured interviews, by contrast, generate much knowledge cheaply and naturally, and do not require a significant effort on
the part of the expert. However, they have an incomplete and arbitrary coverage, and the
knowledge engineer needs appropriate training and/or social skills.
Figure 6.3. Concept hierarchy elicited through the card-sort method (Gammack, 1987).
Figure 6.3. Concept hierarchy elicited through the card-sort method (Gammack, 1987).
Modeling-based ontology specification has already been introduced in Section 3.3. The knowledge engineer and the subject matter expert analyze each step of the reasoning trees developed as part of the rapid prototyping phase of agent development, in order to identify the concepts and the features that should be in the ontology to enable the agent to perform that reasoning.
Let us consider the reasoning step from the left-hand side of Figure 6.4. To enable the
agent to answer the question from this step, we may define the ontology fragment from the
right-hand side of Figure 6.4.
However, this reasoning step is just an example. We want the agent to be able to answer
similar questions, corresponding to similar hypotheses. Therefore, the ontology fragment
from the right-hand side of Figure 6.4 should be interpreted only as a specification of the
ontological knowledge needed by the agent. This specification suggests that we should
define various university positions, as well as various employers in the ontology, as shown
in Figure 6.5. It also suggests defining two features, has as position and has as employer, as
illustrated in Figure 6.6.
The hierarchy of concepts and instances can be developed by using the Object Browser,
which was introduced in Section 5.11. Its interface and main functions are shown in
Figure 6.7. This tool shows the hierarchy in a tree structure, which the user can expand
or collapse by selecting a node (e.g., educational organization) and then clicking on Expand
All and Collapse All, respectively. Similar effects may be obtained by clicking on the – and
+ nodes. Selecting a node and then clicking on the Hierarchical button will open the
Hierarchical Browser with that node as the top of the displayed hierarchy.
The Object Browser can be used to develop a generalization hierarchy by defining
concepts and instances, as described in Operation 6.1 and illustrated in Figure 6.8.
If an instance is to be used only in the current scenario, then it should be defined in the
Scenario part of the knowledge base, as described in Operation 6.2. The system will create
it as a specific instance.
If an instance needs to be used in more than one scenario, then you have to define it as a
generic instance in the Domain part of the knowledge base, as described in Operation 6.3
and illustrated in Figure 6.9. Such an instance is visible both in the Domain KB and in all
its Scenario KBs and is displayed in regular font, as shown in Figure 6.10.
In the case of the PhD advisor assessment agent, a number of instances have been defined in the Domain part of the knowledge base and are therefore generic instances (see Figure 6.10).
The Scenario part still contains specific instances, such as John Doe, Bob Sharp, and George
Mason University. A specific instance is visible only in the corresponding Scenario KB and is
displayed in italics, as shown in Figure 6.10.
Figure 6.10 legend: concepts are shown in light blue straight font, specific instances in dark blue italics, and generic instances in dark blue straight font.
Notice that the specific instances are displayed with italic font only in the ontology
interfaces. In all the other interfaces, they are displayed in the regular font.
In addition to defining concepts and instances, one should also be able to rename or
delete them. These operations are performed as explained in Operations 6.4 and 6.5,
respectively. Deletion is a particularly complex operation. Disciple-EBR prevents the
deletion of an entity if this would lead to an inconsistent knowledge base where some of
the knowledge base elements refer to the element to be deleted.
Artificial Intelligence Magazine is represented as an instance of a journal. But one could have
also represented it as a concept, the instances of which would have been specific issues of
the Artificial Intelligence Magazine. Whether something is represented as an instance or as a
concept influences how Disciple-EBR learns. In particular, when learning a reduction rule
from a reduction example, instances are generalized while the concepts are preserved as
such. Therefore, in many cases, learning considerations determine whether an entity is
represented as an instance or as a concept.
The hierarchy of features can be developed by using the Feature Browser. Its interface and
main functions are shown in Figure 6.16. Like the Object Browser, this tool shows the
feature hierarchy in a tree structure that the user can expand or collapse by selecting a
node (e.g., has as part) and then by clicking on Expand All and Collapse All, respectively.
Similar effects may be obtained by clicking on the – and + nodes. Selecting a node and
then clicking on the Hierarchical button opens the Hierarchical Browser with that node at
the top of the displayed hierarchy.
The features are defined and organized in a hierarchy by using the Feature Browser,
similarly to how the Object Browser is used to develop a concept hierarchy. The steps
needed to define a new feature (e.g., works for) are those described in Operation 6.6.
When a user defines a subfeature of a given feature, the domain and the range of the
subfeature are set to be the ones of the superfeature. The user can change them by clicking
on the Modify button of the Feature Browser. This invokes the Feature Editor, which is
illustrated in Figure 6.17. Using the Feature Editor, the user can add or delete super-
features or subfeatures of the selected feature, in addition to modifying its domain
and range.
Figure 6.18 illustrates the process of changing the domain of the works for feature from
object to actor. This process consists of the steps described in Operation 6.7.
A range of type “Concept” is modified in the same way as a domain (see Operation 6.8).
To modify a range of type “Concept,” browse and select a concept for the new range (e.g., actor), click on the Add to range button in the Object Browser pane, and then click on the Apply button in the range tab to commit the addition in the ontology.
For range types other than “Concept,” a type-specific editor is invoked after clicking on the
Add button. For example, Figure 6.19 illustrates the definition of a range, which is the
integer interval [0, 10].
Start Disciple-EBR, select the knowledge base “09-Ontology-Development-Features/
Scen,” and use the Feature Browser and the Feature Editor to represent the following
related features in a hierarchy:
works for, a subfeature of feature, with domain actor and range actor
is employed by, a subfeature of works for, with domain employee and range organization
contracts to, a subfeature of works for, with domain actor (inherited from works for) and
range organization
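The following sketch (illustrative only, not the tool's internal representation) shows one way to encode such a feature hierarchy so that a subfeature starts with the domain and range of its superfeature and may override either of them, as in the contracts to example above.

# Illustrative sketch of the exercise's feature hierarchy.
features = {"feature": {"parent": None, "domain": "object", "range": "object"}}

def define_feature(name, parent, domain=None, range_=None):
    """Define a subfeature, inheriting domain/range from the parent unless overridden."""
    p = features[parent]
    features[name] = {
        "parent": parent,
        "domain": domain if domain is not None else p["domain"],
        "range": range_ if range_ is not None else p["range"],
    }

define_feature("works for", "feature", domain="actor", range_="actor")
define_feature("is employed by", "works for", domain="employee", range_="organization")
define_feature("contracts to", "works for", range_="organization")  # domain actor inherited

print(features["contracts to"]["domain"])   # -> actor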
With some concepts and features defined, one may use the Object Editor to define
instances of these concepts (as discussed in Section 6.5) and associate features with them.
Figure 6.20 illustrates the process of defining the is interested in feature of John Doe. The
steps of this process are those described in Operation 6.9.
The Object Editor can also be used to update the list of the direct superconcepts of
an object (instance or concept) and the list of the direct subconcepts or instances of a
concept. The actual steps to perform are presented in Operations 6.10 and 6.11. As with all
the ontology operations, Disciple-EBR will not perform them if they would lead to an
inconsistent ontology or a cycle along the subconcept of relation.
But what if we want to represent the additional knowledge that he has written it from
2005 until 2007? We can no longer associate this information with the has as writing
feature. A solution is to define the concept writing and its instance Writing1, instead of
the feature has as writing, as was illustrated in Figure 5.9 (p. 161).
Two common practices are to add the “has as” prefix or the “of” suffix to the feature name,
resulting in has as author or author of, respectively.
Use the reasoning trees developed as part of the rapid prototyping of your agent and
employ the modeling-based ontology specification method to extend the ontology of
your agent.
6.1. What are the basic concept elicitation methods? What are their main strengths?
What are their main weaknesses?
6.2. Briefly describe the card-sort method. What are its main strengths? What are its
main weaknesses? How could one modify this method to build a tangled hierarchy?
6.4. Describe and illustrate the decision of representing an entity as instance or concept.
6.5. Describe and illustrate the “Concept or Feature?” knowledge engineering guide-
lines presented in Section 6.9.1.
6.6. Describe and illustrate the decision of representing an entity as concept, instance,
or constant.
6.7. Describe and illustrate the “Naming Conventions” guidelines, presented in Section
6.6.4 for concepts, and in Section 6.9.3 for features.
6.8. Consider the question/answer pair from Figure 6.23. Specify the ontology frag-
ments that are suggested by this question/answer pair, including instances, con-
cepts, and features definitions (with appropriate domains and ranges that will
facilitate learning). Do not limit yourself to the concepts that are explicitly referred
to, but define additional ones as well, to enable the agent to assess similar hypoth-
eses in a similar way.
Hint 1: Notice that the answer represents an n-ary relation while in an ontology you
may only represent binary relations.
Hint 2: You need to define a hierarchy of concepts that will include those used in
the domains and ranges of the defined features.
Hint 3: Your solution should reflect the use of knowledge engineering guidelines.
6.9. Consider the reduction step from Figure 6.24. Specify the ontology fragments that
are suggested by this reasoning step. Do not limit yourself to the concepts and
features that are explicitly mentioned, but define additional ones as well, to enable
the agent to assess similar hypotheses in a similar way.
6.10. What instances, concepts, and relationships should be defined in the agent’s
ontology, based on the analysis of the reduction step from Figure 6.25?
6.11. What instances, concepts, and relationships should be defined in the agent’s
ontology, based on the analysis of the reduction step from Figure 6.26?
6.12. What instances, concepts, and relationships should be defined in the agent’s
ontology, based on the analysis of the reduction step from Figure 6.27?
6.13. Consider the design of an agent for assessing whether some actor (e.g.,
Aum Shinrikyo) is developing weapons of mass destruction (e.g., chemical
weapons). We would like our agent to be able to perform the sample reasoning
step from Figure 6.28, where the entities in blue are represented as specific
instances.
(a) What relationships should you define in the agent’s ontology in order to
represent the meaning of the question/answer pair?
(b) Represent this meaning as a network fragment showing the relationships and
the related instances.
(c) Network fragments such as the one at (b) represent a specification of the
needed ontology, guiding you in defining a hierarchy of concepts to which
the identified instances belong, as well as the siblings of these concepts.
Indicate eight such concepts. Also indicate which might be the domain and the range of each of the identified features.
Apple1 is an apple.
The color of Apple1 is red.
Apple2 is an apple.
The color of Apple2 is green.
Apples are fruits.
Hint: You should define concepts, features, and instances.
Puss is a calico.
Herb is a tuna.
Charlie is a tuna.
All tunas are fishes.
All calicos are cats.
All cats like to eat all kinds of fish.
Cats and fishes are animals.
Hint: You should define concepts, features, and instances.
6.18. Explain why maintaining the consistency of the ontology is a complex knowledge
engineering activity.
6.19. One of the principles in the development of a knowledge base with a tool such as
Disciple-EBR is to maintain its consistency because correcting an inconsistent
knowledge base is a very complex problem. Therefore, the tool will not allow the
deletion of a knowledge base element (e.g., an instance, a fact, a concept, or a
feature definition) if that operation would make the knowledge base inconsistent. List
and explain five possible ways in which the deletion of a concept may render the
knowledge base inconsistent.
7 Reasoning with Ontologies and Rules
In Chapter 4, we presented the problem reduction and solution synthesis paradigm. In this
chapter, we will present how a knowledge-based agent can employ this paradigm to solve
problems and assess hypotheses.
Figure 7.1 shows the architecture of the agent, which is similar to that of a production
system (Waterman and Hayes-Roth, 1978). The knowledge base is the long-term memory,
which contains an ontology of concepts and a set of rules expressed with these concepts.
When the user formulates an input problem, the problem reduction and solution synthesis
inference engine applies the learned rules from the knowledge base to develop a problem
reduction and solution synthesis tree, as was discussed in Chapter 4. This tree is developed
in the reasoning area, which plays the role of the short-term memory.
The ontology from the knowledge base describes the types of objects (or concepts) in
the application domain, as well as the relationships between them. Also included are the
instances of these concepts, together with their properties and relationships.
The rules are IF-THEN structures that indicate the conditions under which a general
problem (or hypothesis) can be reduced to simpler problems (hypotheses), or the solu-
tions of the simpler problems (hypotheses) can be combined into the solution of the more
complex problem (hypothesis).
The applicability conditions of these rules are complex concepts that are expressed by
using the basic concepts and relationships from the ontology, as will be discussed in
Section 7.2. The reduction and synthesis rules will be presented in Section 7.3. This section
will also present the overall reduction and synthesis algorithm. Section 7.4 will present the
simplified reduction and synthesis rules used for evidence-based hypotheses analysis, and
Section 7.5 will present the rule and ontology matching process. Finally, Section 7.6 will
present the representation of partially learned knowledge, and Section 7.7 will present the
reasoning with this type of knowledge.
Using the concepts and the features from the ontology, one can define more complex
concepts as logical expressions involving these basic concepts and features. For example,
the concept “PhD student interested in an area of expertise” may be expressed as
shown in [7.1].
Because a concept represents a set of instances, the user can interpret the preceding
concept as representing the set of instances of the tuple (?O1, ?O2), which satisfy the
expression [7.1], that is, the set of tuples where the first element is a student and the
second one is the research area in which the student is interested. For example, Bob
Sharp, a PhD student interested in artificial intelligence, is an instance of the concept [7.1].
Indeed, the following expression is true:
In general, the basic representation unit (BRU) for a more complex concept has the
form of a tuple (?O1, ?O2, . . . , ?On), where each ?Oi has the structure indicated by [7.2],
called a clause.
Concepti is either an object concept from the object ontology (such as PhD student), a
numeric interval (such as [50 , 60]), a set of numbers (such as {1, 3, 5}), a set of strings
(such as {white, red, blue}), or an ordered set of intervals (such as (youth, mature)). ?Oi1 . . .
?Oim are distinct variables from the sequence (?O1, ?O2, . . . , ?On).
A concept may be a conjunctive expression of form [7.3], meaning that any instance of
the concept satisfies BRU and does not satisfy BRU1 and . . . and does not satisfy BRUp.
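As an illustration only, and under the assumption of a much-simplified representation, the structure described above can be sketched as follows; each clause records a variable, its concept, and its feature constraints, and the satisfies function is a hypothetical helper that does not traverse the subconcept hierarchy.

# Illustrative encoding of a complex concept: a basic representation unit (BRU)
# is a list of clauses, one per variable; a concept is a BRU plus negated BRUs.
bru = [
    # (variable, concept, {feature: variable the feature points to})
    ("?O1", "PhD student", {"is interested in": "?O2"}),
    ("?O2", "area of expertise", {}),
]
concept = {"bru": bru, "not": []}   # the [7.3]-style form: BRU and not BRU1 ... not BRUp

def satisfies(bindings, clauses, facts, is_a):
    """Check a BRU against ground facts, for given variable bindings.
    facts: {(instance, feature): value}; is_a: {instance: set of concepts},
    with all concepts of an instance listed explicitly (no hierarchy traversal)."""
    for var, c, feats in clauses:
        inst = bindings[var]
        if c not in is_a.get(inst, set()):
            return False
        for f, other_var in feats.items():
            if facts.get((inst, f)) != bindings[other_var]:
                return False
    return True

# Bob Sharp, a PhD student interested in Artificial Intelligence, is an instance of the concept.
facts = {("Bob Sharp", "is interested in"): "Artificial Intelligence"}
is_a = {"Bob Sharp": {"PhD student"}, "Artificial Intelligence": {"area of expertise"}}
print(satisfies({"?O1": "Bob Sharp", "?O2": "Artificial Intelligence"}, bru, facts, is_a))  # True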
For example, expression [7.5] represents the concept “PhD student interested in an area of
expertise that does not require programming.”
An agent can solve problems through reduction and synthesis by using problem reduction
rules and solution synthesis rules. The rules are IF-THEN structures that indicate the
conditions under which a general problem can be reduced to simpler problems, or the
solutions of the simpler problems can be combined into the solution of the more complex
problem.
The general structure of a problem reduction rule Ri is shown in Figure 7.2. This rule
indicates that solving the problem P can be reduced to solving the simpler problems
Pi1, . . ., Pini, if certain conditions are satisfied. These conditions are expressed in two
equivalent forms: one as a question/answer pair in natural language that is easily under-
stood by the user of the agent, and the other as a formal applicability condition expressed
as a complex concept having the form [7.4], as discussed in the previous section.
Consequently, there are two interpretations of the rule in Figure 7.2:
(1) If the problem to solve is P, and the answer to the question QRi is ARi, then one
can solve P by solving the subproblems Pi1, . . . , Pini.
(2) If the problem to solve is P, and the condition CRi is satisfied, and the conditions
ERi1, . . ., ERiki are not satisfied, then one can solve P by solving the subproblems
Pi1, . . . , Pini.
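A reduction rule of this kind could be sketched as the following data structure; this is a hedged illustration whose field names are assumptions, not the actual rule format of Disciple-EBR.

# Illustrative sketch of a problem reduction rule Ri.
reduction_rule = {
    "if_problem": "P",                          # the problem pattern to be reduced
    "question": "QRi",                          # natural-language question
    "answer": "ARi",                            # natural-language answer
    "main_condition": "CRi",                    # a complex concept of the form [7.4]
    "except_when_conditions": ["ERi1", "ERiki"],
    "then_subproblems": ["Pi1", "Pini"],
}

def rule_is_applicable(main_condition_satisfied, except_when_satisfied):
    # Second interpretation above: CRi holds and no Except-When condition holds.
    return main_condition_satisfied and not any(except_when_satisfied)

print(rule_is_applicable(True, [False, False]))   # True
print(rule_is_applicable(True, [False, True]))    # False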
Figure 7.2 (schematic): Question QRi; Answer ARi; THEN solve Subproblem Pi1, …, Subproblem Pini.
An example of a problem reduction rule is shown in Figure 7.3. It reduces the IF problem
to two simpler problems. This rule has a main condition and no Except-When conditions.
Notice also that the main condition is expressed as the concept (?O1, ?O2, ?O3, ?O4).
As discussed in Section 4.2 and illustrated in Figure 4.6 (p. 116), there are two synthesis
operations associated with a reduction operation: a reduction-level synthesis and a
problem-level synthesis. These operations are performed by employing two solution
synthesis rules that are tightly coupled with the problem reduction rule, as illustrated in
Figure 7.4 and explained in the following paragraphs.
For each problem reduction rule Ri that reduces a problem P to the subproblems Pi1, . . . , Pini (see the left-hand side of Figure 7.4), there is a reduction-level solution synthesis rule (see the upper-right-hand side of Figure 7.4). This reduction-level solution
synthesis rule is an IF-THEN structure that expresses the condition under which the
solutions Si1, . . . , Sini of the subproblems Pi1, . . . , Pini of the problem P can be combined
into the solution Si of P corresponding to the rule Ri.
Let us now consider all the reduction rules Ri, . . . , Rm that reduce problem P to simpler
problems, and the corresponding synthesis rules SRi, . . . , SRm. In a given situation, some
of these rules will produce solutions of P, such as the following ones: Based on Ri, the
solution of P is Si, . . . , and based on Rm, the solution of P is Sm. The synthesis rule SP
corresponding to the problem P combines all these rule-specific solutions of the problem
P (named Si, . . . , Sm) into the solution S of P (see the bottom-right side of Figure 7.4).
The problem-solving algorithm that builds the reduction and synthesis tree is pre-
sented in Table 7.1.
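Although Table 7.1 itself is not reproduced in this extract, the overall process described above can be sketched as the following recursive procedure; this is a schematic under assumed interfaces, not the actual algorithm. Each applicable reduction rule produces a rule-specific solution via its reduction-level synthesis rule SRi, and the problem-level synthesis rule SP combines those solutions.

# Schematic sketch of the reduction and synthesis process (assumed rule API).
def solve(problem, rules, assess, synth_reduction, synth_problem):
    """rules: reduction rules with .matches(problem) and .reduce(problem) (assumed);
    assess: returns the solution of an elementary problem;
    synth_reduction: SRi, combines subproblem solutions for one rule;
    synth_problem: SP, combines the rule-specific solutions of the problem."""
    rule_solutions = []
    for rule in (r for r in rules if r.matches(problem)):
        subproblems = rule.reduce(problem)
        if subproblems:
            subsolutions = [solve(p, rules, assess, synth_reduction, synth_problem)
                            for p in subproblems]
            rule_solutions.append(synth_reduction(rule, subsolutions))   # SRi
        else:
            rule_solutions.append(assess(problem, rule))                 # elementary case
    return synth_problem(problem, rule_solutions)                        # SP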
In Section 4.3, we discussed the specialization of inquiry-driven analysis and synthesis for evidence-based reasoning, where one assesses the probability of a hypothesis based on evidence, such as the hypothesis that John Doe would be a good PhD advisor for Bob Sharp. The assessment of any such hypothesis has the form, “It is <probability> that H,” which can be abstracted to the actual probability, as in the following example, which is abstracted to “very likely”:
It is very likely that John Doe would be a good PhD advisor for Bob Sharp.
As a result, the solution synthesis rules have a simplified form, as shown in Figure 7.5. In
particular, notice that the condition of a synthesis rule is reduced to computing the
probability of the THEN solution based on the probabilities for the IF solutions.
In the current implementation of Disciple-EBR, the function for a reduction-level
synthesis rule SR could be one of the following (as discussed in Section 4.3): min, max,
likely indicator, very likely indicator, almost certain indicator, or on balance. The
function for a problem-level synthesis can be only min or max.
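For illustration, combining such symbolic probabilities with min or max can be sketched as follows; the ordered scale used here is an assumption made for the example rather than the exact scale of the agent.

# Combining symbolic probabilities with min/max over an ordered scale (illustrative).
SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]

def combine(probabilities, function):
    """function is 'min' or 'max'; probabilities are values from SCALE."""
    ranks = [SCALE.index(p) for p in probabilities]
    return SCALE[min(ranks) if function == "min" else max(ranks)]

print(combine(["certain", "very likely", "likely"], "min"))   # -> likely
print(combine(["certain", "very likely", "likely"], "max"))   # -> certain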
Figure 7.6 shows an example of a reduction rule and the corresponding synthesis rules.
The next section will discuss how such rules are actually applied in problem solving.
Consider, for example, the hypothesis “Bob Sharp is interested in an area of expertise of John Doe.” The agent (i.e., its inference engine) will look for all the reduction rules from the knowledge base with an IF hypothesis that matches this hypothesis. Such a rule is the one from the right-hand side of Figure 7.7. As one can see, the IF hypothesis becomes identical with the hypothesis to be solved if ?O1 is replaced with Bob Sharp and ?O2 is replaced with John Doe. The rule is applicable if the condition of the rule is satisfied for these values of ?O1 and ?O2.
Figure 7.5. Reduction and synthesis rules for evidence-based hypotheses analysis.
The partially instantiated rule is shown in the right-hand side of Figure 7.8. The agent
has to check that the partially instantiated condition of the rule can be satisfied. This
condition is satisfied if there is any instance of ?O3 in the object ontology that satisfies all
the relationships specified in the rule’s condition, which is shown also in the left-hand side
of Figure 7.8. ?Sl1 is an output variable that is given the value certain, without being
constrained by the other variables.
The partially instantiated condition of the rule, shown in the left-hand side of Fig-
ures 7.8 and 7.9, is matched successfully with the ontology fragment shown in the right-
hand side of Figure 7.9. The questions are: How is this matching performed, and is it
efficient?
John Doe from the rule’s condition (see the left-hand side of Figure 7.9) is matched with
John Doe from the ontology (see the right-hand side of Figure 7.9).
Following the feature is expert in, ?O3 has to match Artificial Intelligence:
This matching is successful because both ?O3 and Artificial Intelligence are areas of
expertise, and both are the values of the feature is interested in of Bob Sharp. Indeed,
Artificial Intelligence is an instance of Computer Science, which is a subconcept of area
of expertise.
Figure 7.6. Examples of reduction and synthesis rules for evidence-based hypotheses analysis.
As the result of this matching, the rule’s ?O3 variable is instantiated to Artificial
Intelligence:
?O3 ⟵ Artificial Intelligence
Also, ?Sl1 will take the value certain, which is one of the values of probability, as
constrained by the rule’s condition.
The matching is very efficient because the structure used to represent knowledge (i.e.,
the ontology) is also a guide for the matching process, as was illustrated previously and
discussed in Section 5.10.
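The following sketch is simplified and illustrative (it assumes the concept membership of each instance is already expanded and it does no backtracking), but it shows how the ontology guides the matching: starting from the instances bound by the IF hypothesis, the matcher follows the features named in the condition and checks the concepts of the values it reaches.

# Sketch of ontology-guided matching of a rule condition (illustrative).
facts = {                              # instance -> {feature: value}
    "Bob Sharp": {"is interested in": "Artificial Intelligence"},
    "John Doe": {"is expert in": "Artificial Intelligence"},
}
concepts_of = {                        # instance -> all concepts it belongs to (pre-expanded)
    "Bob Sharp": {"PhD student", "person"},
    "John Doe": {"PhD advisor", "person"},
    "Artificial Intelligence": {"Computer Science", "area of expertise"},
}
condition = [                          # (variable, concept, {feature: variable})
    ("?O1", "PhD student", {"is interested in": "?O3"}),
    ("?O2", "PhD advisor", {"is expert in": "?O3"}),
    ("?O3", "area of expertise", {}),  # the output variable ?Sl1 is omitted here
]

def match(condition, bindings):
    """Extend the partial bindings (e.g., from the IF hypothesis) over the condition."""
    for var, concept, feats in condition:
        inst = bindings.get(var)
        if inst is None or concept not in concepts_of.get(inst, set()):
            return None
        for feature, other in feats.items():
            value = facts.get(inst, {}).get(feature)
            if value is None or bindings.setdefault(other, value) != value:
                return None
    return bindings

print(match(condition, {"?O1": "Bob Sharp", "?O2": "John Doe"}))
# -> {'?O1': 'Bob Sharp', '?O2': 'John Doe', '?O3': 'Artificial Intelligence'}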
Thus the rule’s condition is satisfied for the following instantiations of the variables: ?O1 ⟵ Bob Sharp, ?O2 ⟵ John Doe, ?O3 ⟵ Artificial Intelligence, and ?Sl1 ⟵ certain.
Therefore, the rule can be applied to reduce the IF hypothesis to an assessment. This
entire process is summarized in Figure 7.10, as follows:
(1) The hypothesis to assess is matched with the IF hypothesis of the rule, leading to
the instantiations of ?O1 and ?O2.
(2) The corresponding instantiation of the rule’s condition is matched with the
ontology, leading to instances for all the variables of the rule.
(3) The question/answer pair and the THEN part of the rule are instantiated, generating the following reasoning step, shown also in the left-hand side of Figure 7.10:
Hypothesis to assess: Bob Sharp is interested in an area of expertise of John Doe.
Assessment: It is certain that Bob Sharp is interested in an area of expertise of John Doe.
In the preceding example, the agent has found one instance for each rule variable, which
has led to a solution. What happens if it cannot find instances for all the rule’s variables?
In such a case, the rule is not applicable.
But what happens if it finds more than one set of instances? In that case, the agent will
generate an assessment for each distinct set of instances.
What happens if there is more than one applicable reduction rule? In such a case, the
agent will apply each of them to find all the possible reductions.
All the obtained assessments will be combined into the final assessment of the hypoth-
esis, as was discussed in the previous section.
Figure 7.11 shows the successive applications of two reduction rules to assess an initial
hypothesis. Rule 1 reduces it to three subhypotheses. Then Rule 2 finds the assessment of
the first subhypothesis.
Figure 7.11 (schematic). Rule 1 reduces the initial hypothesis, guided by the question “Which are the necessary conditions?” and the answer “Bob Sharp should be interested in an area of expertise of John Doe, who should stay on the faculty of George Mason University for the duration of the dissertation of Bob Sharp and should have the qualities of a good PhD advisor,” to three subhypotheses: (1) Bob Sharp is interested in an area of expertise of John Doe; (2) John Doe will stay on the faculty of George Mason University for the duration of the dissertation of Bob Sharp; (3) John Doe would be a good PhD advisor with respect to the PhD advisor quality criterion. Rule 2 then produces the assessment of the first subhypothesis.
A version space is delimited by two bounds: the upper bound contains the most general concepts from the version space, and the lower bound contains the least general concepts from the version space. Any concept that is more general than (or as general as) a concept from the lower bound and less general than (or as general as) a concept from the upper bound is part of the version space and may be the actual concept to be learned. Therefore, a version space may be regarded as a partially learned concept.
The version spaces built by Disciple-EBR during the learning process are called plaus-
ible version spaces because their upper and lower bounds are generalizations based on an
incomplete ontology. Therefore, a plausible version space is only a plausible approxima-
tion of the concept to be learned, as illustrated in Figure 7.12.
The plausible upper bound of the version space from the right-hand side of Figure 7.12
contains two concepts: “a faculty member interested in an area of expertise” (see expres-
sion [7.6]) and “a student interested in an area of expertise” (see expression [7.7]).
The plausible lower bound of this version space also contains two concepts, “an
associate professor interested in Computer Science,” and “a graduate student interested
in Computer Science.”
The concept to be learned (see the left side of Figure 7.12) is, as an approximation, less
general than one of the concepts from the plausible upper bound, and more general than
one of the concepts from the plausible lower bound.
The notion of plausible version space is fundamental to the knowledge representation,
problem-solving, and learning methods of Disciple-EBR because all the partially learned
concepts are represented using this construct, as discussed in the following.
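As an illustration, a plausible version space can be sketched as a pair of bounds, with coverage by the bounds used to qualify a classification; in the sketch below the bound concepts are simplified to their head concepts and the subconcept sets are assumptions made for the example.

# Sketch of a plausible version space as a pair of bounds (illustrative).
subconcepts = {     # concept -> the concept itself and its subconcepts (assumed)
    "faculty member": {"faculty member", "associate professor"},
    "student": {"student", "graduate student", "PhD student"},
    "associate professor": {"associate professor"},
    "graduate student": {"graduate student", "PhD student"},
}

def covers(concept, instance_concept):
    return instance_concept in subconcepts.get(concept, {concept})

plausible_version_space = {
    "upper": ["faculty member", "student"],                  # plausible upper bound
    "lower": ["associate professor", "graduate student"],    # plausible lower bound
}

def coverage(vs, instance_concept):
    if any(covers(c, instance_concept) for c in vs["lower"]):
        return "covered by the plausible lower bound"
    if any(covers(c, instance_concept) for c in vs["upper"]):
        return "covered only by the plausible upper bound"
    return "not covered"

print(coverage(plausible_version_space, "PhD student"))     # lower bound (via graduate student)
print(coverage(plausible_version_space, "faculty member"))  # only the upper bound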
they could also be complex concepts of the form shown in Section 7.2. Moreover, in the case of
partially learned features, they are plausible version spaces, as illustrated in Figure 5.7 (p. 159).
As will be discussed in Section 9.10, the agent learns general hypotheses with applic-
ability conditions from specific hypotheses. An example of a general hypothesis learned
from [7.8] is shown in [7.9].
Name [7.9]
?O1 is a potential PhD advisor for ?O2.
Condition
?O1 instance of faculty member
?O2 instance of person
The condition is a concept that, in general, may have the form [7.4] (p. 203). The purpose
of the condition is to ensure that the hypothesis makes sense for each hypothesis instanti-
ation that satisfies it. For example, the hypothesis from [7.8] satisfies the condition in [7.9]
because John Doe is a faculty member and Bob Sharp is a person. However, the hypothesis
instance in [7.10] does not satisfy the condition in [7.9] because 45 is not a faculty member.
The condition in [7.9] will prevent the agent from generating the instance in [7.10].
A partially learned hypothesis will have a plausible version space condition, as illus-
trated in [7.11].
Name [7.11]
?O1 is a potential PhD advisor for ?O2.
Plausible Upper Bound Condition
?O1 instance of person
?O2 instance of person
Plausible Lower Bound Condition
?O1 instance of {associate professor, PhD advisor}
?O2 instance of PhD student
advisor or an associate professor who is an expert in artificial intelligence, while the upper
bound allows ?O2 to be any person who is an expert in any area of expertise ?O3 in which
?O1 is interested.
A learning agent should be able to reason with partially learned knowledge. Figure 7.14
shows an abstract rule with a partially learned applicability condition. It includes a Main
plausible version space condition (in light and dark green) and an Except-When plausible
version space condition (in light and dark red). The reductions generated by a partially
learned rule will have different degrees of plausibility, as indicated in Figure 7.14.
For example, a reduction r1 corresponding to a situation where the plausible lower bound
condition is satisfied and none of the Except-When conditions is satisfied is most likely to
be correct. Similarly, r2 (which is covered by the plausible upper bound, and is not
covered by any bound of the Except-When condition) is plausible but less likely than r1.
The way a partially learned rule is used depends on the current goal of the agent. If the
current goal is to support its user in problem solving, then the agent will generate the
solutions that are more likely to be correct. For example, a reduction covered by the
plausible lower bound of the Main condition and not covered by any of the Except-When
conditions (such as the reduction r1 in Figure 7.14) will be preferable to a reduction
covered by the plausible upper bound of the Main condition and not covered by any of the
Except-When conditions (such as the reduction r2), because it is more likely to be correct.
However, if the current goal of the agent is to improve its reasoning rules, then it is
more useful to generate the reduction r2 than the reduction r1. Indeed, no matter how the
user characterizes r2 (either as correct or as incorrect), the agent will be able to use it to
refine the rule, either by generalizing the plausible lower bound of the Main condition to
cover r2 (if r2 is a correct reduction), or by specializing the plausible upper bound of the
Main condition to uncover r2 (if r2 is an incorrect reduction), or by learning an additional
Except-When condition based on r2 (again, if r2 is an incorrect reduction).
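The following sketch illustrates, under assumed flags describing how a reduction relates to the rule's bounds, how the agent's current goal could drive this choice; it is a hedged illustration, not the actual control strategy of Disciple-EBR.

# Sketch of choosing among reductions of a partially learned rule (illustrative).
def plausibility(reduction):
    """Rank a reduction by how its instantiation relates to the rule's bounds."""
    if reduction["covered_by_except_when"]:
        return 0                                    # likely incorrect
    if reduction["covered_by_main_lower_bound"]:
        return 3                                    # like r1: most likely correct
    if reduction["covered_by_main_upper_bound"]:
        return 2                                    # like r2: plausible, but less likely
    return 1

def choose(reductions, goal):
    if goal == "problem solving":
        return max(reductions, key=plausibility)    # prefer the most plausible (r1)
    # goal == "rule refinement": prefer a reduction covered only by the plausible
    # upper bound of the Main condition (like r2), since either answer refines the rule
    informative = [r for r in reductions
                   if r["covered_by_main_upper_bound"]
                   and not r["covered_by_main_lower_bound"]
                   and not r["covered_by_except_when"]]
    return informative[0] if informative else max(reductions, key=plausibility)

r1 = {"covered_by_main_lower_bound": True, "covered_by_main_upper_bound": True,
      "covered_by_except_when": False}
r2 = {"covered_by_main_lower_bound": False, "covered_by_main_upper_bound": True,
      "covered_by_except_when": False}
print(choose([r1, r2], "problem solving") is r1)    # True
print(choose([r1, r2], "rule refinement") is r2)    # True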
7.1. What does the following concept represent? What would be an instance of it?
7.2. Illustrate the problem-solving process with the hypothesis, the rule, and the ontol-
ogy from Figure 7.15.
7.3. Consider the reduction rule and the ontology fragment from Figure 7.16. Indicate
whether this agent can assess the hypothesis, “Bill Bones is expert in an area of
interest of Dan Moore,” and if the answer is yes, indicate the result.
7.4. Consider the following problem:
7.6. Consider the partially learned concept and the nine instances from Figure 7.20.
Order the instances by the plausibility of being instances of this concept and justify
the ordering.
Figure 7.17. Reduction rule from the center of gravity analysis domain:
Question: Who or what is a strategically critical element with respect to the ?O1?
Answer: ?O2, because it is an essential generator of war material for ?O3 from the strategic perspective.
Condition:
?O1 is industrial economy
?O2 is industrial capacity, generates essential war material from the strategic perspective of ?O3
?O3 is multistate force, has as member ?O4
?O4 is force, has as economy ?O1, has as industrial factor ?O2
7.7. Consider the ontology fragment from the loudspeaker manufacturing domain
shown in Figure 5.24 (p. 172). Notice that each most specific concept, such as dust
or air press, has an instance, such as dust1 or air press1.
Consider also the following rule:
Figure 7.18. Feature definitions and ontology fragment from the center of gravity analysis domain. Dotted links indicate “instance of” relationships while unnamed continuous links indicate “subconcept of” relationships. The ontology fragment includes object, multistate alliance, multistate coalition, dominant partner multistate alliance, equal partners multistate alliance, industrial capacity, US 1943, and UK 1943. The feature definitions are: has as member (domain multimember force, range force), has as industrial factor (domain force, range industrial capacity), has as economy (domain force, range economy), and generates essential war material from the strategic perspective of (domain capacity, range force).
Figure 7.19. Ontology, hypothesis, and potential rule for assessing it. The hypothesis is “A PhD advisor will stay on the faculty of George Mason University for the duration of the dissertation of Joe Dill.” The potential rule has the IF hypothesis “A PhD advisor will stay on the faculty of ?O2 for the duration of the dissertation of ?O3,” the question “Will a PhD advisor stay on the faculty of ?O2 for the duration of the dissertation of ?O3?”, and a condition including ?O1 is PhD advisor, has as position ?O4, has as employer ?O2. The ontology fragment includes object, actor, position, person, organization, employee, student, educational organization, university, long-term position, faculty member, and PhD student.
Figure 7.20. A partially learned concept and nine instances, I1 through I9.
8 Learning for Knowledge-based Agents
The previous chapters introduced the main knowledge elements from the knowledge base
of an agent, which are all based on the notion of concept. This chapter presents the basic
operations involved in learning, including comparing the generality of concepts, general-
izing concepts, and specializing concepts. We start with a brief overview of several
machine learning strategies that are particularly useful for knowledge-based agents.
“Learning denotes changes in the system that are adaptive in the sense that they
enable the system to do the same task or tasks drawn from the same population more
efficiently and more effectively the next time” (Simon, 1983, p. 28).
“‘Learning’ is making useful changes in the workings of our minds” (Minsky, 1986,
p. 120).
“Learning is constructing or modifying representations of what is being experienced”
(Michalski, 1986, p. 10).
“A computer program is said to learn from experience E with respect to some class
of tasks T and performance measure P, if its performance at tasks in T, as measured by
P, improves with experience E.” (Mitchell, 1997, p. 2).
Given the preceding definitions, we may characterize learning as denoting the way in
which people and computers:
There are two complementary dimensions of learning: competence and efficiency. A system
is improving its competence if it learns to solve a broader class of problems and to make
fewer mistakes in problem solving. The system is improving its efficiency if it learns to solve
the problems from its area of competence faster or by using fewer resources.
Machine learning is the domain of artificial intelligence that is concerned with building
adaptive computer systems that are able to improve their performance (competence and/or
efficiency) through learning from input data, from a user, or from their own problem-solving
experience.
Research in machine learning has led to the development of many basic learning strategies, such as the following:
Rote learning
Version space learning
Decision tree induction
Clustering
Rule induction (e.g., Learning rule sets, Inductive logic programming)
Instance-based strategies (e.g., K-nearest neighbors, Locally weighted regression, Col-
laborative filtering, Case-based reasoning and learning, Learning by analogy)
Bayesian learning (e.g., Naïve Bayes learning, Bayesian network learning)
Neural networks and Deep learning
Model ensembles (e.g., Bagging, Boosting, ECOC, Stacking)
Support vector machines
Explanation-based learning
Abductive learning
Reinforcement learning
Genetic algorithms and evolutionary computation
Apprenticeship learning
Multistrategy learning
In the next sections, we will briefly introduce four learning strategies that are particularly
useful for agent teaching and learning.
An example of a cup: cup(o1): color(o1, white), made-of(o1, plastic), light-mat(plastic), has-handle(o1), has-flat-bottom(o1), up-concave(o1), ...
Figure 8.2 (schematic). The proof that o1 is a cup is generalized into a proof that any object x satisfying made-of(x, y), light-mat(y), has-handle(x), ... is a cup, yielding the learned rule: ∀x ∀y made-of(x, y) ∧ light-mat(y) ∧ has-handle(x) ∧ ... ⟹ cup(x)
the agent needs. For instance, the agent does not require any prior knowledge to perform
this type of learning.
The result of this learning strategy is the increase of the problem-solving competence of
the agent. Indeed, the agent will learn to perform tasks it was not able to perform before,
such as recognizing the cups from a set of objects.
Notice that the agent used the fact that o1 has a handle in order to prove that o1 is a cup.
This means that having a handle is an important feature. On the other hand, the agent did
not use the color of o1 to prove that o1 is a cup. This means that color is not important.
Notice how the agent reaches the same conclusions as in inductive learning from
examples, but through a different line of reasoning, and based on a different type of
information (i.e., prior knowledge instead of multiple examples).
The next step in the learning process is to generalize the proof tree from the left-hand
side of Figure 8.2 into the general tree from the right-hand side. This is done by using the
agent’s prior knowledge of how to generalize the individual inferences from the specific tree.
While the tree from the left-hand side proves that the specific object o1 is a cup, the tree
from the right-hand side proves that any object x that satisfies the leaves of the general tree
is a cup. Thus the agent has learned the general cup recognition rule from the bottom of
Figure 8.2.
To recognize that another object, o2, is a cup, the agent needs only to check that it
satisfies the rule, that is, to check for the presence of these features discovered as
important (i.e., light-mat, has-handle, etc.). The agent no longer needs to build a complex
proof tree. Therefore, cup recognition is done much faster.
Finally, notice that the agent needs only one example from which to learn. However, it
needs a lot of prior knowledge to prove that this example is a cup. Providing such prior
knowledge to the agent is a very complex task.
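For illustration, once the rule has been learned, cup recognition amounts to checking its leaf conditions rather than rebuilding a proof tree; the following sketch uses hypothetical predicates and background knowledge modeled on the example above.

# Sketch: recognition after explanation-based learning (hypothetical predicates).
LIGHT_MATERIALS = {"plastic", "paper"}     # assumed background knowledge: light-mat(y)

def is_cup(obj):
    """Learned rule: made-of(x, y) & light-mat(y) & has-handle(x) & ... => cup(x)."""
    return obj.get("made-of") in LIGHT_MATERIALS and obj.get("has-handle", False)

o2 = {"made-of": "plastic", "has-handle": True, "color": "blue"}   # color is irrelevant
print(is_cup(o2))    # True, without building a complex proof tree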
The students may then infer that other features of the Solar System are also features of
the hydrogen atom. For instance, in the Solar System, the greater mass of the sun and its
attraction of the planets cause the planets to revolve around it. Therefore, the students may
hypothesize that this causal relationship is also true in the case of the hydrogen atom: The
greater mass of the nucleus and its attraction of the electrons cause the electrons to revolve
around the nucleus. This is indeed true and represents a very interesting discovery.
The main problem with analogical reasoning is that not all the features of the Solar
System are true for the hydrogen atom. For instance, the sun is yellow, but the nucleus is
not. Therefore, the information derived by analogy has to be verified.
8.2 CONCEPTS
In the next sections, we will describe in detail the basic learning operations dealing with concepts: the generalization of concepts, the specialization of concepts, and the comparison of the generality of concepts.
A concept was defined as representing a set of instances. In order to show that a concept
P is more general than a concept Q, this definition would require the computation and
comparison of the (possibly infinite) sets of the instances of P and Q. In this section, we
will introduce generalization and specialization rules that will allow one to prove that a
concept P is more general than another concept Q by manipulating the descriptions of
P and Q, without computing the sets of instances that they represent.
A generalization rule is a rule that transforms (the description of) a concept into (the
description of) a more general concept. The generalization rules are usually inductive
transformations. The inductive transformations are not truth preserving but falsity pre-
serving. That is, if P is true and is inductively generalized to Q, then the truth of Q is not
guaranteed. However, if P is false, then Q is also false.
A specialization rule is a rule that transforms a concept into a less general concept. The
reverse of any generalization rule is a specialization rule. Specialization rules are deduct-
ive, truth-preserving transformations.
A reformulation rule transforms a concept into another, logically equivalent concept.
Reformulation rules are also deductive, truth-preserving transformations.
If one can transform concept P into concept Q by applying a sequence of generalization
rules, then Q is more general than P.
Consider the phrase, “Students who have majored in computer science at George
Mason University between 2007 and 2008.” The following are some of the phrases that
are obvious generalizations of this phrase:
“Students who have majored in computer science between 2007 and 2008”
“Students who have majored in computer science between 2000 and 2012”
“Students who have majored in computer science at George Mason University”
“Students who have majored in computer science”
Some of the phrases that are specializations of the preceding phrase follow:
“Graduate students who have majored in computer science at George Mason Univer-
sity between 2007 and 2008”
“Students who have majored in computer science at George Mason University in 2007”
“Undergraduate students who have majored in both computer science and mathemat-
ics at George Mason University in 2008”
Some of the generalization rules are the following:
Dropping conditions
Extending intervals
Extending ordered sets of intervals
Extending discrete sets
Using feature definitions
Using inference rules
By replacing 55 with the variable ?N1, which can take any value, we generalize this
concept to the one shown in [8.2]: “The set of professors with any number of publications.”
In particular, ?N1 could be 55. Therefore the second concept includes the first one.
Conversely, by replacing ?N1 with 55, we specialize the concept [8.2] to the concept
[8.1]. The important thing to notice here is that by a simple syntactic operation (turning a
number into a variable), we can generalize a concept. This is one way in which an agent
generalizes concepts.
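The following sketch illustrates this generalization rule with an assumed representation of concepts as feature constraints; the feature name number of publications and the ANY marker are assumptions made for the example.

# Sketch of the "turning a constant into a variable" generalization rule.
ANY = object()    # plays the role of a variable such as ?N1: it matches any value

concept_8_1 = {"instance of": "professor", "number of publications": 55}
concept_8_2 = {"instance of": "professor", "number of publications": ANY}   # generalization

def covers(concept, instance):
    return all(v is ANY or instance.get(f) == v for f, v in concept.items())

someone = {"instance of": "professor", "number of publications": 72}
print(covers(concept_8_1, someone), covers(concept_8_2, someone))   # False True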
E1 may be interpreted as representing the concept: “the papers ?O1 and ?O2 authored
by the professor ?O3.” E2 may be interpreted as representing the concept: “the papers ?O1
and ?O2 authored by the professors ?O31 and ?O32, respectively.” In particular, ?O31 and
?O32 may represent the same professor. Therefore, the second set includes the first one,
and the second expression is more general than the first one.
Figure 8.7. Ordered set of intervals as an ordered generalization hierarchy: human age is partitioned into the intervals (0.0, 1.0), [1.0, 4.5), [4.5, 12.5), [12.5, 19.5), [19.5, 65.5), and [65.5, 150.0].
Up to this point we have only defined when a concept is more general than another
concept. Learning agents, however, would need to generalize sets of examples and
concepts. In the following we define some of these generalizations.
To show that [8.23] is more general than [8.22] it is enough to show that [8.22] can be
transformed into [8.23] by applying a sequence of generalization rules. The sequence is the
following one:
Generalization hierarchy fragment (Figure 8.8): person, with subconcepts employee and student, under which are (among others) graduate research assistant and graduate teaching assistant.
S1: ?O1 instance of graduate research assistant, is interested in ?O2; ?O2 instance of area of expertise.
S2: ?O1 instance of graduate teaching assistant, is interested in ?O2; ?O2 instance of area of expertise.
C1: ?O1 instance of graduate research assistant, is interested in ?O2; ?O2 instance of area of expertise, requires programming.
C2: ?O1 instance of graduate teaching assistant, is interested in ?O2; ?O2 instance of area of expertise, requires fieldwork.
Notice, however, that there may be more than one minimal generalization of two
expressions. For instance, according to the generalization hierarchy from the middle of
Figure 8.8, there are two minimal generalizations of graduate research assistant and
graduate teaching assistant. They are university employee and graduate student. Conse-
quently, there are two minimal generalizations of S1 and S2 in Figure 8.11: mG1 and
mG2. The generalization mG1 was obtained by generalizing graduate research assistant and
graduate teaching assistant to university employee. mG2 was obtained in a similar fashion,
except that graduate research assistant and graduate teaching assistant were generalized to
graduate student. Neither mG1 nor mG2 is more general than the other. However, G3 is
more general than each of them.
Disciple agents employ minimal generalizations, also called maximally specific generalizations (Plotkin, 1970; Kodratoff and Ganascia, 1986). They also employ maximal generalizations, also called maximally general generalizations (Tecuci and Kodratoff, 1990; Tecuci, 1992; Tecuci, 1998).
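For illustration, minimal generalizations can be computed by climbing the generalization hierarchy; the sketch below uses a simplified excerpt of the hierarchy from Figure 8.8 (the intermediate concepts are assumptions) and finds the two minimal generalizations mentioned above.

# Sketch: minimal generalizations of two concepts via a generalization hierarchy.
parents = {      # concept -> its direct superconcepts (a tangled hierarchy, assumed)
    "graduate research assistant": ["university employee", "graduate student"],
    "graduate teaching assistant": ["university employee", "graduate student"],
    "university employee": ["employee"],
    "graduate student": ["student"],
    "employee": ["person"],
    "student": ["person"],
    "person": [],
}

def ancestors(c):
    """The concept itself and all of its superconcepts."""
    result = {c}
    for p in parents.get(c, []):
        result |= ancestors(p)
    return result

def minimal_generalizations(c1, c2):
    common = ancestors(c1) & ancestors(c2)
    # keep only the common superconcepts with no other common superconcept below them
    return {g for g in common
            if not any(g != other and g in ancestors(other) for other in common)}

print(minimal_generalizations("graduate research assistant", "graduate teaching assistant"))
# -> {'university employee', 'graduate student'}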
Figure 8.11 (schematic):
mG1: ?O1 instance of university employee, is interested in ?O2; ?O2 instance of area of expertise.
mG2: ?O1 instance of graduate student, is interested in ?O2; ?O2 instance of area of expertise.
S1: ?O1 instance of graduate research assistant, is interested in ?O2; ?O2 instance of area of expertise.
S2: ?O1 instance of graduate teaching assistant, is interested in ?O2; ?O2 instance of area of expertise.
The minimal specialization of two clauses consists of the minimal specialization of the
matched feature-value pairs, and of all the unmatched feature-value pairs. This procedure
assumes that no new clause feature can be made explicit by applying theorems. Otherwise,
one has first to make all the features explicit.
The minimal specialization of two conjunctions of clauses C1 and C2 consists of the
conjunction of the minimal specializations of each of the matched clauses of C1 and C2,
and of all the unmatched clauses from C1 and C2.
Figure 8.12 shows several specializations of the concepts G1 and G2. mS1 and mS2 are
two minimal specializations of G1 and G2 because graduate research assistant and graduate
teaching assistant are two minimal specializations of university employee and graduate
student.
Notice that in all the preceding definitions and illustrations, we have assumed that the
clauses to be generalized correspond to the same variables. If this assumption is not
satisfied, then one would need first to match the variables and then compute the general-
izations. In general, this process is computationally expensive because one will need to try
different matchings.
Inductive concept learning from examples has already been introduced in Section 8.1.2. In
this section, we will discuss various aspects of this learning strategy that are relevant to
agent teaching and learning. The problem of inductive concept learning from examples
can be more precisely defined as indicated in Table 8.2.
The bias of the learning agent is any basis for choosing one generalization over another,
other than strict consistency with the observed training examples (Mitchell, 1997). In the
following, we will consider two agents that employ two different preference biases: a
cautious learner that always prefers minimal generalizations, and an aggressive learner
that always prefers maximal generalizations.
Let us consider the positive examples [8.28] and [8.29], and the negative example [8.30]
of a concept to be learned by these two agents in the context of the generalization
hierarchies from Figure 8.13.
Figure 8.12 (schematic):
mS1: ?O1 instance of graduate research assistant, is interested in ?O2; ?O2 instance of area of expertise.
mS2: ?O1 instance of graduate teaching assistant, is interested in ?O2; ?O2 instance of area of expertise.
C1: ?O1 instance of graduate research assistant, is interested in ?O2; ?O2 instance of area of expertise, requires programming.
C2: ?O1 instance of graduate teaching assistant, is interested in ?O2; ?O2 instance of area of expertise, requires fieldwork.
Given
A language of instances.
A language of generalizations.
A set of positive examples (E1, . . ., En) of a concept.
A set of negative (or counter) examples (C1, . . ., Cm) of the same concept.
A learning bias.
Other background knowledge.
Determine
A concept description that is a generalization of the positive
examples and that does not cover any of the negative examples.
Purpose of concept learning
Predict if an instance is a positive example of the learned concept.
Figure 8.13 (schematic). Generalization hierarchies: person, with subconcepts that include computer technician, full professor, associate professor, and assistant professor; university, with subconcepts state university and private university.
Positive example [8.28]: Mark White instance of assistant professor, is employed by George Mason University.
Negative example [8.30]: George Dean instance of computer technician, is employed by Stanford University.
What concept might be learned by the cautious learner from the positive examples
[8.28] and [8.29], and the negative example [8.30]? The cautious learner would learn a
minimal generalization of the positive examples, which does not cover the negative
example. Such a minimal generalization might be the expression [8.31], “an assistant
professor employed by a state university,” obtained by minimally generalizing George
Mason University and University of Virginia to state university.
The concept learned by the cautious learner is represented in Figure 8.14 as the
minimal ellipse that covers the positive examples without covering the negative example.
Assuming a complete ontology, the learned concept is included into the actual concept.
How will the cautious learner classify each of the instances represented in Figure 8.14
as black dots? It will classify the dot covered by the learned concept as positive example,
and the two dots that are not covered by the learned concept as negative examples.
How confident are you in the classification, when the learner predicts that an instance
is a positive example? When a cautious learner classifies an instance as a positive example
of a concept, this classification is correct because an instance covered by the learned
concept is also covered by the actual concept.
But how confident are you in the classification, when the learner predicts that an
instance is a negative example? The learner may make mistakes when classifying an
instance as a negative example, such as the black dot that is covered by the actual concept
but not by the learned concept. This type of error is called “error of omission” because
some positive examples are omitted – that is, they are classified as negative examples.
Let us now consider the concept that might be learned by the aggressive learner from
the positive examples [8.28] and [8.29], and the negative example [8.30]. The aggressive
learner will learn a maximal generalization of the positive examples that does not cover the
negative example. Such a maximal generalization might be the expression [8.32], “a
professor employed by a university.” This is obtained by generalizing assistant professor
to professor (the most general generalization that does not cover computer technician from
the negative example) and by maximally generalizing George Mason University and University of Virginia to university. Although university covers Stanford University, this is fine
because the obtained concept [8.32] still does not cover the negative example [8.30].
Figure 8.14. Learning and classifications by a cautious learner.
The concept learned by the aggressive learner is represented in Figure 8.15 as the
maximal ellipse that covers the positive examples without covering the negative example.
Assuming a complete ontology, the learned concept includes the actual concept.
How will the aggressive learner classify each of the instances represented in Figure 8.15
as black dots? It will classify the dot that is outside the learned concept as a negative
example, and the other dots as positive examples.
How confident are you in the classification when the learner predicts that an instance is
negative example? When the learner predicts that an instance is a negative example, this
classification is correct because that instance is not covered by the actual concept, which is
itself covered by the learned concept.
But, how confident are you in the classification when the learner predicts that an
instance is a positive example? The learner may make mistakes when predicting that an
instance is a positive example, as is the case with the dot covered by the learned concept,
but not by the actual concept. This type of error is called “error of commission” because
some negative examples are committed – that is, they are classified as positive examples.
Notice the interesting fact that the aggressive learner is correct when it classifies
instances as negative examples (they are indeed outside the actual concept because they
are outside the concept learned by the aggressive learner) while the cautious learner is
correct when it classifies instances as positive examples (they are inside the actual concept
because they are inside the concept learned by the cautious learner). How could one
synergistically integrate these two learning strategies to take advantage of their complementarity? An obvious solution is to use both strategies, learning both a minimal and a
maximal generalization from the examples, as illustrated in Figure 8.16.
What class will be predicted by a dual-strategy learner for the instances represented as
black dots in Figure 8.16? The dot covered by the concept learned by the cautious learner
Figure 8.16. Learning and classifications by a dual-strategy learner.
will be classified, with high confidence, as a positive example. The dot that is not covered
by the concept learned by the aggressive learner will be classified, again with high
confidence, as a negative example. The dual-strategy learner will indicate that it cannot
classify the other two dots.
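This complementary behavior can be illustrated with a short program. The following Python sketch is an illustration only, under simplifying assumptions: the learned concepts are represented extensionally, as sets of instance identifiers given directly, rather than as the intensional descriptions a real agent would use.

# Sketch of cautious, aggressive, and dual-strategy classification.
# Concepts are modeled extensionally as sets of instances for illustration only.

minimal_concept = {"i1", "i2"}                    # learned by the cautious learner
maximal_concept = {"i1", "i2", "i3", "i4", "i5"}  # learned by the aggressive learner

def cautious_classify(instance):
    # Reliable when it answers "positive"; may classify true positives as
    # negative (errors of omission).
    return "positive" if instance in minimal_concept else "negative"

def aggressive_classify(instance):
    # Reliable when it answers "negative"; may classify true negatives as
    # positive (errors of commission).
    return "positive" if instance in maximal_concept else "negative"

def dual_strategy_classify(instance):
    if instance in minimal_concept:
        return "positive"      # inside the cautious concept: high confidence
    if instance not in maximal_concept:
        return "negative"      # outside the aggressive concept: high confidence
    return "unknown"           # between the two learned concepts: abstain

for instance in ["i1", "i4", "i9"]:
    print(instance, cautious_classify(instance),
          aggressive_classify(instance), dual_strategy_classify(instance))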
Let us consider the ontology from Figure 8.17. What is the maximal generalization of the
positive examples John Doe and Jane Austin that does not cover the given negative example
Bob Sharp, in the case where graduate research assistant is included into the ontology? The
maximal generalization is faculty member.
But what is the maximal generalization in the case where graduate research assistant is
missing from the ontology? In this case, the maximal generalization is employee, which is,
in fact, an overgeneralization.
What is the minimal specialization of person that does not cover Bob Sharp in the case
where graduate research assistant is included into the ontology? It is faculty member.
But what is the minimal specialization in the case where graduate research assistant is
missing from the ontology? In this case, the minimal specialization is employee, which
is an underspecialization.
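The effect of a missing concept on the computed generalization can also be sketched in code. The Python fragment below is a simplified illustration only; the hand-written "subconcept of" links approximate the hierarchy of Figure 8.17 and are not the actual Disciple-EBR ontology or algorithm.

# Sketch: how removing a concept from the ontology changes the maximal generalization.

def covers(concept, instance_concepts, parents):
    # True if 'concept' equals, or is an ancestor of, a concept the instance belongs to.
    frontier, seen = list(instance_concepts), set()
    while frontier:
        c = frontier.pop()
        if c == concept:
            return True
        if c not in seen:
            seen.add(c)
            frontier.extend(parents.get(c, []))
    return False

def maximal_generalization(start, negative_concepts, parents):
    # Climb the "subconcept of" links from 'start', keeping the most general concept
    # that still excludes the negative example.
    best, frontier = start, [start]
    while frontier:
        for p in parents.get(frontier.pop(), []):
            if not covers(p, negative_concepts, parents):
                best = p
                frontier.append(p)
    return best

with_gra = {"faculty member": ["university employee"], "university employee": ["employee"],
            "employee": ["person"], "graduate student": ["student"], "student": ["person"],
            "graduate research assistant": ["university employee", "graduate student"]}
without_gra = {k: v for k, v in with_gra.items() if k != "graduate research assistant"}

# With the complete ontology the negative example is a graduate research assistant
# (hence also a university employee); without that concept it is only a graduate student.
print(maximal_generalization("faculty member", ["graduate research assistant"], with_gra))  # faculty member
print(maximal_generalization("faculty member", ["graduate student"], without_gra))          # employee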
[Figure 8.17. Plausible generalizations and specializations due to ontology incompleteness. The ontology fragment relates person to employee and student; employee to university employee; university employee to staff member and faculty member; faculty member to instructor, professor, and PhD advisor; student to graduate student (with graduate research assistant and graduate teaching assistant) and undergraduate student (with BS student); it includes the instances John Smith, John Doe (+), Jane Austin (+), Bob Sharp (-), and Joan Dean.]
Notice that the incompleteness of the ontology causes the learner both to overgeneralize and underspecialize. In view of the preceding observations, what can be said about the relationships between the concepts learned using minimal and maximal generalizations and the actual concept when the ontology and the representation language are incomplete? The minimal and maximal generalizations are only approximations of the actual
concept, as shown in Figure 8.18.
Why is the concept learned with an aggressive strategy more general than the one
learned with a cautious strategy? Because they are based on the same ontology and
generalization rules.
The representation language includes a set of variables. For convenience in identifying variables, their names start with “?,” as in, for instance, ?O1. Variables are used to denote unspecified instances of concepts.
It also includes a set of constants. Examples of constants are numbers (such as “5”), strings (such as “programming”), symbolic probability values (such as “very likely”), and instances (such as “John Doe”). We define a term to be either a variable or a constant.
The language further includes a set of features. This set includes the domain-independent features “instance of,” “subconcept of,” and “direct subconcept of,” as well as domain-specific features, such as “is interested in.”
The language is based on an object ontology consisting of a set of concepts and instances defined using the clause representation [7.2] presented in Section 7.2, where the feature values (vi1 . . . vim) are constants, concepts, instances, or intervals (numeric or symbolic). That is, there are no variables in the definition of a concept or an instance from the ontology, such as the following one:
The concepts and the instances from the ontology are related by the generalization relations “instance of” and “subconcept of.” The ontology includes the concept object, which represents all
the instances from the application domain and is therefore more general than any
other object concept.
The representation language also includes the set of theorems and properties of the features, variables, and constants.
Two properties of any feature are its domain and its range. Other features may have
special properties. For instance, the relation subconcept of is transitive (see Section
5.7). Also, a concept or an instance inherits the features of the concepts that are more
general than it (see Section 5.8).
Finally, the representation language includes a set of connectors: the logical connectors AND (∧), OR (∨), and NOT (Except-When); the connectors “{” and “}” for defining alternative values of a feature; the connectors “[” and “]” as well as “(” and “)” for defining a numeric or a symbolic interval; the delimiter “,” (a comma); and the symbols “Plausible Upper Bound” and “Plausible Lower Bound.”
In the preceding expression, each of concept-i . . . concept-n is either an object concept from
the object ontology (such as PhD student), a numeric interval (such as [50, 60]), a set of
numbers (such as {1, 3, 5}), a set of strings (such as {white, red, blue}), a symbolic probability
interval (such as [likely - very likely]), or an ordered set of intervals (such as [youth - mature]).
?Oj, . . . , ?Ol,. . ., ?Op, . . . , ?Ot are distinct variables from the sequence (?O1, ?O2, . . . , ?On).
When concept-n is a set or interval such as “[50, 60],” we use “is in” instead of “instance of.”
A more complex concept is defined as a conjunctive expression “BRU ∧ not BRU1 ∧ . . . ∧
not BRUp,” where “BRU” and each “BRUk (k = 1, . . . , p)” is a conjunction of clauses. This is
illustrated by the following example, which represents the set of instances of the tuple
(?O1, ?O2, ?O3), where ?O1 is a professor employed by a university ?O2 in a long-term
position ?O3, such that it is not true that ?O1 plans to retire from ?O2 or to move to some
other organization:
Not
?O1 instance of professor
plans to retire from ?O2
?O2 instance of university
Not
?O1 instance of professor
plans to move to ?O4
?O4 instance of organization
C1 = v1 instance of b1
f11 v11
...
f1m v1m
C2 = v2 instance of b2
f21 v21
...
f2n v2n
We say that the clause C1 is more general than the clause C2 if there exists a substitution
σ such that:
σv1 = v2
b1 = b2
∀ i ∈ {1, . . ., m}, ∃ j ∈ {1, . . ., n} such that f1i = f2j and σv1i = v2j.
C1 = ?X instance of student
enrolled at George Mason University
C2 = ?Y instance of student
enrolled at George Mason University
has as sex female
Indeed, let σ = (?X ← ?Y). As one can see, σC1 is a part of C2, that is, each feature of
σC1 is also a feature of C2. The first concept represents the set of all students enrolled at
George Mason University, while the second one represents the set of all female students
enrolled at George Mason University. Obviously the first set includes the second one, and
therefore the first concept is more general than the second one.
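This substitution-based test can be written directly in code. The following Python sketch is an illustration only, for single-variable clauses whose features are stored as an attribute–value dictionary; the data layout is an assumption, not the Disciple-EBR representation.

# Sketch of the "more general than" test between two clauses, based on a substitution.
# A clause is (variable, base_concept, {feature: value, ...}).

def more_general_than(c1, c2):
    v1, b1, feats1 = c1
    v2, b2, feats2 = c2
    if b1 != b2:                      # the clauses must be built on the same concept
        return False
    sigma = {v1: v2}                  # the substitution sigma, with sigma(v1) = v2
    subst = lambda term: sigma.get(term, term)
    # Every feature of sigma(C1) must also appear in C2 with the same value.
    return all(f in feats2 and subst(val) == feats2[f] for f, val in feats1.items())

C1 = ("?X", "student", {"enrolled at": "George Mason University"})
C2 = ("?Y", "student", {"enrolled at": "George Mason University", "has as sex": "female"})

print(more_general_than(C1, C2))   # True: sigma(C1) is a part of C2
print(more_general_than(C2, C1))   # False: "has as sex" has no correspondent in C1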
Let us notice, however, that this definition of generalization does not take into account
the theorems and properties of the representation language. In general, one needs to
use these theorems and properties to transform the clauses C1 and C2 into equivalent
clauses C’1 and C’2, respectively, by making explicit all the properties of these clauses.
Then one shows that C’1 is more general than C’2. Therefore, the definition of the more
general than relation in the representation language is the following one:
A clause C1 is more general than another clause C2 if and only if there exist C’1, C’2,
and a substitution σ, such that:
C′1 =L C1
C′2 =L C2
σv1 =L v2
In the following sections, we will always assume that the equality is in the representation language and we will no
longer indicate this.
A = A1 ∧ A2 ∧ . . . ∧ An
B = B1 ∧ B2 ∧ . . . ∧ Bm
A is more general than B if and only if there exist A′, B′, and σ such that:
A′ = A, A′ = A′1 ∧ A′2 ∧ . . . ∧ A′p
B′ = B, B′ = B′1 ∧ B′2 ∧ . . . ∧ B′q
Otherwise stated, one transforms the concepts A and B, using the theorems and the
properties of the representation language, so as to make each clause from A’ more general
than a corresponding clause from B’. Notice that some clauses from B’ may be “left over,”
that is, they are not matched by any clause of A’.
A is more general than B if and only if there exist A′, B′, and σ, such that:
∀ i ∈ {1, . . ., p}, ∃ j ∈ {1, . . ., q} such that BRU′bj is more general than σBRU′ai.
Figure 8.19. Determining concepts that satisfy given “subconcept of” relationships.
the bodies is not relevant because they can move inside the cell. For example,
((1 green) (2 yellow)) is the same with ((2 yellow) (1 green)) and represents a cell
where one body has one nucleus and is green, while the other body has two nuclei
and is yellow. You should also assume that any generalization of a cell is also
described as a pair of pairs ((s t) (u v)).
(a) Indicate all the possible generalizations of the cell from Figure 8.20 and the
generalization relations between them.
(b) Determine the number of the distinct sets of instances and the number of the
concept descriptions from this problem.
(c) Consider the cell descriptions from Figure 8.21 and determine the following
minimal generalizations: g(E1, E2), g(E2, E3), g(E3, E1), g(E1, E2, E3).
8.14. Consider the ontology fragment from the loudspeaker manufacturing domain,
shown in Figure 5.23 (p. 171), and the following expressions:
8.15. Consider the ontology fragment from the loudspeaker manufacturing domain,
shown in Figure 5.24 (p. 172). Notice that each most specific concept, such as dust
or air press, has an instance, such as dust1 or air press1.
Consider also the following two expressions:
E1: ((1 green) (1 green)) E2: ((1 yellow) (2 green)) E3: ((1 green) (2 green))
Figure 8.21. The descriptions of three cells.
Use the generalization rules to show that E1 is more general than E2.
8.16. Determine a generalization of the following two expressions in the context of the
ontology fragment from Figure 5.24 (p. 172):
E: ?O instance of object
color yellow
shape circle
radius 5
Indicate five different generalization rules. For each such rule, determine an
expression Eg that is more general than E according to that rule.
[Figure: ontology fragments with the shape concepts polygon and round and the color concepts warm color and cold color.]
8.19. Consider the following two concepts G1 and G2, and the ontology fragments in
Figure 8.23. Indicate four specializations of G1 and G2 (including a minimal
specialization).
8.20. Illustrate the clause generalization defined in Section 8.7.3 with an example from
the PhD Advisor Assessment domain.
8.21. Illustrate the BRU generalization defined in Section 8.7.4 with an example from the
PhD Advisor Assessment domain.
8.22. Illustrate the generalization of concepts with negations defined in Section 8.7.5 by
using an example from the PhD Advisor Assessment domain.
8.23. Use the definition of generalization based on substitution to prove that each of the
generalization rules discussed in Section 8.3 transforms a concept into a more
general concept.
9 Rule Learning
In this and the next chapter on rule refinement, we will refer to both problems and
hypotheses, interchangeably, to emphasize the fact that the learning methods presented
are equally applicable in the context of hypotheses analysis and problem solving.
Figure 9.1 summarizes the interactions between the subject matter expert and the learning
agent that involve modeling, learning, and problem solving.
The expert formulates the problem to be solved (or the hypothesis to be analyzed), and
the agent uses its knowledge to generate a (problem-solving or argumentation) tree to be
verified by the expert.
Several cases are possible. If the problem is not completely solved, the expert will extend
the tree with additional reductions and provide solutions for the leaf problems/hypotheses.
[Figure 9.1. Expert–agent interactions: mixed-initiative problem solving, based on the ontology, the rules, and the learned rules, generates a reasoning tree of problems, questions/answers, and solutions, which the expert either extends with new reductions or accepts step by step.]
From each new reduction provided by the expert, the agent will learn a new rule, as will be
presented in the following sections.
If the expert rejects any of the reasoning steps generated by the agent, then an explanation of why that reduction is wrong needs to be determined, and the rule that generated it
will be refined to no longer generate the wrong reasoning step.
If the expert accepts a reasoning step as correct, then the rule that generated it may be
generalized. The following section illustrates these interactions.
As will be discussed in the following, the subject matter expert helps the agent to learn
by providing examples and explanations, and the agent helps the expert to teach it by
presenting attempted solutions.
First, as illustrated in Figure 9.2, the expert formulates the problem to solve or the
hypothesis to analyze which, in this illustration, is the following hypothesis:
In this case, we will assume that the agent does not know how to assess this hypothesis.
Therefore, the expert has to teach the agent how to assess it. The expert will start by
developing a reduction tree, as discussed in Chapter 4 and illustrated in the middle of
Figure 9.2. The initial hypothesis is first reduced to three simpler hypotheses, guided
by a question/answer pair. Then each of the subhypotheses is further reduced, either
to a solution/assessment or to an elementary hypothesis to be assessed based on
evidence. For example, the bottom part of Figure 9.2 shows the reduction of the first
subhypothesis to an assessment.
After the reasoning tree has been developed, the subject matter expert interacts with
the agent, helping it “understand” why each reduction step is correct, as will be discussed
in Section 9.5. As a result, from each reduction step the agent learns a plausible version
space rule, as a justified generalization of it. This is illustrated in the right-hand side of
Figure 9.2 and discussed in Section 9.7. These rules are not shown to the expert, but they
may be viewed with the Rule Browser.
The agent can now use the learned rules to assess by itself similar hypotheses formulated by the expert, as illustrated in Figure 9.3, where the expert formulated the following
hypothesis:
The reduction tree shown in Figure 9.3 was generated by the agent. Notice how the agent
concluded that Bob Sharp is interested in an area of expertise of Dan Smith, which is
Information Security, by applying the rule learned from John Doe and Bob Sharp, who share
a common interest in Artificial Intelligence.
The expert has to inspect each reduction generated by the agent and indicate whether it
is correct or not. Because the reductions from Figure 9.3 are correct, the agent generalizes
the lower bound conditions of the applied rules, if the reductions were generated based on
the upper bound conditions of these rules.
The bottom part of Figure 9.4 shows a reduction generated by the agent that is rejected
by the expert. While Dan Smith does indeed have a tenured position, which is a long-term faculty
position, he plans to retire. It is therefore wrong to conclude that it is almost certain that he
will stay on the faculty of George Mason University for the duration of the dissertation of
Bob Sharp.
Such failure explanations are either proposed by the agent and accepted by the expert,
or are provided by the expert, as discussed in Section 9.5.2.
Based on this failure explanation, the agent specializes the rule that generated this
reduction by adding an Except-When plausible version space condition, as illustrated in
the right-hand side of Figure 9.4. From now on, the agent will check not only that the
faculty member has a long-term position (the main condition of the rule), but also that he
or she does not plan to retire (the Except-When condition). The refined rule is not shown
to the expert, but it may be viewed with the Rule Browser.
[Figures 9.4 and 9.5 (labels): 1. Solving; 2. Critiquing: the reduction is rejected as incorrect because Dan Smith plans to retire (Figure 9.4) or because Jane Austin plans to move (Figure 9.5); 3. Refinement: the agent refines the rule with the negative example.]
Figure 9.5 shows another reasoning tree generated by the agent for an expert-
formulated hypothesis. Again the expert rejects one of the reasoning steps: Although
Jane Austin has a tenured position and does not plan to retire, she plans to move from
George Mason University and will not stay on the faculty for the duration of the dissertation
of Bob Sharp.
Based on this failure explanation, the agent specializes the rule that generated the
reduction by adding an additional Except-When plausible version space condition, as
shown in the right-hand side of Figure 9.5. From now on, the agent will check not only that
the faculty member has a long-term position, but also that he or she does not plan to retire
or move from the university.
The refined rule is shown in Figure 9.6. Notice that this is quite a complex rule that was learned based on only one positive example, two negative examples, and their explanations. The rule may be further refined based on additional examples.
The following sections describe in more detail the rule-learning and refinement processes. Before that, however, let us notice a significant difference between the development of a knowledge-based learning agent and the development of a (nonlearning)
knowledge-based agent. As discussed in Sections 1.6.3.1 and 3.1, after the knowledge base
of the (nonlearning) agent is developed by the knowledge engineer, the agent is tested
with various problems. The expert has to analyze the solutions generated by the agent, and
the knowledge engineer has to modify the rules manually to eliminate any identified
problems, testing the modified rules again.
In the case of a learning agent, both rule learning and rule refinement take place as part
of agent teaching. Testing of the agent is included into this process. This process will also
continue as part of knowledge base maintenance. If we would like to extend the agent
to solve new problems, we simply need to teach it more. Thus, in the case of a learning
agent, such as Disciple-EBR, there is no longer a distinction between knowledge base
development and knowledge base maintenance. This is very important because it is well
known that knowledge base maintenance (and system maintenance, in general) is much
more challenging and time consuming than knowledge base (system) development.
Thus knowledge base development and maintenance are less complex and much faster
in the case of a learning agent.
The rule-learning problem is defined in Table 9.1 and is illustrated in Figures 9.7 and 9.8.
The agent receives an example of a problem or hypothesis reduction and learns a plausible
version space rule that is an analogy-based generalization of the example. There is no
restriction with respect to what the example actually represents. However, it has to be
described as a problem or hypothesis that is reduced to one or several subproblems,
elementary hypotheses, or solutions. Therefore, this example may also be referred to as a
problem-solving episode. For instance, the example shown in the top part of Figure 9.8
reduces a specific hypothesis to its assessment or solution, guided by a question and
its answer.
The expert who is training the agent will interact with it to help it understand why the
example is a correct reduction. The understanding is done in the context of the agent’s
ontology, a fragment of which is shown in Figure 9.7.
The result of the rule-learning process is a general plausible version space rule that will
allow the agent to solve problems by analogy with the example from which the rule was
learned. The plausible version space rule learned from the example at the top of Figure 9.8
is shown at the bottom part of the figure. It is an IF-THEN structure that specifies the
conditions under which the problem from the IF part has the solution from the THEN part.
The rule is only partially learned because, instead of a single applicability condition, it has
two conditions:
GIVEN
  A knowledge base that includes an ontology and a set of (previously learned) rules
  An example of a problem reduction expressed with the concepts and instances from the agent’s knowledge base
  An expert who will interact with the agent to help it understand why the example is correct
DETERMINE
  A plausible version space rule, where the upper bound is a maximal generalization of the example, and the lower bound is a minimal generalization that does not contain any specific instance
  An extended ontology, if any extension is needed for the understanding of the example
A plausible upper bound condition that is a maximal generalization of the instances and
constants from the example (e.g., Bob Sharp, certain), in the context of the agent’s ontology
A plausible lower bound condition that is a minimal generalization that does not
contain any specific instance
The relationships among the variables ?O1, ?O2, and ?O3 are the same for both conditions
and are therefore shown only once in Figure 9.8, under the conditions.
Completely learning the rule means learning an exact condition, where the plausible
upper bound is identical with the plausible lower bound.
During rule learning, the agent might also extend the ontology with new features or
concepts, if they are needed for understanding the example.
An overview of the rule-learning method is presented in Figure 9.9 and in Table 9.2. As in
explanation-based learning (DeJong and Mooney, 1986; Mitchell et al., 1986), it consists of
two phases: an explanation phase and a generalization phase. However, in the explanation
phase the agent does not automatically build a deductive proof tree but an explanation
structure through mixed-initiative understanding. Also, the generalization is not a deductive one, but an analogy-based one.
In the following, we will describe this learning method in more detail and illustrate it.
First we will present the mixed-initiative process of explanation generation and example
understanding, which is part of the first phase. Then we will present and justify the
generalization method, which is based on analogical reasoning.
Because the agent’s ontology is incomplete, sometimes the explanation includes only
an approximate representation of the meaning of the question/answer (natural language)
sentences.
The next section presents the explanation generation method.
It is easier for an expert to understand sentences in the formal language of the agent
than it is to produce such formal sentences
It is easier for the agent to generate formal sentences than it is to understand sentences
in the natural language of the expert
In essence, the agent will use basic natural language processing, various heuristics,
analogical reasoning, and help from the expert in order to identify and propose a set of
plausible explanation pieces, ordered by their plausibility of being correct explanations.
Then the expert will select the correct ones from the generated list.
The left-hand side of Figure 9.11 shows an example to be understood, and the upper-
right part of Figure 9.11 shows all the instances and constants from the example. The agent
will look for plausible explanation pieces of the types from Table 9.3, involving those
instances and constants. The most plausible explanation pieces identified, in plausibility
order, are shown in the bottom-right of Figure 9.11. Notice that the two most plausible
explanation pieces from Figure 9.11 are the correct explanation pieces shown in Figure 9.10.
The expert will have to select each of them and click on the Accept button. As a result, the
agent will move them to the Explanations pane on the left side of Figure 9.11.
Notice in the upper-right of Figure 9.11 that all the objects and constants from the example
are selected. Consequently, the agent generates the most plausible explanation pieces
related to all these objects and displays those with the highest plausibility. The expert may
click on the See More button, asking the agent to display the next set of plausible explanations.
The expert may also deselect some of the objects and constants, asking the agent to
generate only plausible explanations involving the selected elements. For example,
Figure 9.12 illustrates a situation where only the constant “certain” is selected. As a result,
the agent generated only the explanation “The value is specifically certain,” which means
that this value should be kept as such (i.e., not generalized) in the learned rule.
The expert may also provide a new explanation, even using new instances, concepts, or
features. In such a case, the expert should first define the new elements in the ontology.
After that, the expert may guide the agent to generate the desired explanations.
If the example contains any generic instance, such as “Artificial Intelligence,” the agent
will automatically select the explanation piece “Artificial Intelligence is Artificial Intelligence”
(see Explanation pane on the left side of Figure 9.11), meaning that this instance will
appear as such in the learned rule. If the expert wants Artificial Intelligence to be generalized, he or she should simply remove that explanation by clicking on it and on the Remove
button at its right.
The expert may also define explanations involving functions and comparisons, as will
be discussed in Sections 9.12.3 and 9.12.4.
Notice, however, that the explanation of the example may still be incomplete for at least
three reasons:
The ontology of the agent may be incomplete, and therefore the agent may not be able
to propose all the explanation pieces of the example simply because they are not
present in the ontology
The agent shows the plausible explanation pieces incrementally, as guided by the
expert, and if one of the actual explanation pieces is not among the first ones shown, it
may not be seen and selected by the expert
It is often the case that the human expert forgets to provide explanations that corres-
pond to common-sense knowledge that also is not represented in the question/
answer pair
The incompleteness of the explanation is not, however, a significant problem because the
explanation may be further extended during the rule refinement process, as discussed
in Chapter 10.
To conclude, Table 9.4 summarizes the mixed-initiative explanation generation method.
Once the expert is satisfied with the identified explanation pieces, the agent will generate the
rule, as discussed in the following sections.
As indicated in Table 9.2 (p. 260), once the explanation of the example is found, the agent
generates a very specific IF-THEN rule with an applicability condition that covers only that
example. The top part of Figure 9.13 shows an example, and the bottom part shows the
generated specific rule that covers only that example. Notice that each instance (e.g., Bob
Sharp) and each constant (e.g., certain) is replaced with a variable (i.e., ?O1, ?SI1).
However, the applicability condition restricts the possible values of these variables to
those from the example (e.g., “?O1 is Bob Sharp”). The applicability condition also includes
the properties and the relationships from the explanation. Therefore, the rule from the
bottom of Figure 9.13 will cover only the example from the top of Figure 9.13. This rule will
be further generalized to the rule from Figure 9.8, which has a plausible upper bound
condition and a plausible lower bound condition, as discussed in the next section. In
particular, the plausible upper bound condition will be obtained as the maximal generalization of the specific condition in the context of the agent’s ontology. Similarly, the
plausible lower bound condition will be obtained as the minimal generalization of the
specific condition that does not contain any specific instance.
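The construction of the specific condition can be sketched as follows. The Python fragment below is an illustration only; the variable names, the triple representation of the explanation, and the helper function are assumptions rather than the actual Disciple-EBR implementation.

# Sketch of the variablization step: build the specific applicability condition from the
# instances and constants of the example and the relationships of its explanation.

example_entities = ["Bob Sharp", "John Doe", "Artificial Intelligence", "certain"]
explanation = [("Bob Sharp", "is interested in", "Artificial Intelligence"),
               ("John Doe", "is expert in", "Artificial Intelligence")]

def build_specific_condition(entities, explanation_pieces):
    # Each instance or constant is replaced with a distinct variable ...
    variables = {e: "?V" + str(i + 1) for i, e in enumerate(entities)}
    # ... each variable is restricted to its value from the example ...
    condition = [(variables[e], "is", e) for e in entities]
    # ... and the properties and relationships from the explanation are added.
    condition += [(variables[s], feature, variables.get(v, v))
                  for s, feature, v in explanation_pieces]
    return condition

for triple in build_specific_condition(example_entities, explanation):
    print(*triple)
# The resulting condition covers only the original example; its maximal and minimal
# generalizations then become the plausible upper and lower bounds of the learned rule.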
Let E be an example.
Repeat
  The expert focuses the agent’s attention by selecting some of the instances and constants from the example.
  The agent proposes what it determines to be the most plausible explanation pieces related to the selected entities, ordered by their plausibility.
  The expert chooses the relevant explanation pieces.
  The expert may ask for the generation of additional explanation pieces related to the selected instances and constants, may select different ones, or may directly specify explanation pieces.
[Figure (analogical reasoning):
  Initial example (which the explanation explains): I need to: Bob Sharp is interested in an area of expertise of John Doe. Therefore I conclude that: It is certain that Bob Sharp is interested in an area of expertise of John Doe.
  Similar example (does the explanation explain it?): I need to: Peter Jones is interested in an area of expertise of Dan Smith. Therefore I conclude that: It is certain that Peter Jones is interested in an area of expertise of Dan Smith.]
they are both less general than a given expression that represents the analogy criterion.
Consequently, the preceding question may be rephrased as:
Given the explanation EX of an example E, which generalization of EX should be
considered an analogy criterion, enabling the agent to generate reductions that are analogous to E?
There are two interesting answers to this question, one given by a cautious learner, and
the other given by an aggressive learner, as discussed in the next sections.
[Figure 9.16. Maximal generalization of the specific applicability condition.
Specific condition:
  ?O1  is Bob Sharp
       is interested in ?O3
  ?O2  is John Doe
       is expert in ?O3
  ?O3  is Artificial Intelligence
  ?SI1 is exactly certain
Most general generalization:
  ?O1  is person
       is interested in ?O3
  ?O2  is person
       is expert in ?O3
  ?O3  is area of expertise
  ?SI1 is in [certain – certain]
The generalization is based on the ontology fragment relating object, person, employee, student, faculty member, professor, PhD advisor, associate professor, PhD student, John Doe, Bob Sharp, Computer Science, and Artificial Intelligence, and on the features is expert in (domain person, range area of expertise) and is interested in (domain person, range area of expertise).]
Now consider John Doe. Its most general generalization is object ∩ domain(is expert in) = object ∩ person = person.
Consider now Artificial Intelligence. It appears as a value of the features “is interested in” and “is expert in.” Therefore, its maximal generalization is object ∩ range(is interested in) ∩ range(is expert in) = object ∩ area of expertise ∩ area of expertise = area of expertise.
On the other hand, the maximal generalization of “certain” is the interval with a single value “[certain – certain]” because ?SI1 is restricted to this value by the feature “is exactly.”
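The upper-bound computation illustrated above, which intersects object with the domains and ranges of the features in which an element appears, can be sketched as follows. The extensional concept sets and feature definitions below are hand-written assumptions, not the actual ontology.

# Sketch of the maximal generalization of an example element: intersect 'object' with
# the domains of the features for which it is a subject and the ranges of the features
# for which it is a value. Concepts are modeled extensionally for illustration only.

concepts = {
    "object": {"John Doe", "Bob Sharp", "Artificial Intelligence", "Computer Science"},
    "person": {"John Doe", "Bob Sharp"},
    "area of expertise": {"Artificial Intelligence", "Computer Science"},
}
features = {
    "is interested in": {"domain": "person", "range": "area of expertise"},
    "is expert in":     {"domain": "person", "range": "area of expertise"},
}

def maximal_generalization(subject_of, value_of):
    result = set(concepts["object"])
    for f in subject_of:
        result &= concepts[features[f]["domain"]]
    for f in value_of:
        result &= concepts[features[f]["range"]]
    # Name the concept whose extension matches the computed intersection.
    return [name for name, extension in concepts.items() if extension == result]

print(maximal_generalization(["is expert in"], []))                      # John Doe -> ['person']
print(maximal_generalization([], ["is interested in", "is expert in"]))  # -> ['area of expertise']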
Let us consider again the example from the top part of Figure 9.13, but now let us
assume that “The value is specifically certain” was not identified as an explanation piece.
That is, the explanation of the example consists only of the following pieces:
In this case, the generated specific condition is the one from the bottom of Figure 9.17, and
its maximal generalization is the one from the top of Figure 9.17. Notice that the maximal
generalization of “certain” is the entire interval [no support – certain] because there is no
restriction on the possible values of ?SI1.
Figure 9.17. Maximal generalization of a symbolic probability value when no explanation is identified.
[Figure (analogy criterion / minimal generalization): ?O1 is PhD student, is interested in ?O3; ?O2 is PhD advisor and associate professor, is expert in ?O3; ?O3 is Artificial Intelligence; the probability of the solution is always certain.]
Similarly, the specific instance John Doe is minimally generalized to PhD advisor or
associate professor, because these are the minimal generalizations of John Doe and neither
is more specific than the other. Additionally, both these concepts are subconcepts of
person, the domain of is expert in.
Because Artificial Intelligence is a generic instance, it can appear in the learned rule (as
opposed to the specific instances Bob Sharp and John Doe). Therefore, its minimal generalization is Artificial Intelligence itself. Similarly, the constants (such as certain) can also appear
in the learned rule, and they are kept as such in the minimal generalization.
Notice that if you want an instance to appear in the condition of a learned rule, it needs
to be defined as a generic instance. Specific instances are always generalized to concepts
and will never appear in the condition.
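The corresponding lower-bound computation can be sketched as follows; the direct-concept table and the set of generic instances below are illustrative assumptions rather than the actual agent data.

# Sketch of the minimal generalization of an example element: specific instances are
# replaced by their minimal generalizations (their direct concepts), while generic
# instances and constants are kept as such.

direct_concepts = {
    "John Doe":  ["PhD advisor", "associate professor"],
    "Bob Sharp": ["PhD student"],
}
generic_instances = {"Artificial Intelligence"}

def minimal_generalization(value):
    if value in generic_instances:
        return [value]                    # generic instances may appear in the learned rule
    if value in direct_concepts:
        return direct_concepts[value]     # specific instances never appear in the rule
    return [value]                        # constants (e.g., "certain") are kept as such

for value in ["John Doe", "Bob Sharp", "Artificial Intelligence", "certain"]:
    print(value, "->", minimal_generalization(value))
# John Doe -> ['PhD advisor', 'associate professor']: both are kept in the lower bound.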
The partially learned rule is shown in the bottom part of Figure 9.8 (p. 259). Notice that the
features are listed only once under the bounds because they are the same for both bounds.
The generated rule is analyzed to determine whether there are any variables in the
THEN part that are not linked to some variable from the IF part. If such an unlinked
variable exists, then it can be instantiated to any value, leading to solutions that make no
sense. Therefore, the agent will interact with the expert to find an additional explanation
that will create the missing link and update the rule accordingly.
The generated rule is also analyzed to determine whether it has too many instances in
the knowledge base, which is also an indication that its explanation is incomplete and
needs to be extended.
The rule learned from an example and its explanation depends on the ontology of the
agent at the time the rule was generated. If the ontology changes, the rule may need to be
updated, as will be discussed in Chapter 10. For example, the minimal generalization of a
specific instance will change if a new concept is inserted between that instance and the
concept above it. To enable the agent to update its rules automatically when relevant
changes occur in the ontology, minimal generalizations of the examples and their explanations are associated with the learned rules.
Why is the agent maintaining minimal generalizations of examples instead of the
examples themselves? Because the examples exist only in Scenario KBs, where the specific
instances are defined, while the rules are maintained in the Domain KB. If a scenario is no
longer available, the corresponding examples are no longer defined. However, generalized
examples (which do not contain specific instances) will always be defined in the Domain
KB. Thus the generalized examples represent a way to maintain a history of how a rule was
learned, independent of the scenarios. They are also a compact way of preserving this
history because one generalized example may correspond to many actual examples.
Figure 9.20 shows the minimal generalization of the example and its explanation from
which the rule in Figure 9.8 (p. 259) was learned.
One should notice that the minimal generalization of the example shown at the top part
of Figure 9.20 is not the same as the plausible lower bound condition of the learned rule
from Figure 9.8. Consider the specific instance John Doe from the example.
In the ontology, John Doe is both a direct instance of PhD advisor and of associate
professor (see Figure 9.16). In the lower bound condition of the rule, John Doe is generalized to PhD advisor or associate professor, indicated as (PhD advisor, associate professor),
because each of these two concepts is a minimal generalization of John Doe in the
ontology. Thus the agent maintains the two concepts as part of the lower bound of the
rule’s version space: one corresponding to PhD advisor, and the other corresponding to
[Figure 9.20]
Generalized Example
  ?O1  is PhD student
       is interested in ?O3
  ?O2  is PhD advisor
       is associate professor
       is expert in ?O3
  ?O3  is Artificial Intelligence
  ?SI1 is in [certain - certain]
  Covered positive examples: 1
  Covered negative examples: 0
obtained by minimal example generalization from the
Example and its explanation
  ?O1  is Bob Sharp
       is interested in ?O3
  ?O2  is John Doe
       is expert in ?O3
  ?O3  is Artificial Intelligence
  ?SI1 is exactly certain
associate professor. During further learning, the agent will choose one of these generalizations or a more general generalization that covers both of them.
In the minimal generalization of the example, John Doe is generalized to PhD advisor and associate professor because this is the best representation of the minimal generalization of the example that can be used to regenerate the rule, when changes are made to the
ontology. This minimal generalization is expressed as follows:
?O2 is PhD advisor
is associate professor
Initially, the generalized example shown at the top of Figure 9.20 covers only one
specific example. However, when a new (positive or negative) example is used to refine
the rule, the agent checks whether it is already covered by an existing generalized example
and records this information. Because a generalized example may cover any number of
specific positive and negative examples, its description also includes the number of
specific examples covered, as shown in the top part of Figure 9.20.
Cases of using generalized examples to regenerate previously learned rules are presented in Section 10.2.
In addition to learning a general reduction rule from a specific reduction example, Disciple-
EBR also learns general hypotheses (or problems). The left-hand side of Figure 9.21 shows
a specific hypothesis reduction from which a reduction rule and four general hypotheses are learned.
[Figure 9.21. A reduction rule and four hypotheses learned from a specific hypothesis reduction.]
The hypothesis learning method, shown in Table 9.5, is very similar to the rule-learning
method.
Figure 9.22 illustrates the automatic learning of a general hypothesis from the specific
hypothesis, “John Doe would be a good PhD advisor for Bob Sharp,” when no explanation is
provided.
The specific instances, John Doe and Bob Sharp, are replaced with the variables ?O1 and
?O2, respectively, as in the reduction rule.
The lower bounds of these variables are obtained as the minimal generalizations of
John Doe and Bob Sharp, according to the agent’s ontology from the left-hand side of
Figure 9.22, because both of them are specific instances. Notice that there are two minimal
generalizations of John Doe: PhD advisor and associate professor. The minimal generalization of Bob Sharp is PhD student.
The upper bounds are obtained as the maximal generalizations of John Doe and
Bob Sharp, according to the agent’s ontology from the left-hand side of Figure 9.22. They
are both object.
During the explanation generation process, the user may wish to restrict the generalization of the hypothesis by providing the following explanations:
[Figure 9.22 (bounds of the learned hypothesis); the left-hand side of the figure shows the ontology fragment with object, actor, PhD advisor, associate professor, and PhD student.
Most specific generalization (lower bound):
  ?O1 is (PhD advisor, associate professor)
  ?O2 is PhD student
Most general generalization (upper bound):
  ?O1 is object
  ?O2 is object]
In this case, the lower bound condition of the learned hypothesis remains the same, but
the upper bound condition becomes:
Hypothesis Pattern or the Learn Tree Patterns commands that were introduced in
Section 4.10. Finally, it may be learned by specifically invoking hypothesis (problem)
learning when working with the Mixed-Initiative Reasoner. But it is only this last situation
that also allows the definitions of explanations, as will be discussed later in this section. In
all the other situations, a hypothesis is automatically learned, with no explanations, as was
illustrated in Figure 9.22.
The overall user–agent interactions during the hypothesis explanation process are
illustrated in Figure 9.23 and described in Operation 9.1. Here it is assumed that the
reasoning tree was already formalized and thus a hypothesis pattern was already learned.
If it was not learned, the pattern will be automatically learned before the explanations are
identified.
This case study will guide you to use Disciple-EBR to learn rules and hypotheses from
examples. More specifically, you will learn how to:
The overall user–agent interactions during the rule- (and hypotheses-) learning process
are illustrated in Figure 9.24 and described in Operation 9.2. It is assumed that the
[Figure 9.23. Overview of the user–agent interactions during the hypothesis explanation process: 1. in “Reasoning Hierarchy” (a) select the hypothesis; 2. click on a hypothesis to select it; 3. click on “Modify Explanations”; 4. in “Reasoning Step” (b) accept explanation pieces; 5. select a relevant explanation piece; 6. click on “Accept”.]
[Figure 9.24. Overview of the user–agent interactions during rule (and hypotheses) learning: 1. in “Reasoning Hierarchy” (a) select the reasoning step; 2. click on the question/answer to select the reasoning step; 3. click on “Learn Condition”; 4. in “Reasoning Step” (b) accept explanation pieces; 5. if necessary, select and remove any automatically selected explanation piece; 6. select a relevant explanation piece; 7. click on “Accept”.]
reasoning tree is formalized. If not, one can easily formalize it in the Evidence workspace
before invoking rule learning by simply right-clicking on the top node and selecting
Learn Tree.
You are now ready to perform a rule-learning case study. There are two of them, a shorter
one and a longer one. In the shorter case study, you will guide the agent to learn the rule
from Figure 9.8, as discussed in the previous sections. In the longer case study, you will
guide the agent to learn several rules, including the rule from Figure 9.8.
Start Disciple-EBR, select one of the case study knowledge bases (either “11-Rule-Learning-
short/Scen” or “11-Rule-Learning/Scen”), and proceed as indicated in the instructions at the
bottom of the opened window.
A learned rule can be displayed as indicated in the following operation.
At the top of the Rule Viewer, notice the name of the rule (e.g., DDR.00018). You may
also display or delete this rule with the Rule Browser, as described in Operations 10.4
and 10.5.
Click on the X button of the Rule Viewer to close it.
Select one or several entities from the “Elements to search for” pane, such as certain in
Figure 9.25. You may also need to deselect some entities by clicking on them.
Click on the Search button, asking the agent to generate explanation pieces related to
the selected entities.
Select an explanation piece and click on the Accept button.
Click on the See More button to see more of the generated explanation pieces.
Repeat the preceding steps until all the desired explanations are generated and selected.
Notice that these types of explanation pieces are generated when only constants or generic
instances are selected in the “Elements to search for” pane. For example, only “certain”
was selected in the “Elements to search for” pane in the upper-right part of Figure 9.25,
and therefore the potential explanation piece “The value is specifically certain” was
generated.
Notice also that explanations such as “Artificial Intelligence is Artificial Intelligence” can be
generated only for generic instances. They cannot be generated for specific instances
because these instances are always generalized in the learned rules.
using relationships from the agent’s ontology. They are the following ones, and are shown
also in the left-hand side of Figure 9.27:
Additionally, you have to teach the agent how the price is actually computed. You invoke
the Expression Editor by clicking on the Edit Expression button, which displays a pane to
define the expression (see the bottom right of Figure 9.27). Then you fill in the left side of
the equality with the price, and the right side with the expression that leads to this price, by
using the numbers from the example:
Figure 9.28. Learned rule with a learned function in the applicability condition.
Additionally, you have to indicate that 650.35 is greater than 519.75. You click on the
Create New… button, which opens a window allowing you to define a new explanation as
an object-feature-value triplet (see the bottom of Figure 9.29). In the left editor, you
start typing the amount of money Mike has (i.e., 650.35) and select it from the completion
pop-up. In the center editor, you type >=. Then, in the right editor, you start typing the
actual cost (519.75) and select it from the completion pop-up. Finally, you click on the OK
button in the Create explanation window to select this explanation:
In the center editor, type the comparison operator (<, <=, =, !=, >=, or >).
In the right editor, type the number corresponding to the right side of the comparison.
Click on the OK button in the Create explanation window to accept the explanation.
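The way such a comparison explanation piece acts in the learned rule can be sketched as follows. This Python fragment is an assumption-laden illustration: the triple representation of the comparison and the bindings dictionary are not the actual Disciple-EBR structures.

# Sketch: a comparison explanation such as "650.35 >= 519.75" is generalized to a
# comparison between the corresponding variables (e.g., ?N1 >= ?N2), which is checked
# for every instantiation of the rule.

import operator

OPERATORS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
             "!=": operator.ne, ">=": operator.ge, ">": operator.gt}

def comparison_holds(comparison, bindings):
    left, op, right = comparison
    return OPERATORS[op](bindings[left], bindings[right])

learned_comparison = ("?N1", ">=", "?N2")   # generalization of "650.35 >= 519.75"

print(comparison_holds(learned_comparison, {"?N1": 650.35, "?N2": 519.75}))  # True
print(comparison_holds(learned_comparison, {"?N1": 400.00, "?N2": 519.75}))  # False: rule not applicable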
Guideline 9.1. Properly identify all the entities in the example before
starting rule learning
Before starting rule learning, make sure that all the elements are properly recognized as
instances, numbers, symbolic intervals, or strings. This is important because only the
entities with one of these types will be replaced with variables as part of rule learning, as
shown in the top part of Figure 9.31. Recognizing concepts is also recommended, but it is
optional, since concepts are not generalized. However, recognizing them helps the agent
in explanation generation.
Notice the case from the middle part of Figure 9.31. Because “the United States”
appears as text (in black) and not as instance (in blue), it will not be replaced with a
variable in the learned rule. A similar case is shown at the bottom of Figure 9.31. Because
600.0 is not recognized as number (in green), it will appear as such in the learned rule,
instead of being generalized to a variable.
Guideline 9.2. Avoid learning from examples that are too specific
It is important to teach the agent with good examples from which it can learn general
rules. A poor example is illustrated in the upper-left part of Figure 9.32. In this case, the
amount of money that Mike has is the same as the price of the Apple iPad 16GB. As a result,
both occurrences of 519.75 are generalized to the same variable ?N1, and the agent will
learn a rule that will apply only to cases where the amount of money of the buyer is exactly
the same as the price of the product (see the upper-right part of Figure 9.32).
You need instead to teach the agent with an example where the numbers are different,
such as the one from the bottom-left part of Figure 9.32. In this case, the agent will
generalize the two numbers to two different variables. Notice that the learned rule will
also apply to cases where ?N1 = ?N2.
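The reason the first example is too specific can be seen from a minimal sketch of number variablization; the function below is hypothetical and only illustrates the behavior described in this guideline.

# Sketch: equal values in an example are mapped to the same variable, so an example in
# which the amount of money and the price coincide produces an over-specific rule.

def variablize_numbers(values):
    mapping, variables = {}, []
    for v in values:
        if v not in mapping:
            mapping[v] = "?N" + str(len(mapping) + 1)
        variables.append(mapping[v])
    return variables

print(variablize_numbers([519.75, 519.75]))   # ['?N1', '?N1']: money must equal price
print(variablize_numbers([620.25, 519.75]))   # ['?N1', '?N2']: the intended general rule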
Therefore, before starting rule learning, review the modeling and check that you have
defined the features suggested by the Q/A pair in the ontology.
What are the other explanation pieces for this reduction step? You will need to define
two additional explanation pieces involving comparisons, as well as explanation pieces
fixing the values 41, 53, and very likely, as shown in the left-hand side of Figure 9.34. The
learned rule is shown in the right-hand side of Figure 9.34.
[Figure 9.32 (fragment).
Top: Does Mike have enough money to buy Apple iPad 16GB? Yes, because Mike has 519.75 dollars and Apple iPad 16GB costs 519.75 dollars.
  Learned: Does ?O1 have enough money to buy ?O2? Yes, because ?O1 has ?N1 dollars and ?O2 costs ?N1 dollars.
Bottom: Does Bob have enough money to buy Apple iPad 16GB? Yes, because Bob has 620.25 dollars and Apple iPad 16GB costs 519.75 dollars.
  Learned: Does ?O1 have enough money to buy ?O2? Yes, because ?O1 has ?N1 dollars and ?O2 costs ?N2 dollars.]
When you define a new feature, make sure you define both the domain and the range.
Do not leave the ones generated automatically. The automatically generated range (Any
Element) is too general, and features with that range are not even used in explanations.
Therefore, make sure that you select a more specific domain and range, such as a concept,
a number interval, or a symbolic interval.
If you have already defined facts involving that feature, you need to remove them first
before you can change the domain and the range of the feature.
Learn rules from the reasoning trees developed in the previous project assignments.
9.3. Consider the following expression, where both Jane Austin and Bob Sharp are
specific instances:
Find its minimal generalization that does not contain any instance, in the context
of the ontological knowledge from Figure 9.35. Find also its maximal generalization.
9.4. Consider the following explanation of a reduction:
9.6. Consider the ontological knowledge from Figure 9.37, where Dana Jones, Rutgers
University, and Indiana University are specific instances.
(a) What are the minimal generalization and the maximal generalization of the
following expression?
[Figure 9.35. Ontology fragment: object has the subconcepts actor and research area; actor has the subconcepts organization and person; person has the subconcepts employee and student; employee has the subconcepts faculty member and staff member; faculty member has the subconcepts instructor, professor, and PhD advisor; student has the subconcepts graduate student (with graduate research assistant and graduate teaching assistant) and undergraduate student (with BS student); the instances are John Smith, John Doe, Jane Austin, Bob Sharp, and Joan Dean; the feature has as advisor has domain student and range faculty member.]
[Ontology fragment: faculty member has the subconcepts professor and PhD advisor; university has the instances George Mason University and Indiana University.]
[Figure 9.37. Ontology fragment: faculty member has the subconcepts professor and PhD advisor, with the instance Dana Jones; university has the instances Rutgers University and Indiana University.]
(b) What are the minimal generalization and the maximal generalization of the
following expression?
9.7. Consider the example problem reduction and its explanation from Figure 9.38.
Which is the specific rule condition covering only this example? What rule will be
learned from this example and its explanation, assuming the ontology fragment
from Figure 9.39? What general problem will be learned from the specific IF
problem of this reduction?
Notice that some of the instances are specific (e.g., Aum Shinrikyo and Masami
Tsuchiya), while others are generic (e.g., chemistry).
9.8. Consider the problem reduction example and its explanation from Figure 9.40.
Which is the specific rule covering only this example? What rule will be learned
from this example and its explanation, assuming the ontology fragment from
Figure 9.41? What general problems will be learned from this example? Assume
that all the instances are specific instances.
has as member
domain organization
range person
[Ontology fragment for the preceding review question: the concepts object, actor, person, author, terrorist, source, evidence, item of evidence, testimonial evidence, direct testimonial evidence, testimonial evidence obtained at second hand, testimonial evidence based on direct observation, non-elementary piece of evidence, and elementary piece of evidence; the instances Hamid Mir, Osama bin Laden, EVD-Dawn-Mir-01-01, and EVD-Dawn-Mir-01-01c; and the features is testimony by (domain evidence, range source) and is testimony about (domain evidence, range evidence).]
9.9. Compare the rule-learning process with the traditional knowledge acquisition
approach, where a knowledge engineer defines such a rule by interacting with
a subject matter expert. Identify as many similarities and differences as possible,
and justify the relative strengths and weaknesses of the two approaches, but be as
concise as possible.
10 Rule Refinement
Regardless of the origin of the example, the goal of the agent is to refine the rule to be
consistent with the example. A possible effect of rule refinement is the extension of the
ontology.
The rule refinement problem is defined in Table 10.1 and an overview of the rule
refinement method is presented in the next section.
GIVEN
A plausible version space reduction rule
A positive or a negative example of the rule (i.e., a correct or an incorrect reduction)
A knowledge base that includes an ontology and a set of (previously learned) reduction rules
An expert who will interact with the agent, helping it understand why the example is positive
(correct) or negative (incorrect)
DETERMINE
An improved rule that covers the example if it is positive, and does not cover the example if it is
negative
An extended ontology, if this is needed for rule refinement
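The structures manipulated by this problem can be sketched as follows in Python; the class and field names are illustrative assumptions, not Disciple-EBR's actual implementation.

# A minimal sketch of the structures involved in rule refinement; the class
# and field names are illustrative, not Disciple-EBR's actual implementation.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Condition:
    """One bound of a plausible version space: variable -> list of concepts."""
    classes: Dict[str, List[str]]               # e.g., {"?O1": ["PhD advisor"]}

@dataclass
class PlausibleVersionSpace:
    upper: Condition                             # plausible upper bound
    lower: Condition                             # plausible lower bound

@dataclass
class PVSRule:
    main: PlausibleVersionSpace                  # main applicability condition
    except_when: List[PlausibleVersionSpace] = field(default_factory=list)
    generalized_examples: List[dict] = field(default_factory=list)
    positive_exceptions: List[dict] = field(default_factory=list)
    negative_exceptions: List[dict] = field(default_factory=list)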
Figure 10.1. Multistrategy rule refinement. The rule (IF we have to solve <Problem> THEN solve <Subproblem 1> ... <Subproblem m>), with a Main PVS Condition and an Except-When PVS Condition, is refined through learning by analogy and experimentation, learning from examples, and learning from explanations: examples of problem reductions generated by the agent from the knowledge base are judged correct or incorrect, and failure explanations of the incorrect examples are used to refine the rule's conditions.
version space condition may be learned, starting from that negative example and its failure
explanation. This plausible version space Except-When condition is represented by the red
ellipses at the top of Figure 10.1.
The refined rule is shown in the right-hand side of Figure 10.1. The applicability
condition of a partially learned rule consists of a main applicability condition and zero,
one, or more Except-When conditions. The way the rule is refined based on a new
example depends on the type of the example (i.e., positive or negative), on its position
with respect to the current conditions of the rule, and on the type of the explanation of
the example (if identified). The refinement strategies will be discussed in more detail in
the next sections by considering a rule with a main condition and an Except-When
condition, as shown in Figure 10.2. We will consider all nine possible positions of the
example with respect to the bounds of these conditions. Notice that the presented
methods apply similarly when there is no Except-When condition or when there is more
than one Except-When condition.
We will first illustrate rule refinement with a positive example and then we will present
the general method.
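Before turning to the illustrations, the position of a new example with respect to these bounds can be sketched as follows; the covers test and the data shapes are assumed for illustration, and the dispatch to a particular refinement strategy is discussed in the remainder of this section.

# Sketch: classify a new example by its position with respect to the bounds of
# the main condition (ML within MU) and of one Except-When condition (XL
# within XU). `covers(bound, example)` is an assumed test of whether a bound
# covers the example; the four flags identify one of the nine regions of
# Figure 10.2.

def position(example, covers, ML, MU, XL, XU):
    return {
        "in_ML": covers(ML, example),
        "in_MU": covers(MU, example),
        "in_XL": covers(XL, example),
        "in_XU": covers(XU, example),
    }

def plausibly_applies(pos):
    """One plausible reading: the rule applies when the main condition is
    satisfied (at least its upper bound) and no Except-When condition is."""
    return pos["in_MU"] and not pos["in_XU"]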
Figure 10.2. Partially learned condition and various positions of a new example. (ML: the plausible lower bound of the Main Condition; XL: the plausible lower bound of the Except-When Condition; the numbered positions 1 through 9 in the universe of instances mark the possible locations of a new example.)
Positive example that satisfies the upper bound but not the lower bound
Figure 10.4. Minimal generalization of the rule’s plausible lower bound condition.
Let R be a plausible version space rule, U its main plausible upper bound condition, L its main plausible
lower bound condition, and P a positive example of R covered by U and not covered by L.
end
Determine Pg, the minimal generalization of the example P (see Section 9.9).
Return the generalized rule R with the updated conditions U and L, and Pg in the list of
generalized examples of R.
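The core of this operation can be sketched in Python as follows, assuming a toy single-parent generalization hierarchy; the real operation works over the agent's ontology and also handles features, numbers, and multiple parents.

# Sketch of minimal generalization of a plausible lower bound so that it
# covers a new positive example. The ontology is a toy "subconcept of"
# relation; names are illustrative only.

PARENTS = {                        # child -> parent in the generalization hierarchy
    "assistant professor": "professor",
    "associate professor": "professor",
    "full professor": "professor",
    "professor": "faculty member",
    "PhD advisor": "faculty member",
    "faculty member": "person",
}

def ancestors(concept):
    """The concept plus all its generalizations, most specific first."""
    chain = [concept]
    while concept in PARENTS:
        concept = PARENTS[concept]
        chain.append(concept)
    return chain

def minimal_generalization(c1, c2):
    """Least general concept covering both c1 and c2 (single-parent case)."""
    a1 = ancestors(c1)
    for c in ancestors(c2):
        if c in a1:
            return c
    return "object"                # top of the hierarchy

def generalize_lower_bound(lower, example):
    """lower, example: dicts mapping a variable (e.g., '?O2') to a concept."""
    return {var: minimal_generalization(lower[var], example[var]) for var in lower}

# e.g., generalize_lower_bound({"?O2": "associate professor"},
#                              {"?O2": "full professor"})
# returns {"?O2": "professor"}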
problem, however, is trivial in Disciple-EBR because both the plausible lower bound
condition and the condition corresponding to the example have exactly the same struc-
ture, and the corresponding variables have the same names, as shown in Figure 10.5. This
is a direct consequence of the fact that the example is generated from the plausible upper
bound condition of the rule.
Based on this failure explanation, the agent generates an Except-When plausible version
space condition by applying the method described in Sections 9.6 and 9.7. First it reformu-
lates the explanation as a specific condition by using the corresponding variables from the
rule, or by generating new variables (see also the bottom-left part of Figure 10.7):
Then the agent generates a plausible version space by determining maximal and minimal
generalizations of the preceding condition. Finally, the agent adds it to the rule as an
Except-When plausible version space condition, as shown in the bottom-right part of
Figure 10.7. The Except-When condition should not be satisfied to apply the rule. Thus, in
order to conclude that a professor will stay on the faculty for the duration of the disserta-
tion of a student, the professor should have a long-term position (the main condition) and
it should not be the case that the professor plans to retire from the university (the Except-
When condition).
Figure 10.8 shows the further refinement of the rule with an additional negative
example. This example satisfies the rule in Figure 10.7. Indeed, Jane Austin has a long-term
position and she does not plan to retire from George Mason University. Nevertheless, the
expert rejects the reasoning represented by this example because Jane Austin plans to
move to Indiana University. Therefore, she will not stay on the faculty of George Mason
University for the duration of the dissertation of Bob Sharp.
2. If E is covered by MU, is not covered by ML, and is not covered by XU (case 2), then minimally generalize ML to cover E and remain less general than MU.
3. If E is not covered by MU (cases 3 and 5), or if E is covered by XL (cases 5, 6, and 7), then keep E as a positive exception.
.011
(Figure fragment: the negative example generated by the rule, the rule that generated it, and the failure explanation: Dan Smith plans to retire from George Mason University.)
Failure explanation: Dan Smith plans to retire from George Mason University.
Rewritten as a specific Except-When condition:
?O1 is Dan Smith
  plans to retire from ?O2
?O2 is George Mason University
Failure explanation: Jane Austin plans to move to Indiana University.
Rewritten as a specific Except-When condition:
?O1 is Jane Austin
  plans to move to ?O5
?O5 is Indiana University
Notice that the agent has introduced a new variable ?O5 because Indiana University does
not correspond to any entity from the previous form of the rule (as opposed to Jane Austin
who corresponds to ?O1).
Then the agent generates a plausible version space by determining maximal and
minimal generalizations of the preceding condition. Finally, the agent adds it to the rule
as an additional Except-When plausible version space condition, as shown at the bottom-
right part of Figure 10.8.
Then the agent developed a partial reasoning tree, but it was unable to assess one of the
subhypotheses:
Jill Knox will stay on the faculty of George Mason University for the duration of the
dissertation of Peter Jones.
Let R be a plausible version space rule, N an instance of R rejected by the expert as an incorrect
reasoning step (a negative example of R), and EX an explanation of why N is incorrect (a failure
explanation).
(1) Reformulation of the Failure Explanation
Generate a new variable for each instance and each constant (i.e., number, string, or symbolic
probability) that appears in the failure explanation EX but does not appear in the negative
example N. Use the new variables and the rule’s variables to reformulate the failure explanation EX
as an instance I of the concept EC representing an Except-When condition of the rule R.
(2) Analogy-based Generalizations of the Failure Explanation
Generate the plausible upper bound XU of the concept EC as the maximal generalization of I in the
context of the agent’s ontology.
Generate the plausible lower bound LU of the concept EC as the minimal generalization of I that
does not contain any specific instance.
(3) Rule Refinement with an Except-When Plausible Version Space Condition
Add an Except-When plausible version space condition (XU, LU) to the existing conditions of the
rule R. This condition should not be satisfied for the rule to be applicable in a given situation.
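Step (1) of this method can be sketched as follows in Python; the triple representation of the explanation and the helper names are assumptions made for illustration.

# Sketch of reformulating a failure explanation such as
# ("Jane Austin", "plans to move to", "Indiana University")
# as a specific Except-When condition, reusing the rule's variable bindings
# (e.g., Jane Austin -> ?O1) and generating new variables for entities that
# do not appear in the negative example (e.g., Indiana University -> ?O5).

def reformulate_failure_explanation(explanation, bindings):
    """explanation: list of (subject, feature, value) triples;
    bindings: entity -> rule variable, taken from the negative example."""
    bindings = dict(bindings)
    next_index = len(bindings) + 1
    condition = []
    for subject, feature, value in explanation:
        for entity in (subject, value):
            if entity not in bindings:
                bindings[entity] = f"?O{next_index}"     # new variable
                next_index += 1
        condition.append((bindings[subject], feature, bindings[value]))
    return condition, bindings

cond, binds = reformulate_failure_explanation(
    [("Jane Austin", "plans to move to", "Indiana University")],
    {"Jane Austin": "?O1", "George Mason University": "?O2",
     "Bob Sharp": "?O3", "tenured position": "?O4"})
# cond is [("?O1", "plans to move to", "?O5")]; binds["Indiana University"] == "?O5"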
Therefore, the expert defined the reduction of this hypothesis, which includes its assess-
ment, as shown at the bottom of Figure 10.9.
Based on this example, the agent learned a general rule, as illustrated in the right-hand
side of Figure 10.9 and as discussed in Chapter 9. The rule is shown in Figure 10.10.
(Figure annotations: 1. Solving: the agent applies learned rules to solve new problems; 2. Modeling; 3. Learning: the agent learns a new rule.)
This and the other learned rules enabled the agent to develop the reasoning tree from
Figure 10.11 for assessing the following hypothesis:
However, the expert rejected the bottom reasoning step as incorrect. Indeed, the correct
answer to the question, "Is Bill Bones likely to stay on the faculty of George Mason University
for the duration of the PhD dissertation of June Allison?” is, “No,” not, “Yes,” because there
is no support for Bill Bones getting tenure.
The user–agent interaction during example understanding is illustrated in Figure 10.12.
The agent identified an entity in the example (the symbolic probability “no support”) that
would enable it to specialize the upper bound of the main condition of the rule to no
longer cover the negative example. Therefore, it proposed the failure explanation shown at
the right-hand side of Figure 10.12:
The expert accepted this explanation by clicking on OK, and the rule was automatically
specialized as indicated in Figure 10.13. More precisely, the upper bound of the main
condition for the variable ?Sl1 was minimally specialized from the interval [no support –
certain] to the interval [likely – certain], in order to no longer cover the value no support,
while continuing to cover the interval representing the lower bound, which is [almost
certain – almost certain].
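This minimal specialization of an interval over the ordered scale of symbolic probabilities can be sketched as follows; the particular scale used below is an assumption for illustration.

# Sketch of minimally specializing an interval over an ordered scale of
# symbolic probabilities so that it no longer covers a blamed value while
# still covering the lower bound interval. The scale below is assumed.

SCALE = ["no support", "likely", "very likely", "almost certain", "certain"]

def specialize_interval(upper, lower, excluded):
    """upper, lower: (min, max) intervals over SCALE; excluded: blamed value."""
    lo, hi = (SCALE.index(v) for v in upper)
    keep_lo, keep_hi = (SCALE.index(v) for v in lower)
    ex = SCALE.index(excluded)
    if ex < keep_lo:                 # cut from below, keep covering the lower bound
        lo = max(lo, ex + 1)
    elif ex > keep_hi:               # cut from above
        hi = min(hi, ex - 1)
    else:
        raise ValueError("cannot exclude a value covered by the lower bound")
    return (SCALE[lo], SCALE[hi])

# e.g., specialize_interval(("no support", "certain"),
#                           ("almost certain", "almost certain"),
#                           "no support")
# returns ("likely", "certain")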
(Figure annotations: 3. Solving: the agent applies learned rules to solve new problems; 4. Critiquing: the reasoning step is incorrect because of the "no support" probability; 5. Refinement: the agent refines the rule with the negative example.)
Figure 10.13. Specialization of the upper bound of a plausible version space condition (the failure explanation leads to a minimal specialization that yields the specialized rule).
Let R be a plausible version space rule, U the plausible upper bound of the main condition, L the
plausible lower bound of the main condition, N a negative example covered by U and not covered
by L, and C an entity from N that is blamed for the failure.
1. Let ?X be the variable from the rule’s conditions that corresponds to the blamed
entity C.
Let UX and LX be the classes of ?X in the two bounds.
If each concept from LX covers C
then Continue with step 2.
else Continue with step 3.
2. The rule cannot be specialized to uncover the current negative example.
The negative example N is associated with the rule as a negative exception.
Return the rule R.
3. There are concepts in LX that do not cover C. The rule can be specialized to uncover
N by specializing UX, which is known to be more general than C.
3.1. Remove from LX any element that covers C.
3.2. Repeat for each element ui of UX that covers C
Remove ui from UX.
Add to UX all minimal specializations of ui that do not cover C and are more general than
or at least as general as a concept from LX.
Remove from UX all the concepts that are less general than or as general as other
concepts from UX.
end
4. Return the specialized rule R.
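A simplified Python rendering of this method, for a single blamed variable ?X, is sketched below; the ontology services it calls (covers, minimal_specializations, and the generality tests) are assumed.

# Sketch of the method above for one blamed variable ?X: specialize the
# classes UX in the upper bound so that they no longer cover the blamed
# entity C, while staying at least as general as some class in the lower
# bound LX. `ont` is an assumed ontology service.

def specialize_upper_bound(UX, LX, C, ont):
    """Returns (new_UX, new_LX), or None if the rule cannot be specialized
    (the negative example then becomes a negative exception)."""
    if all(ont.covers(concept, C) for concept in LX):
        return None                                           # step 2
    new_LX = [l for l in LX if not ont.covers(l, C)]          # step 3.1
    candidates = []
    for u in UX:                                              # step 3.2
        if not ont.covers(u, C):
            candidates.append(u)
            continue
        for s in ont.minimal_specializations(u, C):           # do not cover C
            if any(ont.more_general_or_equal(s, l) for l in new_LX):
                candidates.append(s)
    # keep only the maximally general candidates
    new_UX = [u for u in set(candidates)
              if not any(v != u and ont.strictly_more_general(v, u)
                         for v in set(candidates))]
    return new_UX, new_LX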
exception.
... need to be refined because the example is correctly classified as negative by the current rule. If N is not covered by MU, is not covered by XL, and is covered by XU (case 4), then minimally generalize XL to cover N and remain less general than XU.
4. If N is covered by ML and by XU, but it is not covered by XL (case 8), or N is covered by MU and by XU, but ...
and their explanations in the context of the updated ontology. This is, in fact, the reason
why the generalized examples are maintained with each rule, as discussed in Section 9.9.
The rule regeneration problem is presented in Table 10.7. Notice that not all the
changes of an ontology lead to changes in the previously learned rules. For example,
adding a new concept that has no instance, or adding a new instance, will not affect the
previously learned rules. Also, renaming a concept or a feature in the ontology automatic-
ally renames it in the learned rules, and no additional adaptation is necessary.
the ontology a version, and each time a rule is learned or refined, it associates the version
of the ontology with the rule. The version of the ontology is incremented each time a
significant change – that is, a change that may affect the conditions of the previously
learned rules – is made. Then, before using a rule in problem solving, the agent checks
the rule’s ontology version with the current version of the ontology. If the versions are
the same, the rule is up to date and can be used. Otherwise, the agent regenerates the
rule based on the current ontology and also updates the rule’s ontology version to the
current version of the ontology. The on-demand rule regeneration method is presented
in Table 10.8.
GIVEN
A plausible version space reduction rule R corresponding to a version v of the ontology
Minimal generalizations of the examples and explanations from which the rule R was learned, in
the context of the version v of the ontology
An updated ontology with a new version v’
DETERMINE
An updated rule that corresponds to the same generalized examples, but in the context of the
new version v’ of the ontology
Updated minimal generalizations of the specific examples from the current scenario, if any
Figure 10.14 (fragment): the updated ontology, including concepts such as object, expert, employee, student, educational organization, faculty position, and Computer Science.
We will first illustrate the regeneration of the rules presented in the previous sections
and then provide the general regeneration method.
Let R be a plausible version space rule, and O the current ontology with version v.
If R’s ontology version is v, the same as the version of the current ontology O
then Return R (no regeneration is needed).
else Regenerate rule R (see Table 10.9).
Set R’s ontology version to v.
Return R
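In outline, this on-demand regeneration amounts to a version check, as in the following sketch; the names are assumptions, and regenerate stands for the recomputation of the rule's bounds from its generalized examples described in Table 10.9.

# Sketch of on-demand rule regeneration driven by an ontology version number.
from dataclasses import dataclass, field

@dataclass
class Ontology:
    version: int = 0
    def significant_change(self):
        """Increment on changes that may affect previously learned rules."""
        self.version += 1

@dataclass
class Rule:
    generalized_examples: list = field(default_factory=list)
    ontology_version: int = 0

def rule_for_problem_solving(rule, ontology, regenerate):
    """Regenerate the rule only if the ontology changed since it was learned."""
    if rule.ontology_version != ontology.version:
        regenerate(rule, ontology)    # recompute bounds from generalized examples
        rule.ontology_version = ontology.version
    return rule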
Figure 10.15. The updated conditions of the rule from Figure 10.5 in the context of the ontology from Figure 10.14 (the new bounds are obtained as the minimal and maximal generalizations of the rule's generalized examples).
in Section 9.9. The top part of Figure 10.15 shows the updated bounds of the rule, in the
context of the updated ontology from Figure 10.14. The new plausible lower bound condi-
tion is the minimal generalization of the generalized examples, in the context of the updated
ontology. Similarly, the new plausible upper bound condition is the maximal generalization
of the generalized examples in the context of the updated ontology.
Notice that in the updated plausible lower bound condition (shown in the upper-left
part of Figure 10.15), ?O2 is now a professor, instead of a professor or PhD advisor. Indeed,
note the following expression from the first generalized example shown in the lower-left
part of Figure 10.15:
Based on the updated ontology, where associate professor is a subconcept of PhD advisor,
this expression is now equivalent to the following:
Similarly, note the following expression from the second generalized example shown in
the lower-right part of Figure 10.15:
Because full professor is now a subconcept of PhD advisor, this expression is now equivalent
to the following:
Then the minimal generalization of the expressions [10.2] and [10.4] is the following
expression because professor is the minimal generalization of associate professor and full
professor:
Also, in the updated plausible upper bound condition (shown in the upper-right part of
Figure 10.15), ?O2 is an expert instead of a person, because this is the maximal generaliza-
tion of associate professor and full professor, which is included into the domain of the is
expert in feature of ?O2, which is expert.
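The recomputation of the two bounds for ?O2 can be sketched as follows; the single-parent toy hierarchy and the placement of expert in it are assumptions chosen only to reproduce the behavior described above.

# Sketch: the maximal generalization of a variable's example concepts,
# constrained to stay inside the domain of a feature that the variable has
# in the rule (here, the domain "expert" of an "is expert in" feature).
# The single-parent toy hierarchy below is an assumption for illustration.

PARENTS = {
    "associate professor": "professor",
    "full professor": "professor",
    "professor": "faculty member",
    "faculty member": "expert",
    "expert": "person",
    "person": "object",
}

def chain(concept):
    """The concept and all its generalizations, most specific first."""
    out = [concept]
    while concept in PARENTS:
        concept = PARENTS[concept]
        out.append(concept)
    return out

def maximal_generalization(concepts, domain):
    """Most general common generalization that is still covered by `domain`."""
    common = [c for c in chain(concepts[0])
              if all(c in chain(other) for other in concepts)]
    within_domain = [c for c in common if domain in chain(c)]
    return within_domain[-1] if within_domain else None       # most general one

# maximal_generalization(["associate professor", "full professor"], "expert")
# returns "expert"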
Let us now consider the rule from the right-hand part of Figure 10.8 (p. 304). The
minimal generalizations of the examples from which this rule was learned are shown
under the updated conditions in Figure 10.16. They were determined based on the
ontology from Figure 9.7. You remember that this rule was learned from one positive
example and two negative examples. However, each of the negative examples was used as
a positive example of an Except-When plausible version space condition. That is why each
of the generalized examples in Figure 10.16 has a positive example.
Figure 10.16. The updated conditions of the rule from Figure 10.8 in the context of the ontology from Figure 10.14, with the generalized examples from which they were computed:
Main Condition
  Plausible Lower Bound Condition (LB): ?O1 is associate professor, has as position ?O4; ?O2 is university; ?O3 is PhD student; ?O4 is tenured position; ?SI1 is in [almost certain - almost certain]
  Plausible Upper Bound Condition (UB): ?O1 is employee, has as position ?O4; ?O2 is actor; ?O3 is actor; ?O4 is long-term faculty position; ?SI1 is in [almost certain - almost certain]
  Example generalization: ?O1 is associate professor, is PhD advisor, has as position ?O4; ?O2 is university; ?O3 is PhD student; ?O4 is tenured position; ?SI1 is exactly almost certain (covered positive examples: 1; covered negative examples: 0)
Except-When Condition 1
  Plausible Lower Bound Condition (LB): ?O1 is full professor, plans to retire from ?O2; ?O2 is university
  Plausible Upper Bound Condition (UB): ?O1 is employee, plans to retire from ?O2; ?O2 is organization
  Example generalization: ?O1 is full professor, is PhD advisor, plans to retire from ?O2; ?O2 is university (covered positive examples: 1; covered negative examples: 0)
Except-When Condition 2
  Plausible Lower Bound Condition (LB): ?O1 is full professor, plans to move to ?O5; ?O5 is university
  Plausible Upper Bound Condition (UB): ?O1 is person, plans to move to ?O5; ?O5 is organization
  Example generalization: ?O1 is full professor, is PhD advisor, plans to move to ?O5; ?O5 is university
The lower and upper bounds of the rule in Figure 10.8 (p. 304) were updated by
computing the minimal and maximal generalizations of these generalized examples in
the context of the updated ontology from Figure 10.14. Let us first consider the updated
version space of the main condition. Notice that in the lower bound, ?O1 is now associate
professor, instead of PhD advisor or associate professor (see Figure 10.8). Also, in the upper
bound, ?O1 is employee instead of person. The version spaces of the Except-When condi-
tions have also been updated. In the first Except-When condition, ?O1 is now full professor
in the lower bound and employee in the upper bound, instead of PhD advisor or full
professor and person, respectively. Similarly, in the second Except-When condition, ?O1
is now full professor in the lower bound, instead of PhD advisor or full professor.
Finally, let us consider the rule from Figure 10.13 (p. 308), which was learned based on
the ontology from Figure 9.7, as discussed in Section 10.1.4. The minimal generalizations
of the positive and negative examples from which this rule was learned are shown at the
bottom of Figure 10.17. They were determined based on the ontology from Figure 9.7.
Notice that these generalized examples include all the explanations from which the rule
was learned. In particular, the explanation that fixed the value of “tenure-track position” is
represented as “?O4 is exactly tenure-track position” and that which excluded the value “no
support” is represented as “?SI1 is-not no support in main condition.”
Figure 10.17. The updated conditions of the rule from Figure 10.13 in the context of the ontology from Figure 10.14 (the new bounds are obtained as the minimal and maximal generalizations of the rule's generalized examples).
The new lower and upper bounds of the rule in the context of the updated ontology
from Figure 10.14 are shown at the top of Figure 10.17. Notice that in this case, the
regenerated rule is actually the same as the previous rule. The changes made to the
ontology did not affect this rule. However, the agent did recompute it because it cannot
know, a priori, whether the rule will be changed or not. The only change made to the rule
is to register that it was determined based on the new version of the ontology.
Hypothesis refinement is performed using methods that are very similar to the preceding
methods for rule refinement, as briefly summarized in this section.
Remember that general hypotheses are automatically learned as a byproduct of reduc-
tion rule learning, if they have not been previously learned. When a reduction rule is
refined with a positive example, each included hypothesis is also automatically refined
with its corresponding positive example. Indeed, when you say that a reduction is correct,
you are also implicitly saying that each of the included hypotheses is correct.
However, when a reduction rule is refined with a negative example, the hypotheses
are not affected. Indeed, a negative reduction example means that the corresponding
reduction is not correct, not that any of the involved hypotheses is incorrect. For this
reason, an explanation of why a specific reduction is incorrect does not automatically
apply to the hypotheses from that reduction. If you want to say that a specific hypothesis
is incorrect, you have to select it and click on the Incorrect problem button. Then the
Let O be the current ontology with version v, and R a plausible version space rule with a different
version.
1. Recompute the formal parameters P for rule R (see Table 10.10).
2. Refresh examples for rule R (see Table 10.11).
3. If R is no longer valid (i.e., the rule has no longer any generalized positive example)
then Return null
4. Recompute plausible version space for rule R (see Table 10.12).
5. Repeat for each specific example EX of R
If the upper bound of the main plausible version space condition (PVS) of R does not cover
EX and EX is a positive example
then make EX a positive exception.
If the upper bound of PVS of R does cover EX and EX is a negative example
then make EX a negative exception.
end
6. Return R
selected hypothesis will be refined basically using the same methods as those for rule
refinement.
Just as a refined rule, a refined hypothesis may include, in addition to the plausible
version space of the main condition, one or several plausible version spaces of Except-When
conditions, and generalized positive and negative examples. When the ontology is changed,
the hypotheses can be automatically regenerated based on the associated generalized
examples. They are actually regenerated when the corresponding rules are regenerated.
The presented rule learning and refinement methods have the following characteristics:
Table 10.12 Method for Recomputing the Plausible Version Space of a Rule
Let R be a plausible version space rule having the main condition M and the list of the Except-
When conditions LX.
1. Let MP be the list of the parameters from the main condition M.
Let IP be the list of parameters from the natural language part of the rule R, referred to as
informal parameters.
MP = IP
2. Repeat for each explanation of a positive example EP in the rule R
MP = MP ∪ the new parameters from EP
end
3. Let LGS be the list of the generalized examples that have at least one specific positive
example.
Let LEP be the list of the explanations EP of the positive examples in the rule R.
Create the multivariable condition MC based on MP, LGS, LEP, IP (see Table 10.13).
4. Let LX be the list of Except-When conditions.
LX = [ ]
5. Repeat for each group EEp of Except-When explanations in R.
Let EP be the list of parameters used in EEp.
Let EX be the list of the generalized negative examples associated with EEp that have
at least one specific negative example.
Create the multivariable condition XC based on MP = EP, LGS = EX, LEP = EEp, and
IP = ∅ (see Table 10.13).
LX = LX ∪ XC
end
6. Return R
Let R be a plausible version space rule, MP be the list of the parameters from the main condition,
LGS be the list of generalized examples that have at least one specific positive example, LEP be the
list of the explanations EP of the positive examples in the rule R, and IP be the list of informal
parameters of R.
1. Let A be the list of the generalized explanation fragments (such as "?Oi is interested in
?Oj”) from the generalized examples of R.
Compute A based on MP and LEP.
2. Let D be the domains of the variables from MP, each domain consisting of a lower bound
and an upper bound.
D=[]
3. Repeat for each parameter ?Oi in MP
Determine the list GE = {ge1, . . ., geg} of the concepts from LGS corresponding to ?Oi
(e.g., “assistant professor” from “?Oi is assistant professor”).
Determine the list PC = {PC1, . . ., PCp} of the concepts to which ?Oi must belong,
corresponding to positively constraining explanations
(e.g., “?Oi is exactly tenured position”).
Determine the list NC = {NC1, . . ., NCn} of the concepts to which ?Oi must not belong,
corresponding to negatively constraining explanations (e.g., "?SI1 is-not no support").
Create the domains Do from the examples GE and the constraints PC and NC
(see Table 10.14).
D = D ∪ Do
end
4. Create the multivariable condition structure from MP, D, and A.
Return MVC
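The creation of one variable's domain from its example concepts and its positively and negatively constraining explanations (step 3 above) can be sketched as follows; the helper operations are assumed and correspond to the generalization and specialization operations sketched earlier.

# Sketch of creating the domain (lower bound, upper bound) of one variable
# from its example concepts GE, positively constraining explanations PC
# ("is exactly ..."), and negatively constraining explanations NC ("is-not ...").
# `min_gen`, `max_gen`, and `specialize_to_exclude` are assumed ontology
# services, not Disciple-EBR functions.

def create_domain(GE, PC, NC, min_gen, max_gen, specialize_to_exclude):
    if PC:
        # a positively constraining explanation fixes the value exactly
        return {"lower": list(PC), "upper": list(PC)}
    lower = min_gen(GE)              # minimal generalization of the example concepts
    upper = max_gen(GE)              # maximal generalization (bounded by feature domains)
    for excluded in NC:              # e.g., ?SI1 is-not "no support"
        upper = specialize_to_exclude(upper, excluded, lower)
    return {"lower": lower, "upper": upper}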
There are two rule refinement case studies, a shorter one and a longer one. In the shorter
case study, you will guide the agent to refine the rule from Figure 9.8 (p. 259), as discussed
in the previous sections. In the longer case study, you will guide the agent to refine several
rules, including the rule from Figure 9.8. You may perform the short case study, or the long
one, or both of them.
Start Disciple-EBR, select the case study knowledge base (either “13-Rule-Refinement-
short/Scen” or “13-Rule-Refinement/Scen”), and proceed as indicated in the instructions
at the bottom of the opened window.
The following are the basic operations for rule refinement, as well as additional operations
that are useful for knowledge base refinement, such as changing a generated reasoning step
into a modeling step, visualizing a rule with the Rule Editor, and deleting a rule.
Table 10.14 Method for Creating Domains from Examples and Constraints
Refine the learned reduction rules by assessing hypotheses that are similar to the ones
considered in the previous assignments.
10.3. Consider the version space from Figure 10.19. In light of the refinement strategies
studied in this chapter, how will the plausible version space be changed as a result
of a new negative example labeled 1? Draw the new version space(s).
10.4. Consider the version space from Figure 10.20. In light of the refinement strategies
studied in this chapter, what are three alternative ways in which this version space
may be changed as a result of the negative example 2?
Figure 10.19. Version space and a negative example covered by the lower bound of the main condition.
Figure 10.20. Version space and a negative example covered by the upper bound of the main condition.
Positive Example 1
We need to
Determine a strategic center of gravity for a member of Allied Forces 1943.
Which is a member of Allied Forces 1943?
US 1943
Therefore we need to
Determine a strategic center of gravity for US 1943.
Explanation: Allied Forces 1943 has as member US 1943
Figure 10.22. Ontology fragment from the center of gravity analysis domain (object; multistate alliance; multistate coalition). Dotted links indicate instance of relationships while continuous unnamed links indicate subconcept of relationships.
10.5. (a) Consider the example and its explanation from Figure 10.21. What rule will
be learned from them, assuming the ontology from Figure 10.22, where all
the instances are considered specific instances?
(b) Consider the additional positive example from Figure 10.23. Indicate the
refined rule.
(c) Consider the negative example, its failure explanation, and the additional
ontological knowledge from Figure 10.24. Indicate the refined rule.
Positive Example 2
We need to
Determine a strategic center of gravity for a member of
European Axis 1943.
Germany 1943
Therefore we need to
Determine a strategic center of gravity for Germany 1943.
Figure 10.24. Negative example, failure explanation, and additional ontology fragment.
(Figure 10.25, fragment: the explanation uses the feature is a major generator of.)
10.6. Consider the example and its explanation shown in Figure 10.25. Find the plaus-
ible version space rule that will be learned based on the ontology fragments from
Figures 10.26, 10.27, and 10.28, where all the instances are defined as generic
instances.
Figure 10.26. Ontology of economic factors. Dotted links indicate instance of relationships while continuous unnamed links indicate subconcept of relationships. (The figure includes raw material, strategic raw material, industrial factor, industrial authority, industrial capacity, industrial center, transportation factor, transportation center, transportation network or system, farm implement industry, farm implement industry of Italy 1943, oil, chromium, copper and bauxite of Germany 1943, and the feature is critical to the production of.)
Figure 10.27. An ontology of forces (object; force; group; multigroup force, opposing force, multistate force, single-state force, single-group force). Dotted links indicate instance of relationships while continuous unnamed links indicate subconcept of relationships.
10.7. Minimally generalize the rule from the left side of Figure 10.29 in order to cover
the positive example from the right side of Figure 10.29, considering the back-
ground knowledge from Figures 10.26, 10.27, and 10.28.
Figure 10.28. An ontology of resources (object; resource or infrastructure element; resource; product; strategic raw material; war material and fuel; war material and transports; farm implements). Dotted links indicate instance of relationships while continuous unnamed links indicate subconcept of relationships.
10.8. Minimally specialize the rule from the left side of Figure 10.30 so that it no longer
covers the negative example from the right side of Figure 10.30, considering the back-
ground knowledge from Figures 10.26, 10.27, and 10.28.
Table 10.15 Rule Refinement with Learning Agent versus Rule Refinement by Knowledge Engineer (rows to be filled in: Description, highlighting differences and similarities; Strengths; Weaknesses)
Rule
IF
Identify each strategic COG candidate with respect to the industrial civilization of ?O1.
Plausible Upper Bound Condition
?O1 is force
  has as industrial factor ?O2
?O2 is industrial factor
  is a major generator of ?O3
?O3 is product
Negative example that satisfies the upper bound
IF the task to accomplish is
Identify each strategic COG candidate with respect to the industrial civilization of Italy 1943.
THEN accomplish the task
farm implement industry of Italy 1943 is a strategic COG candidate for Italy 1943.
10.9. Consider the problem reduction step and its explanation from Figure 10.25, as
well as the ontology of economic factors from Figure 10.26. Show the correspond-
ing analogy criterion generated by a cautious learner, and an analogous reduction
made by that cautious learner.
10.11. Compare the learning-based rule refinement process discussed in this chapter
with the traditional knowledge acquisition approach discussed in Section 3.1.4 by
filling in Table 10.15. Identify similarities and differences and justify the relative
strengths and weaknesses of the two approaches.
Figure 10.31 (fragment): a partially learned concept and the four instances I1, I2, I3, and I4.
10.12. Consider the partially learned concept and the four instances from Figure 10.31.
Order the instances by the plausibility of being positive examples of this concept
and justify the ordering.
11 Abstraction of Reasoning
Up until this point, the methodology for developing intelligent agents has encouraged
the expert to be very explicit and detailed, to provide clear descriptions of the
hypotheses (or problems), and to formulate detailed questions and answers that guide
the reduction of hypotheses (or problems) to subhypotheses (or subproblems). This is
important because it facilitates a clear and correct logic and the learning of the
reasoning rules.
The developed agents can solve complex problems through the generation of
reasoning trees that can be very large, with hundreds or even thousands of nodes.
In such cases, browsing and understanding these reasoning trees become a
challenge.
In this section, we will discuss an approach to abstract a large reasoning tree that
involves abstracting both hypotheses/problems and subtrees. The goal is to obtain a
simpler representation where the abstract tree has fewer nodes and each node has a
simpler description. At the same time, however, we want to maintain the correspondence
between the abstract tree and the original tree, in order to have access to the full descrip-
tions of the nodes.
Figure 11.1. As discussed in Section 9.10, from each specific hypothesis Disciple-EBR
automatically learns a general hypothesis with applicability conditions, which can be
further refined. The bottom-left part of Figure 11.1 shows the learned hypothesis whose
upper bound condition was further refined. Let us assume that the expert abstracts the
specific hypothesis to “John Doe will stay on the faculty” (see the upper-right part of
Figure 11.1). As a result, the agent automatically learns the abstraction pattern “?O1 will
stay on the faculty,” which corresponds to the abstraction of the learned hypothesis (see
the lower-right part of Figure 11.1).
The data structures and their relationships illustrated in Figure 11.1 have several
consequences. First, all the instances of the learned hypothesis will have the same
abstraction pattern. For example, the instance, “Dan Barker will stay on the faculty of
University of Virginia for the duration of the PhD dissertation of Sandra Lee,” will automatic-
ally be abstracted to “Dan Barker will stay on the faculty.”
Conversely, if you change the abstraction of a specific hypothesis, the pattern of the
learned abstraction will change accordingly. For example, if you now change the abstrac-
tion of “Dan Barker will stay on the faculty of University of Virginia for the duration of the
PhD dissertation of Sandra Lee,” to “Dan Barker will stay with University of Virginia,” then the
abstraction of the learned hypothesis in Figure 11.1 is automatically changed to “?O1 will
stay with ?O2.” Therefore, the abstraction of the specific hypothesis from Figure 11.1 is
automatically changed to “John Doe will stay with George Mason University.”
Thus, although at the beginning of this section we have provided several examples
of abstracting a hypothesis, each hypothesis may have only one abstraction at a time.
Notice, however, that the same abstract pattern may be an abstraction of different
learned hypotheses. Consequently, different specific hypotheses may also have the
same abstraction.
Finally, notice that the agent automatically abstracts the probabilistic solution of a
hypothesis to the actual probability. Thus, the solution, “It is almost certain that John Doe
will stay on the faculty of George Mason University for the duration of the PhD dissertation of
Bob Sharp,” is automatically abstracted to “almost certain.”
Figure 11.1 (fragment): the specific hypothesis "John Doe will stay on the faculty of George Mason University for the duration of the PhD dissertation of Bob Sharp" has as abstraction the abstract hypothesis "John Doe will stay on the faculty."
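The relationship between specific hypotheses, the learned hypothesis, and its abstraction pattern can be sketched as follows; the pattern syntax and class names are illustrative assumptions.

# Sketch of the hypothesis/abstraction relationship described above: the
# abstraction is stored once, as a pattern on the learned (general) hypothesis,
# so every instance shares it and changing it propagates to all instances.

class LearnedHypothesis:
    def __init__(self, pattern, abstraction_pattern):
        self.pattern = pattern                           # the general hypothesis
        self.abstraction_pattern = abstraction_pattern   # its abstraction

    def abstract(self, bindings):
        """Abstraction of a specific instance, obtained from the shared pattern."""
        text = self.abstraction_pattern
        for var, value in bindings.items():
            text = text.replace(var, value)
        return text

h = LearnedHypothesis(
    "?O1 will stay on the faculty of ?O2 for the duration of the PhD dissertation of ?O3",
    "?O1 will stay on the faculty")
print(h.abstract({"?O1": "Dan Barker", "?O2": "University of Virginia", "?O3": "Sandra Lee"}))
# prints "Dan Barker will stay on the faculty"

h.abstraction_pattern = "?O1 will stay with ?O2"         # the change propagates to every instance
print(h.abstract({"?O1": "John Doe", "?O2": "George Mason University", "?O3": "Bob Sharp"}))
# prints "John Doe will stay with George Mason University"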
One simple way of abstracting a reasoning step is to abstract all the hypotheses and to
eliminate the question/answer pair, as illustrated in Figure 11.2. The right-hand side of
Figure 11.2 shows the complete description of a reasoning step. The left-hand side shows
only the abstraction of the top hypothesis and the abstractions of its subhypotheses.
Notice that the meaning of an abstract subhypothesis is to be understood in the context of its
parent abstract hypothesis, and therefore it can be shorter. For example, “reasons” is under-
stood as “United States has reasons to be a global leader in wind power.” Notice also that the
abstraction of a hypothesis also includes the abstraction of its assessment, or “unknown” (if an
assessment has not yet been made). The assessments (or solutions) may also be made visible in
the detailed reasoning tree by clicking on [SHOW SOLUTIONS] at the top of the window.
One may also abstract an entire subtree, not just a reduction step. The right-hand side
of Figure 11.3 shows the reasoning tree that is abstracted in the left-hand side of the figure.
In particular, the abstract tree consists of the abstraction of the top hypothesis and the
abstractions of the leaf hypotheses from the detailed tree.
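Abstracting an entire subtree in this way can be sketched as follows; the node structure is an assumption made for illustration.

# Sketch of abstracting an entire subtree: the abstract tree keeps the
# abstraction of the top hypothesis and the abstractions of the leaf
# hypotheses of the detailed tree.

class Node:
    def __init__(self, hypothesis, abstraction, children=None):
        self.hypothesis = hypothesis
        self.abstraction = abstraction
        self.children = children or []

def leaves(node):
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

def abstract_subtree(root):
    """The abstraction of the top hypothesis, with the leaf abstractions as children."""
    return {"abstraction": root.abstraction,
            "children": [{"abstraction": leaf.abstraction} for leaf in leaves(root)]}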
Once a reasoning tree has been abstracted, it can be browsed as illustrated in Figure 11.4.
The left-hand side of Figure 11.4 shows the entire abstract tree. Each node in this tree is an
abstraction of a hypothesis and of its assessment (or “unknown” if an assessment has not
yet been made). The user can browse this tree by expanding or collapsing its nodes
through clicking on the + or – nodes. Once you click on a node in the abstract tree, such
as “desire: unknown,” the detailed description of that node is shown in the right-hand side.
The objective of this case study is to learn how to use Disciple-EBR to abstract a reasoning
tree. More specifically, you will learn how to:
This case study will guide you through the process of abstracting the analysis of the
hypothesis, “John Doe would be a good PhD advisor for Bob Sharp,” with which you have
practiced in the previous case studies.
Start Disciple-EBR, select the case study knowledge base “14-Abstractions/Scen” and
proceed as indicated in the instructions at the bottom of the opened window.
This case study illustrates the following operations for the abstraction of reasoning:
Figure 11.5. Understanding the meaning of an abstracted hypothesis in the context of its upper
hypotheses.
which contributes to the “desire” component for the hypothesis “United States global
leader in wind power.”
11.3. Consider that the top hypothesis is “John Doe would be a good PhD advisor for Bob
Sharp,” and that “John Doe would be a good PhD advisor with respect to the
professional reputation criterion” is one of its subhypotheses. Indicate several pos-
sible abstractions of this subhypothesis, where each such abstraction is to be
understood in the context of the top hypothesis.
11.4. Consider the hypothesis, “John Doe would be a good PhD advisor with respect to
the professional reputation criterion,” and its subhypothesis, “John Doe would be a
good PhD advisor with respect to the peer opinion criterion.” Indicate abstractions of
these two hypotheses, where these abstractions are to be understood in the context
of the top hypothesis “John Doe would be a good PhD advisor for Bob Sharp.”
11.5. Consider the detailed reasoning tree from Figure 11.6. Provide a corresponding
abstract tree, thinking carefully about how to best abstract each of the hypotheses.
11.6. Consider the detailed reasoning tree from Figure 11.7. Provide a corresponding
abstract tree, thinking carefully about how to best abstract each of the hypotheses.
12.1 INTRODUCTION
The agent building theory, methodology and tool presented in this book evolved over
many years, with developments presented in numerous papers and a series of PhD
theses (Tecuci, 1988; Dybala, 1996; Hieb, 1996; Keeling, 1998; Boicu, 2002; Bowman,
2002; Boicu, 2006; Le, 2008; Marcu, 2009). Although this book has emphasized the
development of Disciple agents for evidence-based reasoning applications, the learning
agent theory and technology are applicable and have been applied to a wide range of
knowledge-intensive tasks, such as those discussed in Section 1.6.2.
A previous book (Tecuci, 1998) presented the status of this work at that time and
included descriptions of Disciple agents for designing plans for loudspeaker manufactur-
ing, for assessing students’ higher-order thinking skills in history or in statistics, for
configuring computer systems, and for representing a virtual armored company com-
mander in distributed interactive simulations.
More recent Disciple agents and their applications include Disciple-WA, an agent for
the development of military engineering plans; Disciple-COA, for the critiquing of military
courses of action; Disciple-COG, for military center of gravity determination; Disciple
agents representing virtual experts for collaborative emergency response planning;
Disciple-LTA, for intelligence analysis; Disciple-FS, for regulatory compliance in financial
services industries; Disciple-WB, for assessing the believability of websites; and Disciple
agents for modeling the behavior of violent extremists.
The following sections present four of these agents and their applications. While all
illustrate the general agent development approach discussed in this book, they differ in
some of their capabilities and appearance, each reflecting a different stage or trajectory in
the development of the Disciple approach.
A description of the military unit that needs to work around some damage (e.g., an
armored tank brigade or a supply company)
A description of the damage (e.g., a span of the bridge is dropped and the area is
mined) and of the terrain (e.g., the soil type; the slopes of the riverbanks; the river’s
speed, depth, and width)
A detailed description of the resources in the area that could be used to repair the
damage. This includes a description of the engineering assets of the military unit that
has to work around the damage, as well as the descriptions of other military units in
the area that could provide additional resources
The output of the agent consists of the most likely repair strategies, each described in
terms of three elements:
(Figure fragment: the damage area, with Damage 200, a destroyed bridge; Site 100; Site 103, cross-section; Sites 105, 106, and 107; the left and right banks; and the river bed.)
Figure 12.2 (fragment). Initial task:
WORKAROUND-DAMAGE
  FOR-DAMAGE DAMAGE200
  BY-INTERDICTED-UNIT UNIT91010
(Sketch labels: 25 m, 17 m, UNIT 91010, AVLB 70.)
Detailed plan (fragment):
S1 OBTAIN-OPERATIONAL-CONTROL-FROM-CORPS
  OF-UNIT UNIT202
  BY-UNIT UNIT91010
  MIN-DURATION 4H:0M:0S
  EXPECTED-DURATION 6H:0M:0S
  TIME-CONSTRAINTS NONE
S2 MOVE-UNIT
  FOR-UNIT UNIT202
  FROM-LOCATION SITE0
  TO-LOCATION SITE100
  MIN-DURATION 1H:8M:14S
  EXPECTED-DURATION 1H:8M:14S
  TIME-CONSTRAINTS AFTER S1
S3 REPORT-OBTAINED-EQUIPMENT
  FOR-EQ-SET AVLB-UNIT202
  MIN-DURATION 0S
  EXPECTED-DURATION 0S
  TIME-CONSTRAINTS AFTER S2
S7 NARROW-GAP-BY-FILLING-WITH-BANK
  FOR-GAP SITE103
  FOR-BR-DESIGN AVLB70
  MIN-DURATION 5H:19M:44S
  EXPECTED-DURATION 6H:7M:42S
  RESOURCES-REQUIRED BULLDOZER-UNIT201
  TIME-CONSTRAINTS AFTER S6
S8 EMPLACE-AVLB
  FOR-BR-DESIGN AVLB70
  MIN-DURATION 5M:0S
  EXPECTED-DURATION 10M:0S
  RESOURCES-REQUIRED AVLB-UNIT202
  TIME-CONSTRAINTS AFTER S3, S7
S9 REPORT-EMPLACED-FIXED-BRIDGE
  FOR-MIL-BRIDGE AVLB-UNIT202
  MIN-DURATION 0S
  EXPECTED-DURATION 0S
  TIME-CONSTRAINTS AFTER S8
Ordering of the twelve actions: S1 S2 S3; S4 S5 S6 S7; S8 S9 S10 S11 S12.
The detailed plan is shown under its summary and consists of twelve elementary
actions. UNIT91010 has to obtain operational control of UNIT202, which has the AVLB.
Then UNIT202 has to come to the site of the destroyed bridge. Also, UNIT91010 has to
obtain operational control of UNIT201, which has a bulldozer. Then UNIT201 will have
to move to the site of the destroyed bridge and to narrow the river gap from 25 meters to
17 meters. These actions can take place in parallel with the actions of bringing UNIT202 to
the bridge site, as shown at the bottom of Figure 12.2. Then the AVLB bridge is emplaced,
and the bulldozer moves over the bridge to clear the other side of the river in order to
restore the flow of traffic. This plan was generated by successively reducing the
WORKAROUND-DAMAGE task to simpler subtasks, until this task was reduced to the
twelve tasks shown in Figure 12.2.
The process of developing this agent has followed the general Disciple methodology, as
briefly discussed in the next sections.
OBTAIN-AVLB
OBTAIN-BULLDOZER-AND-NARROW-GAP
INSTALL-AVLB-OVER-NARROWED-GAP
Each of these subtasks is further reduced to elementary tasks that will constitute the
generated plan shown in Figure 12.2. For example, OBTAIN-AVLB is reduced to the
following elementary tasks:
S1 OBTAIN-OPERATIONAL-CONTROL-FROM-CORPS
S2 MOVE-UNIT
S3 REPORT-OBTAINED-EQUIPMENT
Figure 12.3 (fragment). Successive reductions of the workaround task:
WORKAROUND-OBSTACLE (BY-UNIT UNIT91010)
WORKAROUND-BRIDGE-OBSTACLE (AT-LOCATION SITE100, BY-UNIT UNIT91010)
WORKAROUND-UNMINED-DAMAGED-BRIDGE (AT-LOCATION SITE100, BY-UNIT UNIT91010)
WORKAROUND-UNMINED-DAMAGED-BRIDGE-WITH-FIXED-BRIDGE (AT-LOCATION SITE100, FOR-GAP SITE103, BY-UNIT UNIT91010), also shown in generalized form (AT-LOCATION ?O1, FOR-GAP ?O2, BY-UNIT ?O3)
Question: What strategy can be used? Answer: Fixed bridge with gap reduction, because the gap is large but can be reduced.
The reduction is governed by a plausible version space condition (values shown as plausible upper bound / plausible lower bound):
?O1 IS BRIDGE / SITE100
?O2 IS CROSS-SECTION / SITE103, HAS-WIDTH ?N4
?O3 IS MILITARY-UNIT / UNIT91010, MAX-TRACKED-MLC ?N3, MAX-WHEELED-MLC ?N2
?O4 IS AVLB-EQ / AVLB-EQ, CAN-BUILD ?O5, MAX-REDUCIBLE-GAP ?N5, MAX-GAP ?N6
?O5 IS AVLB70 / AVLB70, MLC-RATING ?N1
?N1 IS-IN [0.0 150.0] / [70.0 70.0]
?N2 IS-IN [0.0 150.0] / [20.0 20.0]
together with constraints of the form ... ≤ ?N1.
The task is then reduced to:
OBTAIN-AVLB (AT-LOCATION SITE100, BY-UNIT UNIT91010)
OBTAIN-BULLDOZER-AND-NARROW-GAP (AT-LOCATION SITE100, FOR-GAP SITE103, BY-UNIT UNIT91010)
INSTALL-AVLB-OVER-NARROWED-GAP (AT-LOCATION SITE100, BY-UNIT UNIT91010)
Since these are elementary tasks that can be directly performed by the actual units, they
have known minimum and expected durations that are specified in military manuals.
Notice that these durations are specified in the descriptions of these tasks, as shown at the
bottom of Figure 12.3.
Similarly, OBTAIN-BULLDOZER-AND-NARROW-GAP from Figure 12.3 is reduced to
the tasks S4, S5, S6, and S7, as shown in Figure 12.2. Also, INSTALL-AVLB-OVER-
NARROWED-GAP is reduced to the tasks S8, S9, S10, S11, and S12.
Notice that the generated plan is a partially ordered one, where some of the actions are
actually performed in parallel. Disciple-WA uses a very simple strategy to generate such
plans that does not require maintaining complex state descriptions and operators with
preconditions and effects, as other planning systems do. Instead, it generates REPORT
actions with duration 0 that mark the achievement of conditions used in ordering elemen-
tary actions generated in different parts of the planning tree. For example, the action S8
EMPLACE-AVLB from Figure 12.2 can be performed only after S3 (a REPORT action) and S7:
S3 REPORT-OBTAINED-EQUIPMENT
S7 NARROW-GAP-BY-FILLING-WITH-BANK
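The effect of these ordering constraints can be illustrated with a small forward pass over the actions' durations and after constraints, as in the following sketch; the durations (in minutes) are rounded from Figure 12.2 and the action subset is simplified.

# Sketch of how zero-duration REPORT actions order a partially ordered plan:
# each action has an expected duration (minutes) and a list of actions it must
# follow; earliest start times follow from a simple forward pass.

ACTIONS = {
    "S1_OBTAIN_OPERATIONAL_CONTROL":  {"duration": 360, "after": []},
    "S2_MOVE_UNIT":                   {"duration": 68,  "after": ["S1_OBTAIN_OPERATIONAL_CONTROL"]},
    "S3_REPORT_OBTAINED_EQUIPMENT":   {"duration": 0,   "after": ["S2_MOVE_UNIT"]},
    "S7_NARROW_GAP":                  {"duration": 368, "after": []},   # after S6 in the full plan
    "S8_EMPLACE_AVLB":                {"duration": 10,  "after": ["S3_REPORT_OBTAINED_EQUIPMENT",
                                                                  "S7_NARROW_GAP"]},
}

def earliest_start(name, actions, memo=None):
    memo = {} if memo is None else memo
    if name not in memo:
        memo[name] = max((earliest_start(p, actions, memo) + actions[p]["duration"]
                          for p in actions[name]["after"]), default=0)
    return memo[name]

# earliest_start("S8_EMPLACE_AVLB", ACTIONS) returns 428, i.e., the emplacement
# starts only after both the equipment report (S3) and the gap narrowing (S7).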
Figure 12.4 (fragment): a concept hierarchy including BRIDGE-CONCEPT, TEMPORARY-TRAFFIC-LINK, MILITARY-BRIDGE, FLOATING-MILITARY-BRIDGE, FIXED-MILITARY-BRIDGE, MGB-SS16, BB-TT24, MGB-SS16-11, and MGB-SS16-12.
Figure 12.5 contains the descriptions of two concepts from the hierarchy in
Figure 12.4, AVLB and AVLB70. An AVLB is a subclass of fixed military bridge that
has additional features. AVLB70 is a subclass of AVLB bridge. Each such concept
inherits all of the features of its superconcepts. Therefore, all the features of AVLB
are also features of AVLB70.
The features are defined in the same way as the concepts, in terms of more general
features. Figure 12.6, for instance, presents a fragment of the feature hierarchy. Two
important characteristics of any feature are its domain (the set of objects that could have
this feature) and its range (the set of possible values of the feature). The features may also
specify functions for computing their values.
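The role of a feature's domain and range can be sketched as follows; the feature and concept names in the example are assumptions for illustration.

# Sketch of a feature with a domain and a range, and of checking that a
# feature assertion respects them; `isa(instance, concept)` is an assumed
# membership test over the ontology.
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    domain: str      # the set of objects that could have this feature
    range: str       # the set of possible values of the feature

def well_formed(subject, feature, value, isa):
    """True if subject belongs to the feature's domain and value to its range."""
    return isa(subject, feature.domain) and isa(value, feature.range)

# e.g., with assumed domain and range values:
# max_gap = Feature("MAX-GAP", domain="FIXED-MILITARY-BRIDGE", range="GAP-WIDTH")
# well_formed("AVLB70", max_gap, some_width, isa) would check both memberships.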
Figure 12.5. Descriptions of two concepts from the hierarchy in Figure 12.4.
Figure 12.6 (fragment): a feature hierarchy in which GAP-WIDTH and HAS-WIDTH are subclasses of DIMENSION, and MAX-GAP and MAX-REDUCIBLE-GAP are more specific gap-width features.
TASK
WORKAROUND-UNMINED-DESTROYED-BRIDGE-WITH-FIXED-BRIDGE
AT-LOCATION SITE100
FOR-GAP SITE103
BY-UNIT UNIT91010
SUBTASK
USE-FIXED-BRIDGE-WITH-GAP-REDUCTION-OVER-GAP
AT-LOCATION SITE100
FOR-GAP SITE103
BY-UNIT UNIT91010
WITH-BR-EQ AVLB-EQ
EXPLANATIONS
Figure 12.8 shows the plausible version space rule learned from the task reduction
example and its explanation in Figure 12.7.
Figure 12.8. Rule learned by Disciple-WA from the example and the explanation in Figure 12.7.
time estimate for each workaround solution; (3) correctness of each solution step;
(4) correctness of temporal constraints among these steps; and (5) appropriateness of
engineering resources used. Scores were assigned by comparing the systems’ answers with
those of Alphatech’s human expert. Bonus points were awarded when systems gave better
answers than the expert, and these answers were used as standard for the next phase of
the evaluation.
The participating teams were not uniform in terms of prior system development and
human resources. Consequently, only one of them succeeded in entering the evaluation with
a system that had a fully developed knowledge base. The other three teams (including the
Disciple team) entered the evaluation with systems that had incompletely developed
knowledge bases. Figure 12.9 shows a plot of the overall coverage of each system against
the overall correctness of that system for each of the two phases of the evaluation.
The Disciple team entered the evaluation with a workaround agent whose knowledge base
covered only about 40 percent of the workaround domain (equivalent to
11,841 binary predicates). The coverage of our agent was declared prior to each release of
the testing problems, and all the problems falling within its scope were attempted and
scored. During the evaluation period, we continued to extend the knowledge base to cover
more of the initially specified domain, in addition to the developments required by the
modification phase. At the end of the two weeks of evaluation, the knowledge base of our
agent grew to cover about 80 percent of the domain (equivalent to 20,324 binary predi-
cates). This corresponds to a rate of knowledge acquisition of approximately 787 binary
predicates per day, as indicated in Figure 12.10. This result supports the claim that the
Disciple approach enables rapid acquisition of relevant problem-solving knowledge from
subject matter experts.
With respect to the quality of the generated solutions, within its scope, the Disciple-WA
agent performed at the level of the human expert. There were several cases during the
evaluation period where the Disciple-WA agent generated more correct or more complete
Figure 12.9. Evaluation results for the coverage of the problem space and the correctness of the solutions (reprinted with permission from Eric Jones). The plot shows correctness (vertical axis, 0.25 to 1.0) against coverage for the initial and final versions of each team's system (Disciple, AIAI, ISI, and TFS).
Figure 12.10. Knowledge base development time. The plot shows knowledge base size in binary predicates (5,000 to 25,000) against days (June 14 to June 30), with the linear fit y = 787.63x + 5521.6.
solutions than those of the human expert. There were also cases where the agent gener-
ated new solutions that the human expert did not initially consider. For instance, it
generated solutions to work around a cratered road by emplacing a fixed bridge over
the crater in a way similar to emplacing a fixed bridge over a river gap. Or, in the case of
several craters, it generated solutions where some of the craters were filled while for others
fixed bridges were emplaced. These solutions were adopted by the expert and used as
standard for improving all the systems. For this reason, although the agent also made
some mistakes, the overall correctness of its solutions was practically as high as that of the
expert’s solutions. This result supports the second claim that the acquired problem-
solving knowledge is of good enough quality to ensure a high degree of correctness of
the solutions generated by the agent.
Finally, our workaround generator also performed very well, generating a solution in
about 0.3 seconds on a medium-power PC. This supports the third
claim, that the acquired problem-solving knowledge ensures high performance of the
problem solver.
Based on the evaluation results, the Disciple-WA agent was selected by DARPA and
Alphatech to be further extended and was integrated by Alphatech into a larger system that
supports air campaign planning by the Joint Force Air Component Commander (JFACC)
and his or her staff. The integrated system was one of the systems selected to be
demonstrated at EFX’98, the Air Force’s annual showcase of promising new technologies.
and then to compare those COAs based on many factors, including the situation, the
commander’s guidance, the principles of war, and the tenets of Army operations. Then the
Commander makes the final decision on which COA will be used to generate his or her
plan based on the recommendations of the staff and his or her own experience with the
same factors considered by the staff (Jones, 1998).
The COA critiquing problem consisted of developing a knowledge-based agent that can
automatically critique COAs for ground force operations, can systematically assess
selected aspects of a COA, and can suggest repairs to it. The role of this agent is to act
as an assistant to the military Commander, helping the Commander in choosing between
several COAs under consideration for a certain mission. The agent could also help military
students learn how to develop courses of action.
The input to the COA critiquing agent consists of the description of a COA that includes
the following aspects:
1. The COA sketch, such as the one in the top part of Figure 12.11, which is a
graphical depiction of the preliminary plan being considered. It includes enough
of the high-level structure and maneuver aspects of the plan to show how the
actions of each unit fit together to accomplish the overall purpose, while omitting
much of the execution detail that will be included in the eventual operational plan.
The three primary elements included in a COA sketch are (a) control measures that
limit and control interactions between units; (b) unit graphics that depict known,
initial locations and makeup of friendly and enemy units; and (c) mission graphics
that depict actions and tasks assigned to friendly units. The COA sketch is drawn
using a palette-based sketching utility.
2. The COA statement, such as the partial one shown in the bottom part of
Figure 12.11, which clearly explains what the units in a course of action will do to
accomplish the assigned mission. This text includes a description of the mission
and the desired end state, as well as standard elements that describe purposes,
operations, tasks, forms of maneuver, units, and resources to be used in the COA.
The COA statement is expressed in a restricted but expressive subset of English.
3. Selected products of mission analysis, such as the areas of operations of the units,
avenues of approach, key terrain, unit combat power, and enemy COAs.
Based on this input, the critiquing agent has to assess various aspects of the COA, such as
its viability (i.e., its suitability, feasibility, acceptability, and completeness), its correctness
(which considers the array of forces, the scheme of maneuver, and the command and
control), and its strengths and weaknesses with respect to the principles of war and the
tenets of Army operations. The critiquing agent should also be able to clearly justify the
assessments made and to propose improvements to the COA.
Disciple-COA was developed in DARPA’s HPKB program to solve part of the COA
critiquing problem (Cohen et al., 1998). In particular, Disciple-COA identifies the strengths
and the weaknesses of a course of action with respect to the principles of war and the tenets
of Army operations (FM 100–5, 1993). There are nine principles of war: objective, offensive,
mass, economy of force, maneuver, unity of command, security, surprise, and simplicity.
They provide general guidance for the conduct of war at the strategic, operational, and
tactical levels. For example, Table 12.1 provides the definition of the principle of mass.
The tenets of Army operations describe the characteristics of successful operations.
They are initiative, agility, depth, synchronization, and versatility.
Mission: BLUE-BRIGADE2 attacks to penetrate RED-MECH-REGIMENT2 at 130600 Aug in order to enable the completion of seize OBJ-SLAM by BLUE-ARMOR-BRIGADE1.
Close: BLUE-TASK-FORCE1, a balanced task force (MAIN-EFFORT) attacks to penetrate RED-MECH-COMPANY4, then clears RED-TANK-COMPANY2 in order to enable the completion of seize OBJ-SLAM by BLUE-ARMOR-BRIGADE1. BLUE-TASK-FORCE2, a balanced task force (SUPPORTING-EFFORT1) attacks to fix RED-MECH-COMPANY1 and RED-MECH-COMPANY2 and RED-MECH-COMPANY3 in order to prevent RED-MECH-COMPANY1 and RED-MECH-COMPANY2 and RED-MECH-COMPANY3 from interfering with conducts of the MAIN-EFFORT1, then clears RED-MECH-COMPANY1 and RED-MECH-COMPANY2 and RED-MECH-COMPANY3 and RED-TANK-COMPANY1. BLUE-MECH-BATTALION1, a mechanized infantry battalion (SUPPORTING-EFFORT2) attacks to fix RED-MECH-COMPANY5 and RED-MECH-COMPANY6 in order to prevent RED-MECH-COMPANY5 and RED-MECH-COMPANY6 from interfering with conducts of the MAIN-EFFORT1, then clears RED-MECH-COMPANY5 and RED-MECH-COMPANY6 and RED-TANK-COMPANY3.
Reserve: The reserve, BLUE-MECH-COMPANY8, a mechanized infantry company, follows MAIN-EFFORT, and is prepared to reinforce MAIN-EFFORT.
Security: SUPPORTING-EFFORT1 destroys RED-CSOP1 prior to begin moving across PL-AMBER by MAIN-EFFORT in order to prevent RED-MECH-REGIMENT2 from observing MAIN-EFFORT. SUPPORTING-EFFORT2 destroys RED-CSOP2 prior to begin moving across PL-AMBER by MAIN-EFFORT in order to prevent RED-MECH-REGIMENT2 from observing MAIN-EFFORT.
Deep: Deep operations will destroy RED-TANK-COMPANY1 and RED-TANK-COMPANY2 and RED-TANK-COMPANY3.
Rear: BLUE-MECH-PLT1, a mechanized infantry platoon, secures the brigade support area.
Fire: Fires will suppress RED-MECH-COMPANY1 and RED-MECH-COMPANY2 and RED-MECH-COMPANY3 and RED-MECH-COMPANY4 and RED-MECH-COMPANY5 and RED-MECH-COMPANY6.
End State: At the conclusion of this operation, BLUE-BRIGADE2 will enable accomplishing conducts forward passage of lines through BLUE-BRIGADE2 by BLUE-ARMOR-BRIGADE1. MAIN-EFFORT will complete to clear RED-MECH-COMPANY4 and RED-TANK-COMPANY2. SUPPORTING-EFFORT1 will complete to clear RED-MECH-COMPANY1 and RED-MECH-COMPANY2 and RED-MECH-COMPANY3 and RED-TANK-COMPANY1. SUPPORTING-EFFORT2 will complete to clear RED-MECH-COMPANY5 and RED-MECH-COMPANY6 and RED-TANK-COMPANY3.
Figure 12.11. COA sketch and a fragment of a COA statement (reprinted with permission from Eric Jones).
Table 12.2, for instance, shows some of the strengths of the COA from Figure 12.11 with
respect to the principle of mass, identified by Disciple-COA.
In addition to generating answers in natural language, Disciple-COA also provides the
reference material based on which the answers are generated, as shown in the bottom part
of Table 12.2. Also, the Disciple-COA agent can provide justifications for the generated
Table 12.1 Definition of the Principle of Mass
Mass the effects of overwhelming combat power at the decisive place and time. Synchronizing all the elements of combat power where they will have decisive effect on an enemy force in a short period of time is to achieve mass. To mass is to hit the enemy with a closed fist, not poke at him with fingers of an open hand. Mass must also be sustained so the effects have staying power. Thus, mass seeks to smash the enemy, not sting him. This results from the proper combination of combat power with the proper application of other principles of war. Massing effects, rather than concentrating forces, can enable numerically inferior forces to achieve decisive results, while limiting exposure to enemy fire.
Table 12.2 Strengths of the COA from Figure 12.11 with Respect to the Principle of Mass, Identified by Disciple-COA
Major Strength: There is a major strength in COA411 with respect to mass because BLUE-TASK-FORCE1 is the MAIN-EFFORT1 and it acts on the decisive point of the COA (RED-MECH-COMPANY4) with a force ratio of 10.6, which exceeds a recommended force ratio of 3.0. Additionally, the main effort is assisted by supporting action SUPPRESS-MILITARY-TASK1, which also acts on the decisive point. This is good evidence of the allocation of significantly more than minimum combat power required at the decisive point and is indicative of the proper application of the principle of mass.
Strength: There is a strength in COA411 with respect to mass because BLUE-TASK-FORCE1 is the main effort of the COA and it has been allocated 33% of available combat power, but this is considered just a medium-level weighting of the main effort.
Strength: There is a strength in COA411 with respect to mass because BLUE-MECH-COMPANY8 is a COMPANY-UNIT-DESIGNATION level maneuver unit assigned to be the reserve. This is considered a strong reserve for a BRIGADE-UNIT-DESIGNATION–level COA and would be available to continue the operation or exploit success.
Reference: FM 100–5 pg 2–4, KF 113.1, KF 113.2, KF 113.3, KF 113.4, KF 113.5 – To mass is to synchronize the effects of all elements of combat power at the proper point and time to achieve decisive results. Observance of the principle of mass may be evidenced by allocation to the main effort of significantly greater combat power than the minimum required throughout its mission, accounting for expected losses. Mass is evidenced by the allocation of significantly more than the minimum combat power required at the decisive point.
answers at three levels of detail, from a very abstract one that shows the general line of
reasoning followed, to a very detailed one that indicates each of the knowledge pieces
used in generating the answer.
To assess a COA with respect to a principle of war or a tenet of Army operations, Disciple-COA successively reduces the assessment task to a simpler one. This process continues until one has enough information to recognize a
weakness or a strength. Consider, for example, the principle of surprise, whose definition
is provided in Table 12.3. As you can see, this is a very general description. How to apply
this general principle in actual situations is knowledge that is learned by military officers
during their lifetime. Therefore, developing an agent able to identify to what extent a
specific COA conforms to the principle of surprise involves capturing and representing the
knowledge of a military expert into the agent’s knowledge base.
Guided by this general definition and the COA from Figure 12.11, our subject matter
expert (Colonel Michael Bowman) has developed the reduction tree from Figure 12.12.
Notice how each successive question identifies a surprise-related feature of the COA until
a strength is recognized.
Through this kind of modeling of the COA critiquing process, each leaf may lead to
the identification of a strength or weakness. Then, the bottom-up solution synthesis
process consists only in accumulating all the identified strengths and weaknesses, just
as in the case of Disciple-WA, where the solution synthesis process accumulates the
elementary actions.
Notice also that, as in the case of Disciple-WA, the tasks are structured, consisting of a
name and a sequence of feature-value pairs.
Figure 12.12. Reduction tree for assessing COA411 with respect to the principle of surprise. ASSESS-COA-WRT-PRINCIPLE-OF-SURPRISE for COA411 is reduced to assessing surprise with respect to the presence of surprise factors, the presence of deception actions, countering enemy reconnaissance, and the application of surprising levels of combat power. The countering-enemy-reconnaissance branch is further reduced to ASSESS-SURPRISE-WHEN-ENEMY-RECON-IS-PRESENT (for RED-CSOP1 and SCREEN1) and finally to REPORT-STRENGTH-IN-SURPRISE-BECAUSE-OF-COUNTERING-ENEMY-RECON (for RED-CSOP1, SCREEN1, and DESTROY1, with importance “high”), which produces the following assessment:
Strength: There is a strength with respect to surprise in COA411 because it contains aggressive security/counter-reconnaissance plans, destroying enemy intelligence collection units and activities. Intelligence collection by RED-CSOP1 through SCREEN1 will be disrupted by its destruction by DESTROY1. This and similar actions prevent the enemy from ascertaining the nature and intent of friendly operations, thereby increasing the likelihood that the enemy will be surprised. This is a strength of high importance.
Reference: FM 100-5 pg 2-5, KF 118.1, KF 118.2, KF 118.3 - Surprise is achieved by striking/engaging the enemy in a time, place, or manner for which he is unprepared. The enemy can be surprised by the tempo of the operation, the size of the force, the direction or location of the main effort, and timing. Factors contributing to surprise include speed, effective intelligence, deception, application of unexpected combat power, operations security, and variations in tactics and methods of operation.
The objects in the knowledge base of Disciple-COA are described by specific features and values. For instance, the bottom part of Figure 12.13 shows
the description of the specific military unit called BLUE-TASK-FORCE1. BLUE-TASK-
FORCE1 is described as being both an ARMORED-UNIT-MILITARY-SPECIALTY and a
MECHANIZED-INFANTRY-UNIT-MILITARY-SPECIALTY. The other features describe
BLUE-TASK-FORCE1 as being at the battalion level; belonging to the blue side; being
designated as the main effort of the blue side; performing two tasks, PENETRATE1 and
CLEAR1; having a regular strength; and having four other units under its operational control.
The values of the features of BLUE-TASK-FORCE1 are themselves described in the same way.
For instance, one of the tasks performed by BLUE-TASK-FORCE1 is PENETRATE1.
Figure 12.13. A fragment of the object ontology of Disciple-COA (top), with concepts such as OBJECT, MODERN-MILITARY-ORGANIZATION, MODERN-MILITARY-UNIT-DEPLOYABLE, MILITARY-TASK, MILITARY-MANEUVER, COMPLEX-MILITARY-TASK, MILITARY-ATTACK, and the military specialties (MANEUVER-UNIT, AVIATION-UNIT, INFANTRY-UNIT, ARMORED-UNIT, MECHANIZED-INFANTRY-UNIT), and the description of BLUE-TASK-FORCE1 (bottom). BLUE-TASK-FORCE1 has ECHELON-OF-UNIT BATALLION-UNIT-DESIGNATION, SOVEREIGN-ALLEGIANCE-OF-ORG BLUE-SIDE, ASSIGNMENT MAIN-EFFORT1, TASK PENETRATE1 and CLEAR1, TROOP-STRENGTH-OF-UNIT REGULAR-STATUS, and OPERATIONAL-CONTROL-MILITARY-ORG BLUE-MECH-COMPANY1, BLUE-MECH-COMPANY2, BLUE-ARMOR-COMPANY1, and BLUE-ARMOR-COMPANY2; PENETRATE1 acts on RED-MECH-COMPANY4 with a FORCE-RATIO of 10.6 and a RECOMMENDED-FORCE-RATIO of 3, and is a task of the operation ATTACK2.
PENETRATE1 is defined as being a penetration task, and therefore inherits all the features of
the penetration tasks, in addition to the features that are directly associated with it.
The hierarchy of objects is used as a generalization hierarchy for learning by the
Disciple-COA agent. For instance, one way to generalize an expression is to replace an
object with a more general one from such a hierarchy. In particular, PENETRATE1 from
the bottom-right side of Figure 12.13 can be generalized to PENETRATE-MILITARY-TASK,
COMPLEX-MILITARY-TASK, MILITARY-MANEUVER, etc. The goal of the learning pro-
cess is to select the right generalization.
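As a rough illustration of how such a generalization hierarchy might be used, the following Python sketch walks the ancestor chain of an instance to enumerate its candidate generalizations; the PARENT table is a hypothetical, simplified fragment, not the actual Disciple-COA ontology.

```python
# Hypothetical, simplified fragment of the generalization hierarchy.
PARENT = {
    "PENETRATE1": "PENETRATE-MILITARY-TASK",
    "PENETRATE-MILITARY-TASK": "COMPLEX-MILITARY-TASK",
    "COMPLEX-MILITARY-TASK": "MILITARY-MANEUVER",
}

def candidate_generalizations(term):
    """Return the chain of increasingly general concepts above a term."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain

print(candidate_generalizations("PENETRATE1"))
# ['PENETRATE-MILITARY-TASK', 'COMPLEX-MILITARY-TASK', 'MILITARY-MANEUVER']
```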
The features used to describe the objects are themselves represented in the feature
hierarchy.
Figure 12.14. Task reduction steps for assessing COA411 with respect to the principle of surprise and the rules learned from them (R$ACWPOS-001, R$ASWCER-002, R$ASWERIP-003). The question “Which is an aspect that characterizes surprise?” with the answer “Enemy reconnaissance” reduces ASSESS-COA-WRT-PRINCIPLE-OF-SURPRISE to ASSESS-SURPRISE-WRT-COUNTERING-ENEMY-RECONNAISANCE; the question “Is an enemy reconnaissance unit present?” with the answer “Yes, RED-CSOP1, which is performing the reconnaissance action SCREEN1” reduces it to ASSESS-SURPRISE-WHEN-ENEMY-RECON-IS-PRESENT (for COA411, RED-CSOP1, SCREEN1); and the answer “Yes, RED-CSOP1 is destroyed by DESTROY1” reduces it to REPORT-STRENGTH-IN-SURPRISE-BECAUSE-OF-COUNTERING-ENEMY-RECON (for COA411, RED-CSOP1, SCREEN1, DESTROY1, with importance “high”).
Figure 12.15. The task reduction example ASSESS-SURPRISE-WRT-COUNTERING-ENEMY-RECONNAISANCE for COA411, its question (“Is an enemy reconnaissance unit present?”), its explanation (RED-CSOP1 SOVEREIGN-ALLEGIANCE-OF-ORG RED-SIDE; RED-CSOP1 TASK SCREEN1; SCREEN1 IS INTELLIGENCE-COLLECTION-MILITARY-TASK), and fragments of the learned rule, including a plausible lower bound condition (?O1 IS COA411; ?O2 IS RED-CSOP1, with SOVEREIGN-ALLEGIANCE-OF-ORG ?O4 and TASK ?O3; ?O3 IS SCREEN1; ?O4 IS RED-SIDE).
The explanation of this task reduction step consists of relations between certain elements from the agent’s ontology. The first explanation piece states, in the formal language of Disciple-COA, that RED-CSOP1 is an enemy unit. The second explanation piece expresses the fact that RED-CSOP1 is performing the action SCREEN1. Finally, the last explanation piece expresses the fact that SCREEN1 is a reconnaissance action. While an expert can understand the meaning of these formal expressions, he or she cannot easily define them, because the expert is not a knowledge engineer. For one thing,
the expert would need to use the formal language of the agent. But this would not be
enough. The expert would also need to know the names of the potentially many thousands
of concepts and features from the agent’s ontology.
While defining the formal explanations of this task reduction step is beyond the
individual capabilities of the expert or the agent, it is not beyond their joint capabilities.
Finding these explanation pieces is a mixed-initiative process of searching the agent’s
ontology, an explanation piece being a path of objects and relations in this ontology, as
discussed in Section 9.5. In essence, the agent uses analogical reasoning and help from
the expert to identify and propose a set of plausible explanation pieces from which the
expert has to select the correct ones. One explanation generation strategy is based on an
ordered set of heuristics for analogical reasoning. These heuristics exploit the hierarchies
of objects, features, and tasks to identify the rules that are similar to the current
reduction and to use their explanations as a guide to search for similar explanations of
the current example.
From the example reduction and its explanation in Figure 12.15, Disciple-COA auto-
matically generated the plausible version space rule in Figure 12.16. This is an IF-THEN
rule, the components of which are generalizations of the elements of the example in
Figure 12.15.
The rule in Figure 12.16 also contains two conditions for its applicability: a plausible
lower bound condition and a plausible upper bound condition. These conditions approxi-
mate an exact applicability condition that Disciple-COA attempts to learn. Initially, the
plausible lower bound condition covers only the example in Figure 12.15, restricting the
variables from the rule to take only the values from this example. It also includes the
relations between these variables that have been identified as relevant in the explanation
of the example. The plausible upper bound condition is the most general generalization of
the plausible lower bound condition. It is obtained by taking into account the domains and
the ranges of the features from the plausible lower bound condition and the tasks, in order
to determine the possible values of the variables. The domain of a feature is the set of objects
that may have that feature. The range is the set of possible values of that feature. For
instance, ?O2 is the value of the task feature FOR-UNIT, and has as features SOVEREIGN-ALLEGIANCE-OF-ORG and TASK. Therefore, any value of ?O2 has to be in the intersection of the range of FOR-UNIT, the domain of SOVEREIGN-ALLEGIANCE-OF-ORG, and the domain of TASK. This intersection is MODERN-MILITARY-UNIT-DEPLOYABLE.
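The intersection just described can be sketched as a plain set operation, treating each domain and range, for simplicity, as a flat set of admissible concepts; the contents of the RANGE and DOMAIN tables below are illustrative assumptions, chosen only so that the result matches the intersection named in the text.

```python
# Illustrative domain/range tables (assumed), treated as flat sets of concepts.
RANGE = {
    "FOR-UNIT": {"MODERN-MILITARY-UNIT-DEPLOYABLE", "MODERN-MILITARY-ORGANIZATION"},
}
DOMAIN = {
    "SOVEREIGN-ALLEGIANCE-OF-ORG": {"MODERN-MILITARY-UNIT-DEPLOYABLE",
                                    "MODERN-MILITARY-ORGANIZATION"},
    "TASK": {"MODERN-MILITARY-UNIT-DEPLOYABLE"},
}

def possible_concepts(introducing_feature, required_features):
    """Intersect the range of the feature that introduces a variable with the
    domains of the features that the variable is known to have."""
    result = set(RANGE[introducing_feature])
    for feature in required_features:
        result &= DOMAIN[feature]
    return result

print(possible_concepts("FOR-UNIT", ["SOVEREIGN-ALLEGIANCE-OF-ORG", "TASK"]))
# {'MODERN-MILITARY-UNIT-DEPLOYABLE'}
```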
The learned rules, such as the one in Figure 12.16, are used in problem solving to
generate task reductions with different degrees of plausibility, depending on which of their
conditions are satisfied. If the plausible lower bound condition is satisfied, then the
reduction is very likely to be correct. If the plausible lower bound condition is not satisfied,
but the plausible upper bound condition is satisfied, then the solution is considered only
plausible. Any application of such a partially learned rule, however, either successful or
not, provides an additional (positive or negative) example, and possibly an additional
explanation, that are used by the agent to improve the rule further through the general-
ization and/or specialization of its conditions.
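A minimal sketch of how such a partially learned rule might be applied is given below, assuming two hypothetical condition checkers (lower_bound_holds, upper_bound_holds) supplied by the caller; this is not the actual Disciple-COA interface.

```python
def apply_rule(rule, situation, lower_bound_holds, upper_bound_holds):
    """Classify a proposed reduction by which applicability bound it satisfies."""
    if lower_bound_holds(rule, situation):
        return "routine"        # very likely to be correct
    if upper_bound_holds(rule, situation):
        return "innovative"     # only plausible; expert feedback will refine the rule
    return "not applicable"

# Hypothetical usage with stub checkers:
print(apply_rule({"name": "R$ASWCER-002"}, {"unit": "RED-CSOP2"},
                 lower_bound_holds=lambda r, s: False,
                 upper_bound_holds=lambda r, s: True))   # -> innovative
```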
Figure 12.16. Plausible version space rule (R$ASWCER-002) learned from the example and the explanation in Figure 12.15; among other things, its conditions constrain ?O3 to be an INTELLIGENCE-COLLECTION-MILITARY-TASK and ?O4 to be RED-SIDE.
Let us consider again the specific task reductions from Figure 12.14. At least for the
elementary tasks, such as the one at the bottom of the figure, the expert also needs to
express them in natural language: “There is a strength with respect to surprise in COA411
because it contains aggressive security/counter-reconnaissance plans, destroying enemy
intelligence collection units and activities. Intelligence collection by RED-CSOP1 will be
disrupted by its destruction by DESTROY1.”
Similarly, the expert would need to indicate the reference (source) material for the
concluded assessment. The learned rules contain generalizations of these phrases that
are used to generate answers in natural language, as illustrated in the bottom part
of Figure 12.12. Similarly, the generalizations of the questions and the answers from
the rules applied to generate a solution are used to produce an abstract justification of
the reasoning process.
As Disciple-COA learns plausible version space rules, it can use them to propose
routine or innovative solutions to the current problems. The routine solutions are those
that satisfy the plausible lower bound conditions of the rules and are very likely to be
correct. Those that are not correct are kept as exceptions to the rule. The innovative
solutions are those that do not satisfy the plausible lower bound conditions but satisfy the
plausible upper bound conditions. These solutions may or may not be correct, but in each
case they lead to the refinement of the rules that generated them. Let us consider the
situation illustrated in Figure 12.17. After it has been shown how to critique COA411 with
respect to the principle of security, Disciple-COA is asked to critique COA421. COA421
is similar to COA411, except that in this case the enemy recon unit is not destroyed.
Because of this similarity, Disciple-COA is able to propose the two top reductions in
Figure 12.17. Both of them are innovative reductions that are accepted by the expert.
Therefore, Disciple-COA generalizes the plausible lower bound conditions of the corres-
ponding rules, as little as possible, to cover these reductions and to remain less general or
at most as general as the corresponding plausible upper bound conditions.
The last reduction step in Figure 12.17 has to be provided by the expert because no rule
of Disciple-COA is applicable. We call the expert-provided reduction a creative problem-
solving step. From each such reduction, Disciple-COA learns a new task reduction rule, as
was illustrated in the preceding example.
Through refinement, the task reduction rules may become significantly more complex
than the rule in Figure 12.16. For instance, when a reduction proposed by Disciple-COA is
rejected by the expert, the agent attempts to find an explanation of why the reduction is
wrong. Then the rule may be refined with an Except-When plausible version space
condition. The bounds of this version space are generalizations of the explanations that
should not hold in order for the reduction rule to be applicable.
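The refinement behavior described in this section can be summarized by the following sketch; minimal_generalization and generalize_explanation stand in for the ontology-based generalization operations and are assumptions, not Disciple-COA functions.

```python
def refine_rule(rule, reduction, accepted, failure_explanation,
                minimal_generalization, generalize_explanation):
    """Sketch of refining a plausible version space rule after expert feedback."""
    if accepted:
        # Generalize the lower bound as little as possible to cover the new example,
        # staying at most as general as the plausible upper bound.
        rule["lower_bound"] = minimal_generalization(
            rule["lower_bound"], reduction, limit=rule["upper_bound"])
    elif failure_explanation is not None:
        # Add an Except-When plausible version space condition whose bounds are
        # generalizations of the explanation of why the reduction is wrong.
        rule.setdefault("except_when", []).append(
            generalize_explanation(failure_explanation))
    else:
        # A rejected reduction with no explanation is kept as an exception to the rule.
        rule.setdefault("exceptions", []).append(reduction)
    return rule
```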
Figure 12.17. Task reductions for assessing COA421 with respect to the principle of surprise. The first two reductions (ASSESS-COA-WRT-PRINCIPLE-OF-SURPRISE for COA421, reduced via “Which is an aspect that characterizes surprise? Enemy reconnaissance” to ASSESS-SURPRISE-WRT-COUNTERING-ENEMY-RECONNAISANCE, and then to ASSESS-SURPRISE-WHEN-ENEMY-RECON-IS-PRESENT for COA421, RED-CSOP2, and SCREEN2) refine the rules R$ACWPOS-001 and R$ASWCER-002; the last reduction step, provided by the expert, leads to a new rule, R$ASWERIP-004.
In any case, comparing the left-hand side of Figure 12.15 (which is defined by the domain
expert) with the rule from Figure 12.16 (which is learned by Disciple-COA) suggests the
usefulness of a Disciple agent for knowledge acquisition. In the conventional knowledge
engineering approach, a knowledge engineer would need to manually define and debug a
rule such as the one in Figure 12.16. With Disciple, the domain expert needs only to define
an example reduction, because Disciple learns and refines the corresponding rule.
The charts show the metric Recall (total score) and the recall breakdown by criteria (Correctness, Justification, Intelligibility, Sources, Proactivity, Total) for the Tek/Cyc, ISI-Expect, GMU, ISI-Loom, and integrated (ALL) critiquers.
Figure 12.18. The performance of the COA critiquers and of the integrated system (reprinted with permission from Eric Jones).
Figure 12.19 compares the recall and the coverage of the developed critiquers for the
last three most complex items of the evaluation. For each item, the beginning of each
arrow shows the coverage and recall for the initial testing phase, and the end of the arrow
shows the same data for the modification phase. In this graph, the results that are above
and to the right are superior to the other results. This graph also shows that all the systems
increased their coverage during the evaluation. In particular, the knowledge base of
Disciple-COA increased by 46 percent (from the equivalent of 6,229 simple axioms to
9,092 simple axioms), which represents a very high rate of knowledge acquisition of
286 simple axioms per day.
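These growth figures are mutually consistent, as a line of arithmetic shows (the ten-day length of the modification period is inferred from the per-day rate, not stated here):

```python
initial, final = 6229, 9092
added = final - initial                  # 2863 simple axioms added
print(round(100 * added / initial))      # 46 (percent increase)
print(round(added / 10))                 # 286 (axioms per day over an assumed 10-day period)
```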
During August 1999, we conducted a one-week knowledge acquisition experiment with
Disciple-COA, at the U.S. Army Battle Command Battle Lab, in Fort Leavenworth, Kansas,
to test the claim that domain experts who do not have prior knowledge engineering
experience can teach Disciple-COA (Tecuci et al., 2001). The experiment involved four
such military experts and had three phases: (1) a joint training phase during the first three
days, (2) an individual teaching experiment on day four, and (3) a joint discussion of the
experiment on day five. The entire experiment was videotaped. The training for the
experiment included a detailed presentation of Disciple’s knowledge representation,
problem-solving, and learning methods and tools. For the teaching experiment, each
expert received a copy of Disciple-COA with a partial knowledge base. This knowledge
base was obtained by removing the tasks and the rules from the complete knowledge base
of Disciple-COA. That is, the knowledge base contained the complete ontology of objects,
object features, and task features.
Figure 12.19. Coverage versus recall, pre-repair and post-repair, for the TEK/CYC, GMU (Disciple), and ISI (Expect) critiquers on evaluation items 3, 4, and 5 (reprinted with permission from Eric Jones).
We also provided the experts with the descriptions of
three COAs (COA411, COA421, and COA51), to be used for training Disciple-COA. These
were the COAs used in the final phases of DARPA’s evaluation of all the critiquers.
Finally, we provided and discussed with the experts the modeling of critiquing these COAs
with respect to the principles of offensive and security. That is, we provided the experts
with specific task reductions, similar to the one from Figure 12.17, to guide them in
teaching Disciple-COA. After that, each expert taught Disciple-COA independently while
being supervised by a knowledge engineer, whose role was to help the expert if he or she
reached an impasse while using Disciple-COA.
Figure 12.20 shows the evolution of the knowledge base during the teaching process for one of the experts, which is representative of all four experts. In the morning, the expert taught Disciple-COA to critique COAs with respect to the principle of offensive, and in the afternoon he taught it to critique COAs with respect to the principle of security. In both cases, the expert first used COA411, then COA421, and then COA51. As one can see from Figure 12.20, Disciple-COA initially learned more rules, and then the emphasis shifted to
rule refinement. Therefore, the increase in the size of the knowledge base is greater toward
the beginning of the training process for each principle. The teaching for the principle of
offensive took 101 minutes. During this time, Disciple-COA learned fourteen tasks and
fourteen rules (147 simple axioms’ equivalent). The teaching for security took place in the
afternoon and consisted of 72 minutes of interactions between the expert and Disciple-
COA. During this time, Disciple-COA learned fourteen tasks and twelve rules (136 simple
axioms’ equivalent). There was little or no assistance from the knowledge engineer with respect to the teaching itself. The knowledge acquisition rate obtained during the experiment
was very high (approximately nine tasks and eight rules per hour, or ninety-eight simple
axioms’ equivalent per hour). At the end of this training process, Disciple-COA was able to
correctly identify seventeen strengths and weaknesses of the three COAs with respect to
the principles of offensive and security.
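The quoted rate of roughly ninety-eight simple axioms per hour follows directly from the two session figures:

```python
minutes = 101 + 72          # offensive + security teaching sessions
axioms = 147 + 136          # simple-axiom equivalents learned in the two sessions
print(round(axioms / (minutes / 60)))    # 98 simple axioms per hour
```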
The chart plots the number of axioms in the knowledge base (task axioms, rule axioms, and their total), measured initially and after training on COA411, COA421, and COA51 for each principle.
Figure 12.20. The evolution of the knowledge base during the teaching process.
After the experiment, each expert was asked to fill in a detailed questionnaire designed
to collect subjective data for usability evaluation. The answers took into account that Disciple-COA was a research prototype and not a commercial product, and each question was rated on an agreement scale from 1 to 5, with 1 denoting “not at all” and 5 denoting “very.” For illustration, Table 12.4 shows three questions and the answers
provided by the four experts.
In conclusion, Disciple-COA demonstrated the generality of its learning methods that
used an object ontology created by another group, namely Teknowledge and Cycorp
(Boicu et al., 1999). It also demonstrated high rule learning rates, as compared with
manual definition of rules, and better performance than the evaluating experts, with many
unanticipated solutions.
Table 12.4 Questions from the Usability Questionnaire and the Experts’ Answers
Question: Do you think that Disciple is a useful tool for knowledge acquisition?
Answers: Rating 5. Absolutely! The potential use of this tool by domain experts is only limited by their imagination – not their AI programming skills. | 5 | 4 | Yes, it allowed me to be consistent with logical thought.
12.4 Disciple-COG: Center of Gravity Analysis
The concept of the center of gravity (COG) was introduced by Carl von Clausewitz (1832) as “the foundation of capability, the hub of all
power and movement, upon which everything depends, the point against which all the
energies should be directed.” It is currently defined as comprising the source of power
that provides freedom of action, physical strength, and will to fight (Joint Chiefs of Staff,
2008, IV–X).
It is recognized that, “Should a combatant eliminate or influence the enemy’s strategic
center of gravity, the enemy would lose control of its power and resources and eventually
fall to defeat. Should a combatant fail to adequately protect his own strategic center of
gravity, he invites disaster” (Giles and Galvin, 1996, p. 1). Therefore, the main goal of any
force should be to eliminate or influence the enemy’s strategic center of gravity while
adequately protecting its own.
Correctly identifying the centers of gravity of the opposing forces is of highest import-
ance in any conflict. Therefore, all the U.S. senior military service colleges emphasize
center of gravity analysis in the education of strategic leaders (Warden, 1993; Echevarria,
2003; Strange and Iron, 2004a, 2004b; Eikmeier, 2006).
In spite of the apparently simple definition of the center of gravity, its determination
requires a wide range of background knowledge, not only from the military domain but
also from the economic, geographic, political, demographic, historic, international, and
other domains (Giles and Galvin, 1996). In addition, the adversaries involved, their goals,
and their capabilities can vary in important ways from one situation to another. When
performing this analysis, some experts may rely on their own professional experience and
intuitions without following a rigorous approach.
Recognizing these difficulties, the Center for Strategic Leadership of the U.S. Army War
College started an effort in 1993 to elicit and formalize the knowledge of a number of
experts in center of gravity. This research resulted in a COG monograph (Giles and Galvin,
1996). This monograph made two significant contributions to the theory of center of
gravity analysis. The first was a systematic analysis of the various factors (e.g., political,
military, economic, etc.) that have to be taken into account for center of gravity determin-
ation. The second significant contribution was the identification of a wide range of center
of gravity candidates.
A significant advancement of the theory of center of gravity analysis was the CG-CC-CR-CV model introduced by Strange (1996), which relates a center of gravity (CG) to its critical capabilities (CC), to the critical requirements (CR) that are the essential conditions, resources, and means for those capabilities to be fully operative, and to the critical vulnerabilities (CV) of these requirements.
Building primarily on the work of Strange (1996) and Giles and Galvin (1996), we have
developed a computational approach to center of gravity analysis, which is summarized in
Figure 12.21.
Given: A strategic situation (e.g., the invasion of Iraq by the U.S.-led coalition in 2003).
Determine: The strategic centers of gravity of the opposing forces and their critical
vulnerabilities.
This approach consists of three main phases: assessment of the strategic situation,
identification of center of gravity candidates, and testing of the identified candidates.
During the assessment of the situation (such as the invasion of Iraq by the U.S.-led
coalition in 2003), one assembles and assesses data and other relevant aspects of the
strategic environment, including the opposing forces (Iraq, on one side, and the U.S.-led
coalition, on the other side), their strategic goals, political factors (e.g., type of government,
governing bodies), military factors (e.g., leaders, will, and capability), psychosocial factors
(e.g., motivation, political activities), economic factors (e.g., type of economy, resources),
and so on. This assessment will be used in the next phases of center of gravity analysis.
During the identification phase, strategic center of gravity candidates are identified
from a belligerent’s elements of power, such as its leadership, government, military,
people, or economy. For example, a strong leader, such as Saddam Hussein or George
W. Bush, could be a center of gravity candidate with respect to the situation at the
beginning of the Iraq War in 2003. The result of this phase is the identification of a wide
range of candidates.
During the testing phase, each candidate is analyzed to determine whether it has all the
critical capabilities that are necessary to be the center of gravity. For example, a leader
needs to be secure; informed; able to maintain support from the government, the military,
and the people; and irreplaceable. For each capability, one needs to determine the
existence of the essential conditions, resources, and means that are required by that
capability to be fully operative. For example, some of the protection means of Saddam
Hussein were the Republican Guard Protection Unit, the Iraqi Military, the Complex of
Iraqi Bunkers, and the System of Saddam doubles. Once these means of protection are
identified, one needs to determine whether any of them, or any of their components, are
vulnerable. For example, the Complex of Iraqi Bunkers is vulnerable because their location
and design are known to the U.S.-led coalition and could be destroyed.
Based on the results of the analysis, one can eliminate any center of gravity candidate
that does not have all the required critical capabilities, and select the centers of gravity
from the remaining candidates. Moreover, the process also identifies the critical vulner-
abilities of the selected centers of gravity.
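The testing phase described above can be sketched as a filtering loop over candidates; the data layout and the example entry below are illustrative, not the actual Disciple-COG representation.

```python
def test_candidates(candidates, required_capabilities):
    """Keep only the candidates that have every required critical capability and
    collect the critical vulnerabilities of the means supporting those capabilities."""
    results = []
    for candidate in candidates:
        capabilities = candidate["capabilities"]    # capability -> list of means
        if not all(c in capabilities for c in required_capabilities):
            continue    # eliminated: a required critical capability is missing
        vulnerabilities = [v for means in capabilities.values()
                           for m in means
                           for v in m.get("vulnerabilities", [])]
        results.append((candidate["name"], vulnerabilities))
    return results

# Illustrative candidate, using the example discussed in the text.
saddam = {
    "name": "Saddam Hussein",
    "capabilities": {
        "be protected": [
            {"name": "Complex of Iraqi Bunkers",
             "vulnerabilities": ["location and design known to the U.S.-led coalition"]},
            {"name": "Republican Guard Protection Unit", "vulnerabilities": []},
        ],
        # ... other critical capabilities (stay informed, maintain support, ...)
    },
}
print(test_candidates([saddam], ["be protected"]))
```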
An important characteristic of this approach is that it is both natural for a human and
appropriate for automatic processing. By using this approach, we have developed the
Disciple-COG agent, briefly described in the following section.
Figure 12.22 shows the initial interaction screen after the user has clicked on “situation.” The right-hand side shows the prompts of Disciple-COG and the information provided by the user, such as the name of the scenario and the names of the opposing forces.
Once the user names the opposing forces (i.e., “Allied Forces 1943” and “European Axis
1943”), Disciple-COG includes them into the table of contents, as shown in the left-hand
side of Figure 12.22. Then, when the user clicks on one of the opposing forces (e.g., “Allied
Forces 1943”), Disciple-COG asks for its characteristics, as indicated in the right-hand side
of Figure 12.23 (e.g., “What kind of force is Allied Forces 1943?”). Because the user
characterized “Allied Forces 1943” as a multistate force (by clicking on one of the options
offered by the agent), Disciple-COG further asks for its members and extends the table of
contents with the provided names (i.e., “US 1943,” “Britain 1943,” “USSR 1943,” etc.) and
their relevant aspects (i.e., “Strategic goal,” “Political factors,” “Military factors,” etc.), as
shown in the left-hand side of Figure 12.23. The user can now click on any such aspect and
will be asked specific questions by Disciple-COG.
Thus, the user’s answers lead to the generation of new items in the left-hand side of the
window, and trigger new questions from the agent, which depend on the answers pro-
vided by the user. Through such context-dependent questions, Disciple-COG guides the
user to research, describe, and assess the situation.
As will be discussed in Section 12.4.4, once the user describes various aspects of
the situation, Disciple-COG automatically extends its ontology with the corresponding
representations. The user is not required to answer all the questions, and Disciple-COG
can be asked, at any time, to identify and test the strategic center of gravity candidates for
the current description of the situation. The COG analysis process uses the problem
reduction and solution synthesis paradigm presented in Section 4.2, following the CG-
CC-CR-CV model (Strange, 1996).
Figure 12.24 shows the interface of the Mixed-Initiative Reasoner of Disciple-COG that
displays the automatically generated analysis. The left-hand side shows an abstract view of
the analysis tree for the problem “Analyze the strategic COG candidates for the WW II
Europe 1943 situation.”
First, the problem of analyzing the strategic COG candidates for this situation is
reduced to analyzing the COG candidates for each of the two opposing forces. Then,
because each of the opposing forces is a multimember force, the problem of analyzing
the COG candidates for an opposing force is reduced to two other problems: (1) the
problem of analyzing the COG candidates for each member of the multimember force
(e.g., US 1943 candidates, Britain 1943 candidates, and USSR 1943 candidates, in the case
of Allied Forces 1943) and (2) the problem of analyzing the multimember COG
candidates.
Continuing, the problem of analyzing the US 1943 candidates is reduced to analyzing
the COG candidates with respect to the main elements of power of US 1943, namely
people of US 1943, government of US 1943, armed forces of US 1943, and economy of
US 1943.
Because the abstract problem “US 1943 candidates” is selected in the left-hand side of
the interface of the Mixed-Initiative Reasoner, the right-hand side shows the detailed
description of the corresponding reduction tree. Notice that the detailed tree shows both
complete problem descriptions and the question/answer pairs that guide their reductions.
The leaves of the detailed tree correspond to the abstract subproblems of “US 1943
candidates,” such as “Candidates wrt people of US 1943.”
The user can browse the entire analysis tree generated by Disciple-COG by clicking on
the nodes and the plus (+) and minus (–) signs.
Figure 12.24. Abstract (left) and detailed (right) COG reduction tree.
Figure 12.25. Reduction tree for testing a national leader as a COG candidate.
For example, Figure 12.25 shows how the problems of analyzing the COG candidates with respect to the main elements of power of
US 1943 (government of US 1943, and armed forces of US 1943) are reduced to identifying
and testing specific COG candidates (i.e., President Roosevelt and military of US 1943).
Testing each of the identified COG candidates is reduced to the problems of testing
whether it has all the necessary critical capabilities. Thus, testing President Roosevelt as a
potential COG candidate is reduced to seven problems, each testing whether President
Roosevelt has a certain critical capability, as shown in the left side of Figure 12.25.
The left-hand side of Figure 12.26 shows how testing whether a COG candidate has a certain critical capability is reduced to testing whether the corresponding critical requirements are satisfied. In particular, testing whether President Roosevelt has the critical capability to stay informed is reduced to the problem of testing whether he has
means to receive essential intelligence. These means are identified as US Office of Strategic
Services 1943, US Navy Intelligence 1943, and US Army Intelligence 1943. Consequently, the
user is asked to assess whether each of them has any significant vulnerability. The user
clicks on one of the means (e.g., US Office of Strategic Services 1943 in the left-hand side of
Figure 12.26) and the agent displays two alternative solution patterns for its assessment, in
the right-hand side of Figure 12.26:
Figure 12.26. Assessing whether a critical requirement has any significant vulnerability.
The user has to complete the instantiation of one of the two patterns and then click on the corresponding Save button. In this case, the provided solution is that US Office of Strategic Services 1943 has the following significant vulnerability: There is a huge amount of information that needs to be collected and analyzed by the US Office of Strategic Services 1943.
Up to this point we have presented the automatic generation of the top-down COG
reduction tree and the evaluation of the elementary problems (i.e., potential vulnerabil-
ities) by the user.
The next stage of the COG analysis process is the bottom-up automatic synthesis of the
elementary solutions. This will be illustrated in the following, starting with Figure 12.27,
which shows how the assessments of President Roosevelt’s individual means to receive
essential intelligence (i.e., US Office of Strategic Services 1943, US Navy Intelligence 1943, and
US Army Intelligence 1943) are combined to provide an overall assessment of his means to
receive essential intelligence:
President Roosevelt has means to receive essential intelligence (US Office of Strategic
Services 1943, US Navy Intelligence 1943, and US Army Intelligence 1943). The US Office
of Strategic Services 1943 has the following significant vulnerability: There is a huge
amount of information that needs to be collected and analyzed by the US Office of
Strategic Services 1943.
The leaf solutions in Figure 12.27 have a yellow background to indicate that they are
assessments made by the user. The top-level solution obtained through their combination
has a green background to indicate that it was automatically computed by Disciple-COG,
based on a previously learned synthesis rule. This rule indicates the pattern of the solution
and how it is obtained by combining elements of the patterns of the subsolutions. In
particular, notice that the means from individual solutions are gathered into a single list.
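A minimal sketch of such a pattern-based synthesis step is shown below, under the assumption that each subsolution records the means it assessed and any vulnerability text; the field names are illustrative, not the actual Disciple-COG representation.

```python
def synthesize_means_assessment(leader, capability_phrase, subsolutions):
    """Combine per-means assessments into one overall assessment:
    gather the means into a single list and append any reported vulnerabilities."""
    means = [s["means"] for s in subsolutions]
    text = f"{leader} has {capability_phrase} ({', '.join(means)})."
    for s in subsolutions:
        if s.get("vulnerability"):
            text += (f" The {s['means']} has the following significant "
                     f"vulnerability: {s['vulnerability']}.")
    return text

subs = [
    {"means": "US Office of Strategic Services 1943",
     "vulnerability": "There is a huge amount of information that needs to be "
                      "collected and analyzed by the US Office of Strategic Services 1943"},
    {"means": "US Navy Intelligence 1943"},
    {"means": "US Army Intelligence 1943"},
]
print(synthesize_means_assessment("President Roosevelt",
                                  "means to receive essential intelligence", subs))
```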
Figure 12.27. Obtaining the overall assessment of a critical requirement by combining individual
assessments.
The next bottom-up solution synthesis step is to obtain the assessment of a critical
capability by combining the assessments of its critical requirements. Figure 12.28, for
instance, shows the assessment of President Roosevelt’s critical capability to maintain
support, obtained by combining the assessments of the corresponding means (i.e., means
to secure support from the government, means to secure support from the military, and
means to secure support from the people). This synthesis operation was previously
explained in Section 4.2 and illustrated in Figure 4.8 (p. 117).
Next, Disciple-COG obtains the assessment of a COG candidate based on the assessments of its critical capabilities, as illustrated in Figure 12.29.
All the identified COG candidates from the analyzed situation are evaluated in a similar
way, and the final solution is a summary of the results of these evaluations, as illustrated
in Figure 12.30. In particular, for the WW II Europe 1943 situation, the solution is
the following one:
For European Axis 1943, choose the strategic center of gravity from the following
candidates: military of Germany 1943 and industrial capacity of Germany 1943. For
Allied Forces 1943, choose the strategic center of gravity from the following
candidates: military of USSR 1943, financial capacity of USSR 1943, industrial capacity
of USSR 1943, will of the people of Britain 1943, military of Britain 1943, financial
capacity of Britain 1943, will of the people of US 1943, military of US 1943, and industrial
capacity of US 1943.
The subsolutions of this top-level solution indicate all the COG candidates considered and
why several of them have been eliminated.
Figure 12.28. Obtaining the overall assessment of a critical capability based on its critical
requirements.
Figure 12.29. Obtaining the assessment of a COG candidate based on its critical capabilities.
Figure 12.30. Result of the evaluation of the COG candidates corresponding to a situation.
At the end of the analysis, Disciple-COG generates a draft analysis report, a fragment of
which is shown in Figure 12.31. The first part of this report contains a description of the
strategic situation that is generated from the information provided and assessed by the
user, as illustrated in Figures 12.22 and 12.23. The second part of the report includes all the
center of gravity candidates identified by Disciple-COG, together with their analyses, as
previously discussed. The user may now finalize this report by examining the analysis of
each center of gravity candidate and by completing, correcting, or even rejecting it and
providing a different analysis.
Successive versions of Disciple-COG have been used for ten years in courses at the
U.S. Army War College (Tecuci et al., 2008b). It has also been used at the Air War College
and the Joint Forces Staff College. The use of Disciple-COG in such an educational
environment is productive for several reasons. First, the user is guided in performing a
detailed and systematic assessment of the most important aspects of a strategic situation,
which is necessary in order to answer Disciple-COG’s questions. Second, the agent
generates its solutions by employing a systematic analysis, which was learned from a
military expert. Therefore, the user can learn how to perform a similar analysis from
Disciple-COG. Third, the details of the analysis and the actual results reflect the personal
judgment of the user, who has unique military experiences and biases and has a
personal interpretation of certain facts. Thus, the analysis is unique to the user, who
can see how his or her understanding of the situation determines the results yielded by
Disciple-COG.
It is important to note, however, that the solutions generated by Disciple-COG must be
critically analyzed at the end.
Fragments of the ontology of Disciple-COG: a fragment under object that includes international factor, and a fragment of the political factor ontology covering controlling groups and controlling leaders, with concepts such as god king, monarch, dictator, autocratic leader, religious leader, religious figure, military leader, commander in chief, totalitarian government, theocratic government, democratic government, cabinet or staff, chief and tribal council, police state, military dictatorship, religious dictatorship, secret police, political party, political staff, military organization, and law enforcement organization.
When the user starts using the agent, Disciple-COG elicits the description of the situation
or scenario to be analyzed, as was illustrated at the beginning of Section 12.4.2 and in
Figures 12.22 (p. 367) and 12.23 (p. 368). Scenario elicitation is guided by elicitation scripts
that are associated with the concepts and features from the ontology of Disciple-COG. For
example, the elicitation script for the feature has as opposing force is shown in Figure 12.35.
The script indicates the question to be asked (“Name the opposing forces in <scenario
name>:”), the variable that will hold the answer received from the user (“<opposing
force>”), the graphical appearance of the interface (“multiple line, height 4”), the way the
ontology will be extended with the elicited opposing force (“<opposing force> instance of
opposing force,” and “<scenario name> has as opposing force <opposing force>”), and the
next script to call (“Elicit properties of the instance <opposing force> in new window”).
An illustration of the execution of this script was provided at the bottom of Figure 12.22
(p. 367). Figure 12.36 shows the effect of this execution on the ontology of Disciple-COG.
Before script execution, the relevant part of the ontology is the one from the top of
Figure 12.36. The execution of the script causes Disciple-COG to prompt the user as
follows: “Name the opposing forces in WW II Europe 1943.” Once the user provides these
names (“Allied Forces 1943” and “European Axis 1943”), Disciple-COG introduces them as
instances of opposing force, and connects the “WW II Europe 1943” scenario to them, as
indicated in the script and illustrated at the bottom part of Figure 12.36.
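The behavior of such an elicitation script can be sketched as a small interpreter; the script dictionary below mirrors the fields shown in Figure 12.35, while ask_user and the triple-based ontology are simplifying assumptions rather than the actual Disciple-COG machinery.

```python
def run_script(script, bindings, ontology, ask_user):
    """Execute one elicitation script: ask the question, bind the answers,
    apply the ontology actions, and return the scripts to call next."""
    question = script["question"].format(**bindings)
    answers = ask_user(question)                       # e.g., a multiple-line control
    next_calls = []
    for answer in answers:
        b = dict(bindings, **{script["answer_variable"]: answer})
        for subj, pred, obj in script["ontology_actions"]:
            ontology.append((subj.format(**b), pred, obj.format(**b)))
        next_calls.extend((call, b) for call in script["script_calls"])
    return next_calls

opposing_force_script = {
    "question": "Name the opposing forces in {scenario}:",
    "answer_variable": "opposing_force",
    "ontology_actions": [
        ("{opposing_force}", "instance of", "opposing force"),
        ("{scenario}", "has as opposing force", "{opposing_force}"),
    ],
    "script_calls": ["elicit properties of the instance"],
}

ontology = []
run_script(opposing_force_script, {"scenario": "WW II Europe 1943"}, ontology,
           ask_user=lambda q: ["Allied Forces 1943", "European Axis 1943"])
print(ontology)   # the new instances and their connections to the scenario
```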
Disciple-COG contains a Script Editor that allows easy definition of the elicitation scripts by a knowledge engineer (KE). Figure 12.37 shows how the KE has defined the script from Figure 12.35. There are a few differences in the naming of some entities in Figure 12.37, corresponding to an older version of the Disciple system.
Figure 12.34. Fragment of the hierarchy of features corresponding to the ontology of controlling leaders, including has as controlling leader (domain: agent, range: person) and has as commander in chief (domain: force, range: person).
Script type: Elicit the feature has as opposing force for an instance <scenario name>
Controls:
Question: Name the opposing forces in <scenario name>:
Answer variable: <opposing force>
Control type: multiple line, height 4
Ontology actions:
<opposing force> instance of opposing force
<scenario name> has as opposing force <opposing force>
Script calls:
Elicit properties of the instance <opposing force> in new window
Figure 12.35. The elicitation script associated with the feature has as opposing force.
In the Feature Hierarchy Browser, the KE has selected the feature has_as_opposing_force and then clicked on the Script button. As a result, an editor for the elicitation script was opened, as shown in the right-hand side of Figure 12.37. Then the KE has selected the type of script to be defined and has specified its controls (the question to be asked, the answer variable, and the control type).
Before the execution of the script, the ontology contains the concepts object, scenario, force, and opposing force, with WW II Europe 1943 as an instance of scenario; after the execution, Allied Forces 1943 and European Axis 1943 are added as instances of opposing force, connected to WW II Europe 1943 by the has as opposing force feature.
Figure 12.36. The effect of the execution of an elicitation script on the ontology.
Next, in the “Ontology actions” panel shown in the middle-right of Figure 12.37, the KE has indicated how the ontology will be extended with the elicited values of the specified variables (i.e., <opposing force> instance of opposing force, and <scenario name> has as opposing force <opposing force>). Finally, as shown at the bottom-right of Figure 12.37, the KE has indicated the script to be called after the execution of the current script (“elicit properties of an instance”), how it will be displayed (in a new window), and its parameters (<opposing-force-name> and Opposing_force).
Figure 12.38. Modeling (left) and learning (right): the expert provides examples of correct task reduction steps, and Disciple-COG learns corresponding task reduction rules (Rule1, Rule2, Rule3).
We need to: Analyze the strategic COG candidates for WWII Europe 1943.
Which is an opposing force in the WWII Europe 1943 scenario? Allied Forces 1943.
Therefore we need to: Analyze the strategic COG candidates for Allied Forces 1943. (Rule1)
What type of force is Allied Forces 1943? Allied Forces 1943 is a multimember force.
Therefore we need to: Analyze the strategic COG candidates for Allied Forces 1943, which is a multimember force. (Rule2)
What type of center of gravity should we consider for this multimember force? We consider one corresponding to a member of it.
Therefore we need to: Analyze the strategic COG candidates for a member of Allied Forces 1943. (Rule3)
The question and its answer from the problem reduction step represent the expert’s
reason (or explanation) for performing that reduction. Because they are in natural lan-
guage, the expert has to help Disciple-COG “understand” them in terms of the concepts
and the features from the object ontology. For instance, the meaning of the question/
answer pair from the example in Figure 12.39 (i.e., “Which is a member of Allied Forces
1943? US 1943”) is “Allied Forces 1943 has as member US 1943.”
Based on the example and its explanation from the left-hand side of Figure 12.39,
Disciple-COG learns the rule from the right-hand side of Figure 12.39.
The structure of the rule is generated from the structure of the example where each
instance (e.g., Allied Forces 1943) and each constant (if present) is replaced with a variable
(e.g., ?O1). The variables are then used to express the explanation of the example as a very
specific applicability condition of the rule, as shown in the bottom-left part of Figure 12.39.
Finally, the plausible version space condition of the rule is generated by generalizing
the specific applicability condition in two ways.
The plausible lower bound condition is the minimal generalization of the specific
condition, which does not contain any specific instance. This generalization is performed
in the context of the agent’s ontology, in particular the ontology fragment shown in
Figure 10.22 (p. 323). The least general concepts from the ontology in Figure 10.22 that
cover Allied Forces 1943 are opposing force and equal partners multistate alliance. However,
Allied Forces 1943 has the feature has as member, and therefore any of its generalizations
should be in the domain of this feature, which happens to be multimember force. As a
consequence, the minimal generalization of Allied Forces 1943 consists of these least general covering concepts restricted to the domain of has as member, namely multimember force. Similarly, but using the range of the has as member feature (which is force), Disciple-COG determines the minimal generalization of US 1943.
The reason the lower bound cannot contain any instance is that the learned rule will be
used by Disciple-COG in other scenarios (such as Afghanistan 2001–2002), where the
instances from WWII Europe 1943 do not exist, and Disciple-COG would not know how
to generalize them.
The plausible upper bound condition is the maximal generalization of the specific condition and is generated in a similar way, based on the domains and the ranges of the features. In particular, the maximal generalization of Allied Forces 1943 is determined by the domain of the has as member feature, which is multimember force.
As Disciple-COG learns new rules from the expert, the interaction between the expert and
Disciple-COG evolves from a teacher–student interaction toward an interaction where
both collaborate in solving a problem. During this mixed-initiative problem-solving phase,
Disciple-COG learns not only from the contributions of the expert, but also from its own
successful or unsuccessful problem-solving attempts, which lead to the refinement of the
learned rules.
As indicated in Figure 12.38, Disciple-COG applied Rule 4 to reduce the task, “Analyze
the strategic COG candidates for a member of European Axis 1943,” generating an example
that is covered by the plausible upper bound condition of the rule. This reduction was
accepted by the expert as correct. Therefore, Disciple-COG generalized the plausible lower
bound condition to cover it. For instance, European Axis 1943 is a multimember force, but it
is not an equal partners multistate alliance. It is a dominant partner multistate alliance
dominated by Germany 1943, as can be seen in Figure 10.22 (p. 323). As a consequence,
Disciple-COG automatically generalizes the plausible lower bound condition of the rule to
cover this example. The refined rule is shown in the left-hand side of Figure 12.40. This
refined rule then generates the task reduction from the bottom part of Figure 12.38.
Although this example is covered by the plausible lower bound condition of the rule, the
expert rejects the reduction as incorrect. This shows that the plausible lower bound
condition is not less general than the concept to be learned, and it would need to be
specialized.
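The bound checks involved in these refinement steps can be sketched with a toy is-a test; the hierarchy below is an illustrative fragment loosely based on Figure 10.22, and the lower bound is simplified to the single concept equal partners multistate alliance.

```python
# Toy generalization hierarchy (illustrative fragment).
ISA = {
    "Allied Forces 1943": {"equal partners multistate alliance"},
    "European Axis 1943": {"dominant partner multistate alliance"},
    "equal partners multistate alliance": {"multistate alliance"},
    "dominant partner multistate alliance": {"multistate alliance"},
    "multistate alliance": {"multimember force"},
    "multimember force": {"force"},
}

def is_a(term, concept):
    """True if term equals the concept or is a (transitive) descendant of it."""
    if term == concept:
        return True
    return any(is_a(parent, concept) for parent in ISA.get(term, ()))

lower_bound = "equal partners multistate alliance"   # simplified initial lower bound
upper_bound = "multimember force"                    # domain of 'has as member'

for force in ("Allied Forces 1943", "European Axis 1943"):
    print(force, is_a(force, lower_bound), is_a(force, upper_bound))
# Allied Forces 1943 satisfies both bounds (a routine application), while
# European Axis 1943 satisfies only the upper bound (an innovative application);
# accepting that reduction generalizes the lower bound, e.g., to multistate alliance.
```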
This rejection of the reduction proposed by Disciple-COG initiates an explanation
generation interaction during which the expert will have to help the agent understand
why the reduction step is incorrect. The explanation of this failure is that Finland 1943 has
only a minor military contribution to European Axis 1943 and cannot, therefore, provide the
center of gravity of this alliance. The actual failure explanation (expressed with the terms
from the object ontology) has the form:
Finland 1943 has as military contribution military contribution of Finland 1943
military contribution of Finland 1943 is minor military contribution
Based on this failure explanation, Disciple-COG generates a plausible version space for an
Except When condition and adds it to the rule, as indicated on the right-hand side of
Figure 12.40. In the future, this rule will apply only to situations where the main condition
is satisfied and the Except When condition is not satisfied.
Figure 12.40. The refined rule (left) and the rule further refined with an Except-When plausible version space condition (right); both reduce the task “Analyze the strategic COG candidates for a member of ?O1.”
Disciple-COG has been used as a decision-support assistant in courses or individual lectures at the U.S. Army War College,
Air War College, Joint Forces Staff College, U.S. Army Intelligence Center, and other
civilian, military, and intelligence institutions (Tecuci et al., 2002a; 2002b). In particular,
successive versions of Disciple-COG have been used for elective courses and have been
part of the U.S. Army War College curriculum continuously for a decade, starting in 2001.
The textbook Agent-Assisted Center of Gravity Analysis (Tecuci et al., 2008b) provides a
detailed presentation of this agent, the embodied theory for COG determination that is
consistent with the joint military doctrine, and the use of this agent for the education
of strategic leaders. It includes a CD with lecture notes and the last version of the agent
(see lac.gmu.edu/cog-book/).
Each year, after being used in one or two courses, Disciple-COG was evaluated by the
students. The following, for instance, describes the evaluation results obtained in one of
these courses taught at the U.S. Army War College.
Each military student used a copy of the trained Disciple-COG agent as an intelligent
assistant that helped him or her to develop a center of gravity analysis of a war scenario.
As illustrated in Section 12.4.2, each student interacted with the scenario elicitation
module that guided him or her to describe the relevant aspects of the analyzed scenario.
Then the student invoked the autonomous problem solver (which used the rules learned during the agent's training) to generate the center of gravity analysis of the specified scenario.
[Evaluation fragment (counts as extracted): the generated analyses were characterized as acceptable, 30; incorrect, 4; complete, 62; relatively complete, 43; significantly incomplete, 5; easy to understand, 71; relatively understandable, 36; difficult to understand, 3.]
[Histogram fragments: responses, on a scale from Strongly Agree to Strongly Disagree, to the statements "It was easy to use Disciple," "Disciple should be used in future versions of this course," and "A system like Disciple could be used in other USAWC courses."]
Figure 12.42. Global evaluation results from a COG class experiment at the U.S. Army War
College.
Figure 12.43. Global evaluation results from a COG class experiment at the Air War College.
12.5 Disciple-VPT
12.5.1 Introduction
While most of this book has focused on the development of agents for evidence-
based reasoning, the previous sections of this chapter have shown the generality of
the knowledge representation, reasoning, and learning methods of the Disciple
approach by describing other types of Disciple agents. This section takes a further
step in this direction by presenting a different type of Disciple architecture, called
Disciple-VPT (virtual planning team). Disciple-VPT (Tecuci et al., 2008c) consists of
virtual planning experts that can collaborate to develop plans of actions requiring
expertise from multiple domains. It also includes an extensible library of virtual
planning experts from different domains. Teams of such virtual experts can be rapidly
assembled from the library to generate complex plans of actions that require their
joint expertise. The basic component of the Disciple-VPT tool is the Disciple-VE
(virtual experts) learning agent shell that can be taught directly by a subject matter
expert how to plan, through planning examples and explanations, in a way that is
similar to how the expert would teach an apprentice. Copies of the Disciple-VE shell
can be used by experts in different domains to rapidly populate the library of virtual
experts of Disciple-VPT.
A virtual planning expert is defined as a knowledge-based agent that can rapidly
acquire planning expertise from a subject matter expert and can collaborate with other
virtual experts to develop plans that are beyond the capabilities of individual virtual
experts.
In this section, by planning, we mean finding a partially ordered set of elementary
actions that perform a complex task (Ghallab et al., 2004).
A representative application of Disciple-VPT is planning the response to emergency
situations, such as the following ones: a tanker truck leaking toxic substance near a
residential area; a propane truck explosion; a biohazard; an aircraft crash; a natural
disaster; or a terrorist attack (Tecuci et al., 2007d). The U.S. National Response Plan
(DHS, 2004) identifies fifteen primary emergency support functions performed by federal
agencies in emergency situations. Similarly, local and state agencies undertake these functions when responding to such emergencies, before or without any federal assistance being
provided. Each such function defines an expertise domain, such as emergency manage-
ment; police operations; fire department operations; hazardous materials handling; health
and emergency medical services; sheltering, public works, and facilities; and federal law
enforcement. In this case, the library of Disciple-VPT will include virtual experts corres-
ponding to these domains.
The next section presents the general architecture of Disciple-VPT and discusses the
different possible uses of this general and flexible tool. Section 12.5.3 describes a sample
scenario from the emergency response planning area, which is used to present the
features of Disciple-VPT. Section 12.5.4 presents the architecture of the Disciple-VE
learning agent shell, which is the basis of the capabilities of Disciple-VPT, including its
learning-oriented knowledge representation. Section 12.5.5 presents the hierarchical task
network (HTN) planning performed by the Disciple virtual experts. After that, Section
12.5.6 presents a modeling language and methodology developed to help a subject matter
expert explain to a Disciple-VE agent how to plan, by using the task reduction paradigm.
Section 12.5.7 discusses how a Disciple-VE agent can perform complex inferences as part
of a planning process. The next two sections, 12.5.8 and 12.5.9, present the teaching and
learning methods of Disciple-VE, first for inference tasks and then for planning tasks.
Section 12.5.10 presents the organization of the library of virtual experts of Disciple-VPT.
After that, Section 12.5.11 presents Disciple-VPT’s approach to multi-agent collaboration.
Section 12.5.12 discusses the development of two virtual experts, one for fire operations
and the other for emergency management. Section 12.5.13 presents some evaluation
results, and Section 12.5.14 summarizes our research contributions.
The user interacts with the VE Assistant to specify a situation and the profiles of several
human experts who may collaborate to plan the achievement of various goals in that
situation. Next, a team of virtual planning experts with similar profiles is automatically
assembled from the VE Library. This VE Team then simulates the planning performed by
the human experts, generating plans for achieving various goals in the given situation.
The goal of a system such as Disciple-VPT is to allow the development of collaborative
planners for a variety of applications by populating its library with corresponding virtual
experts.

[Figure 12.44 (fragment). The architecture of Disciple-VPT: the user interacts with the VE Assistant, a VE Team of Disciple-VE agents is assembled from the VE Library, and the team's agents draw on a distributed knowledge base of KBs.]

For instance, planning the response to emergency situations requires virtual
experts for emergency management, hazardous materials handling, federal law enforce-
ment, and so on. Other application areas, such as planning of military operations, require
a different set of virtual experts in the VE Library. Moreover, for a given type of task and
application area, different multidomain planning systems can be created by assembling
different teams of virtual experts.
There are many ways in which a fully functional Disciple-VPT system can be used for
training or actual planning assistance. For instance, in the context of emergency response
planning, it can be used to develop a wide range of training scenarios by guiding the user
to select between different scenario characteristics. Disciple-VPT could also be used to
assemble teams of virtual planning experts who can demonstrate and teach how people
should plan the response to various emergency situations. Another approach is to assem-
ble combined teams that include both people and virtual experts. The team members will
then collaborate in planning the response to the generated emergency scenario. In a
combined team, human responders can play certain emergency support functions by
themselves or can play these functions with the assistance of corresponding virtual
experts. During the training exercise, a responder who has a certain emergency support
function will learn how to perform that function from a corresponding virtual expert with
higher competence. The responder will also learn how to collaborate with the other
responders or virtual experts who perform complementary support functions.
The Disciple-VPT approach to expert problem solving significantly extends the applicability of classical expert systems (Buchanan and Wilkins, 1993; Durkin, 1994; Awad, 1996; Jackson, 1999; Awad and Ghaziri, 2004). Such an expert system is limited to a narrow expertise domain, and its performance decreases dramatically when it attempts to solve problems that have elements outside its domain of expertise. In contrast, a Disciple-VPT-type system can solve such problems efficiently by incorporating additional virtual experts. Because many expert tasks actually require collaboration with other experts, a Disciple-VPT-type system is better suited to solving real-world problems.
The next section introduces in more detail a scenario from the emergency response
planning area that informed the development of Disciple-VPT.
Workers at the Propane bulk storage facility in Gainsville, Virginia, have been
transferring propane from a train car to fill one of two 30,000 gallon bulk
storage tanks. A fire is discovered in the fill pipe at the bulk tank, and a large
fire is developing. The time is 15:12 on a Wednesday in the month of May. The
temperature is 72 degrees and there is a light breeze out of the west. The
roads are dry and traffic volume is moderate. The fire department is summoned
to the scene five minutes after the fire started. The facility is located in a
rapidly growing area 2,000 feet from an interstate highway and 200 feet from
two heavily traveled U.S. highways. New shopping centers have popped up in
the area, including grocery stores, large box building supply facilities, and large
box retail facilities. As always, these facilities are accompanied by fast
food restaurants and smaller retail stores. Residential concentrations include
approximately 2,400 residents. The Local Emergency Operations Plan has all
the required components, including public information, communications, and
sheltering. Shelters utilize schools managed by the Red Cross. The Virginia
Department of Transportation provides highway services.
Planning the appropriate response to this emergency situation requires the collaboration
of experts in fire department operations, emergency management, and police operations.
The generated plan will consist of hundreds of partially ordered actions.
One group of actions deals with the arrival of resources, such as fire units, emergency
management services units, police units, as well as individuals with different areas of
expertise (e.g., emergency manager, safety officer, highway supervisor, planning officer,
training logistics officer, public information officer).
Another group of actions deals with the establishment of the structure of the Incident
Command System (ICS) and the allocation of resources based on the evaluation of the
situation. The structure of the ICS follows the standard U.S. National Incident Manage-
ment System (FEMA, 2007). The National Incident Management System establishes
standard incident management processes, protocols, and procedures so that all local,
state, federal, and private-sector emergency responders can coordinate their responses,
share a common focus, and more effectively resolve events. Its main components are the
unified command, the command staff, and the general staff. The structure and organiza-
tion of these components depend on the current situation. For example, in the case of the
preceding scenario, the unified command includes representatives from the fire depart-
ment, police department, highway department, and propane company. The command
staff includes a safety officer, a public information officer, and a liaison officer. The general
staff includes an operation section, a planning section, a logistics section, and a finance
and administration section. Each of these sections is further structured and staffed.
Yet other groups of actions deal with the various activities performed by the compon-
ents of the Incident Command System. For instance, in the case of the preceding scenario,
the fire management group may perform the cooling of the propane tank with water. The
evacuation branch may evacuate the Gainsville hot zone. The emergency manager may
arrange for transportation, sheltering, and emergency announcements to support the
evacuation. The Gainsville perimeter control branch implements the perimeter control
for the Gainsville hot zone. The Gainsville traffic control branch implements the traffic
control to facilitate the evacuation of the Gainsville hot zone. The Gainsville command
establishes rapid intervention task forces to respond if the propane tank explodes.
One difficulty in generating such a plan, apart from the fact that it involves many
actions, is that the actions from the preceding groups are actually performed in parallel.
The goal of Disciple-VPT is to provide a capability for rapid and low-cost development of
virtual planning experts to be used in this type of multidomain collaborative planning.
Moreover, the plans generated by the system should be more comprehensive than those
produced by a collaborative team of humans and should be generated much faster and
cheaper than currently possible. The next section introduces the Disciple-VE learning
agent shell, which is the basis of Disciple-VPT.
Figure 12.45. Fragment of the object ontology from the emergency planning area.
The reasoning rules of Disciple-VE are expressed with the elements of the object
ontology. Reduction rules indicate how general planning or inference tasks can be reduced
to simpler tasks, actions, or solutions. Synthesis rules indicate how solutions of simpler tasks
can be combined into solutions of complex tasks, or how actions can be combined into
partially ordered plans for more complex tasks.
The next section introduces the type of hierarchical task network planning performed
by Disciple-VE and the associated elements that are represented in its knowledge base.
[Figure 12.47 (fragment). Abstract HTN planning example: guided by a question/answer pair, a task is reduced to Planning Task 2 (with Preconditions 2 and Goal 2), which transforms state S1 into S3, and Planning Task 3 (with Preconditions 3 and Goal 3), which transforms S3 into S4.]
Notice that there is no order relation between Planning Task 2 and Planning Task 3. This means
that these tasks may be performed in parallel or in any order. Stating that Planning
Task 3 is performed after Planning Task 2 would mean that any subtask or action of
Planning Task 3 has to be performed after all subtasks and actions of Planning Task 2.
Formulating such order relations between the tasks significantly increases the efficiency of
the planning process because it reduces the number of partial orders that it has to
consider. On the other hand, it also reduces the number of generated plans if the tasks
should not be ordered.
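The following minimal sketch illustrates this reasoning about order relations; the task names and the "after" constraints are illustrative assumptions, not Disciple-VE structures.

```python
# Sketch of reasoning about order relations between planning tasks; the task names
# and the 'after' constraints are illustrative assumptions, not Disciple-VE content.

from collections import defaultdict

def transitive_successors(after, task):
    """All tasks that must be performed after 'task', given the direct constraints."""
    seen, stack = set(), [task]
    while stack:
        for nxt in after.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def may_run_in_parallel(after, a, b):
    """Two tasks may be interleaved or executed in parallel iff neither is
    (transitively) constrained to follow the other."""
    return (b not in transitive_successors(after, a)
            and a not in transitive_successors(after, b))

after = defaultdict(set)                 # no declared order relations
print(may_run_in_parallel(after, "Planning Task 2", "Planning Task 3"))   # True
after["Planning Task 2"].add("Planning Task 3")   # 'Task 3 after Task 2'
print(may_run_in_parallel(after, "Planning Task 2", "Planning Task 3"))   # False
```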
Planning takes place in a given world state. A world state is represented by all the objects
present in the world together with their properties and relationships at a given moment of
time. For instance, the bottom part of Figure 12.45 shows a partial representation of a
world state where fire1, which is situated in fill pipe1, is impinging on propane tank1. As will
be discussed in more detail in Section 12.5.10, each world state is represented by Disciple-
VE as a temporary state knowledge base.
The states are changed by the performance of actions. Abstract representations of
actions are shown at the bottom of Figure 12.47. An action is characterized by name,
preconditions, delete effects, add effects, resources, and duration. An action can be per-
formed in a given world state Si if the action’s preconditions are satisfied in that state. The
action’s execution has a duration and requires the use of certain resources. The resources are
objects from the state Si that are uniquely used by this action during its execution. This
means that any other action that would need some of these resources cannot be executed in
parallel with it. As a result of the action’s execution, the state Si changes into the state Sj, as
specified by the action’s effects. The delete effects indicate what facts from the initial state Si
are no longer true in the final state Sj. The add effects indicate what new facts become true in
the final state Sj.
An action from the emergency planning area is shown in Figure 12.48. The action’s
preconditions, name, delete, and add effects are represented as natural language phrases
that contain instances, concepts, and constants from the agent’s ontology. The action’s
duration can be a constant, as in this example, or a function of the other instances from the
action’s description. Resources are represented as a list of instances from the ontology.
The starting time is computed by the planner.
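The following sketch captures this action representation and the state transition it induces; the dataclass and the concrete cooling action are illustrative assumptions rather than the Disciple-VE implementation.

```python
# Sketch of the action representation (name, preconditions, delete effects, add effects,
# resources, duration) and of the state transition it induces. The dataclass and the
# example action below are illustrative assumptions, not the Disciple-VE implementation.

from dataclasses import dataclass

@dataclass
class Action:
    name: str
    preconditions: frozenset        # facts that must hold in the current state Si
    delete_effects: frozenset       # facts that are no longer true in the final state Sj
    add_effects: frozenset          # new facts that become true in Sj
    resources: frozenset = frozenset()   # objects used exclusively during execution
    duration: float = 0.0                # in minutes; the start time is set by the planner

    def applicable(self, state, busy_resources=frozenset()):
        """Executable in Si if the preconditions hold and no needed resource is in use."""
        return self.preconditions <= state and not (self.resources & busy_resources)

    def apply(self, state):
        """Return the successor state Sj = (Si - delete effects) + add effects."""
        return (state - self.delete_effects) | self.add_effects

cool_tank = Action(
    name="fire engine company E504 cools propane tank1 with water",
    preconditions=frozenset({"fire engine company E504 arrived at the scene",
                             "fire1 is impinging on propane tank1"}),
    delete_effects=frozenset(),
    add_effects=frozenset({"propane tank1 is being cooled"}),
    resources=frozenset({"fire engine E504", "water hose E504"}),
    duration=5.0)

state = frozenset({"fire engine company E504 arrived at the scene",
                   "fire1 is impinging on propane tank1"})
if cool_tank.applicable(state):
    state = cool_tank.apply(state)
print("propane tank1 is being cooled" in state)   # True
```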
A goal is a representation of a partial world state. It specifies what facts should be true
in a world state so that the goal is achieved. As such, a goal may be achieved in several
world states.
A task is characterized by name, preconditions, and goal. A task is considered for
execution in a given world state if its preconditions are satisfied in that state. Successful
execution of the task leads to a new world state in which the task’s goal is achieved. Unlike
actions, tasks are not executed directly, but are first reduced to actions that are executed.
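Because a goal is a partial state description, checking whether it is achieved amounts to a simple subset test, as in the following sketch; the task and the facts are illustrative assumptions.

```python
# Sketch of the task representation: a goal is a partial state description and is
# therefore achieved in any world state that contains all of its facts. The task
# and the facts below are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Task:
    name: str
    preconditions: frozenset   # facts required to consider the task in a state
    goal: frozenset            # facts that must hold after the task is performed

    def considered_in(self, state):
        return self.preconditions <= state

    def achieved_in(self, state):
        return self.goal <= state      # subset test: the goal is a partial state

establish_ics = Task(
    name="Establish the incident command system for the Gainsville incident",
    preconditions=frozenset({"fire officer E504 arrived at the scene"}),
    goal=frozenset({"Gainsville ICS is created",
                    "fire officer E504 assumes ICS command"}))

state = frozenset({"fire officer E504 arrived at the scene",
                   "Gainsville ICS is created",
                   "fire officer E504 assumes ICS command",
                   "ALS unit M504 arrived at the scene"})
print(establish_ics.achieved_in(state))   # True
```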
Figure 12.49 shows a task reduction tree in the interface of the Reasoning Hierarchy
Browser of Disciple-VE. The initial task, “Respond to the Gainsville incident,” is reduced to
five subtasks. The second of these subtasks (which is outlined in the figure) is successively
reduced to simpler subtasks and actions.
Figure 12.50 shows the reduction of the initial task in the interface of the Reasoning
Step Editor, which displays more details about each task. As in the case of an action, the
task’s name, preconditions, and goal are represented as natural language phrases that
include instances and concepts from the agent’s ontology. Notice that none of the visible
“After” boxes is checked, which means that Sub-task (1), Sub-task (2), and Sub-task (3) are
not ordered.
The single most difficult agent training activity for the subject matter expert is to make
explicit how he or she solves problems by using the task reduction paradigm, an activity that
we call modeling an expert’s reasoning. To cope with this problem, we have developed an
intuitive modeling language, a set of modeling guidelines, and a set of modeling modules
that help the subject matter experts to express their reasoning (Bowman, 2002). However,
planning introduces additional complexities related to reasoning with different world
states and with new types of knowledge elements, such as preconditions, effects, and
goals. For these reasons, and to facilitate agent teaching by a subject matter expert, we
have extended both the modeling approach of Disciple and the classical HTN planning
paradigm (Ghallab et al., 2004), as discussed in the next section.
Figure 12.49. An example of a task reduction tree in the Reasoning Hierarchy Browser.
Figure 12.50. An example of a task reduction step in the Reasoning Step Editor.
When showing a planning example to the agent, the expert does not need to consider
all the previously discussed possible ordering relations, which would be very difficult in
the case of the complex problems addressed by Disciple-VE. Instead, the expert has to
consider one possible order, such as the one in which Planning Task 3 is performed after
Action 2, in state S3. This allows both the expert and the agent to have a precise
understanding of the state in which each task is performed, which is necessary in order
to check that its preconditions are satisfied. Thus, when specifying a decomposition of a
task into subtasks and/or actions, the expert has to describe the subtasks and actions in a
plausible order, even though they can also be performed in a different order.
The expert first specifies the reduction of Abstract Task 2 to Planning Task 2, a reduction performed in state S1. Precondition
2 represents the facts from the state S1 that are required in order to make this reduction.
After that, the expert continues with the reduction of Planning Task 2 to Action 1 and
Action 2. Thus Planning Task 2 is actually performed by executing Action 1 and Action 2,
which changes the world state from S1 to S3. At this point, the expert can specify the goal
achieved by Planning Task 2. This goal is an expression that depends on the effects of
Action 1 and Action 2, but is also unique for Task 2, which is now completely specified.
Next the expert can continue with planning for Abstract Task 3 in the state S3.
Guideline 12.4. Specify the goal of the current task to enable the
specification of the follow-on tasks
The goal of a task represents the result obtained if the task is successfully performed. The
main purpose of the goal is to identify those instances or facts that have been added by the
task’s component actions and are needed by its follow-on tasks or actions. Thus, specify
this goal to include these instances or facts.
To illustrate this guideline, let us consider the Sub-task (2) pane in Figure 12.50. Notice
that the two instances from the “Goal” part (“Gainsville command” and “Gainsville ICS”) are
used in the follow-on expressions of the reduction from Figure 12.50.
The Reasoning Hierarchy Browser, shown in Figure 12.49, and the Reasoning Step
Editor, shown in Figure 12.50, support the modeling process. The Reasoning Hierarchy
Browser provides operations to browse the planning tree under development, such as
expanding or collapsing it step by step or in its entirety. It also provides the expert with
macro editing operations, such as deleting an entire subtree or copying a subtree and
pasting it under a different task. Each reduction step of the planning tree is defined by
using the Reasoning Step Editor, which includes several editors for specifying the com-
ponents of a task reduction step. It has completion capabilities that allow easy identifica-
tion of the names from the object ontology. It also facilitates the viewing of the instances
and concepts from the expressions being edited by invoking various ontology viewers.
An important contribution of Disciple-VE is the ability to combine HTN planning with
inference, as described in the following section.
Inference: Gainsville command evaluates the situation created by the Gainsville incident.
Inference: Gainsville command determines the incident action plan for overpressure
situation with danger of BLEVE.
The first of these inference actions has as result “overpressure situation with danger of
BLEVE in propane tank1 caused by fire1.” BLEVE is the acronym for boiling liquid
expanding vapors explosion.
From the perspective of the planning process, an inference action simulates a complex
inference process by representing the result of that process as the add effect of the inference
action. An inference action is automatically reduced to an inference task. The inference
task is performed in a given world state to infer new facts about that state. These facts are
represented as the add effects of the corresponding inference action and added into the
world state in which the inference action is performed.
The inference process associated with an inference task is also performed by using the
task reduction paradigm, but it is much simpler than the planning process because all the
reductions take place in the same world state. An abstract example of an inference tree is
shown in Figure 12.51.
An inference task is performed by successively reducing it to simpler inference tasks, until
the tasks are simple enough to find their solutions. Then the solutions of the simplest tasks are
successively combined, from the bottom up, until the solution of the initial task is obtained.
This task reduction and solution synthesis process is also guided by questions and
answers, similarly to the planning process (and as discussed in Section 4.2). Figure 12.52
shows the top part of the inference tree corresponding to the following inference task:
Inference: Gainsville command determines the incident action plan for overpressure
situation with danger of BLEVE.
This task is reduced to the following two subtasks:

Determine what can be done to prevent the overpressure situation with danger of
BLEVE to evolve in a BLEVE1.
Determine how to reduce the effects in case the overpressure situation with danger
of BLEVE does evolve in a BLEVE1.
The first of these subtasks is successively reduced to simpler and simpler subtasks, guided
by questions and answers, as shown in Figure 12.52.
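The reduce-and-synthesize pattern for inference tasks can be sketched as a simple recursion, all within one world state; the reduction table, the elementary solutions, and the omission of the guiding questions and answers are illustrative simplifications, not Disciple-VE content.

```python
# Sketch of inference by task reduction and solution synthesis within a single world
# state: a task is reduced until elementary tasks are reached, and their solutions are
# then composed bottom-up into a solution of the initial task. The reduction table and
# the elementary solutions are illustrative assumptions.

REDUCTIONS = {
    "determine the incident action plan": [
        "determine how to prevent a BLEVE",
        "determine how to reduce the effects of a BLEVE",
    ],
}

ELEMENTARY_SOLUTIONS = {
    "determine how to prevent a BLEVE": "cool propane tank1 with water",
    "determine how to reduce the effects of a BLEVE": "evacuate the Gainsville hot zone",
}

def solve(task):
    """Reduce the task to subtasks, solve them recursively, and synthesize their
    solutions into a solution of the initial task."""
    if task in ELEMENTARY_SOLUTIONS:                 # simple enough to solve directly
        return [ELEMENTARY_SOLUTIONS[task]]
    return [solution for subtask in REDUCTIONS[task] for solution in solve(subtask)]

print(solve("determine the incident action plan"))
# ['cool propane tank1 with water', 'evacuate the Gainsville hot zone']
```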
Notice that an inference tree no longer needs to use elements such as abstract tasks,
preconditions, actions, effects, resources, or duration. The teaching process is also much
simpler than in the case of the planning process. Therefore, we will first present how the
expert can teach Disciple-VE to perform inference tasks. Then we will present how the
teaching and learning methods for inference tasks have been extended to allow the expert
also to teach the agent how to perform planning tasks.
Figure 12.54. The rule learned from the example in Figure 12.53.
A task reduction example, such as the one in Figure 12.53, and its explanation are provided by the expert to
the Disciple-VE agent, which is able to generalize the task reduction example and its
explanation into a general rule by using the object ontology as a generalization lan-
guage. The learning method is the one in Table 9.2 and will be illustrated in the
following.
The first step of the learning method is Mixed-Initiative Understanding (explanation
generation). The question and its answer from the task reduction step represent the
expert’s reason (or explanation) for performing that reduction. Therefore, understanding
the example by Disciple-VE means understanding the meaning of the question/answer
pair in terms of the concepts and features from the agent’s ontology. This process is
difficult for a learning agent that does not have much knowledge because the experts
express themselves informally, using natural language and common sense, and often omit
essential details that they consider obvious. The question/answer pair from the example in
Figure 12.53 is:
IF the task is
    Determine whether we can prevent the ?O1 by extinguishing ?O2
THEN
    Determine whether we can prevent the ?O1 by extinguishing ?O2 when we cannot turn ?O3 off from ?O2
Figure 12.56. Specific inference rule covering only the initial example.
Applying a similar procedure to each variable value from the condition in Figure 12.56, one
obtains the plausible upper bound condition shown in Figure 12.57.
The plausible lower bound condition is obtained by replacing each variable value with
its minimal generalization that is not an instance, based on the object ontology; the
procedure is similar to the one for obtaining the plausible upper bound condition.
The reason the lower bound cannot contain any instance is that the learned rule will be
used by Disciple-VE in other scenarios where the instances from the current scenario
(such as fire1) do not exist, and Disciple-VE would not know how to generalize them. On
the other hand, we also do not claim that the concept to be learned is more general than
the lower bound.
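The computation of the two bounds can be sketched as a walk over the generalization hierarchy; the ontology fragment below is an illustrative assumption, and, unlike in Disciple, the maximal generalization here simply climbs to the top concept rather than being bounded by the domains and ranges of the features from the explanation.

```python
# Sketch of computing the plausible bounds from the specific condition by walking the
# generalization hierarchy. The ontology fragment is an illustrative assumption; in
# Disciple the maximal generalization is further bounded by the domains and ranges of
# the features in the explanation, whereas here it simply climbs to the top concept.

PARENT = {                       # direct "instance of" / "subconcept of" links
    "fire1": "fire",
    "fire": "emergency condition",
    "emergency condition": "object",
    "gas propane f1": "gas propane",
    "gas propane": "propane",
    "propane": "substance",
    "substance": "object",
}

def minimal_generalization(value):
    """Most specific generalization that is not an instance: the direct concept."""
    return PARENT[value]

def maximal_generalization(value, top="object"):
    """Most general generalization: climb the hierarchy up to the top concept."""
    while value != top and value in PARENT:
        value = PARENT[value]
    return value

specific_condition = {"?O2": "fire1", "?O3": "gas propane f1"}
lower_bound = {v: minimal_generalization(val) for v, val in specific_condition.items()}
upper_bound = {v: maximal_generalization(val) for v, val in specific_condition.items()}
print(lower_bound)   # {'?O2': 'fire', '?O3': 'gas propane'}
print(upper_bound)   # {'?O2': 'object', '?O3': 'object'}
```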
Notice that the features from the explanation of the example significantly limit the size
of the initial plausible version space condition and thus speed up the rule-learning
process. This is a type of explanation-based learning (DeJong and Mooney, 1986; Mitchell
et al., 1986), except that the knowledge base of Disciple-VE is incomplete and therefore
rule learning requires additional examples and interaction with the expert.

[Figure 12.57 (fragment). The specific condition, whose maximal and minimal generalizations give the plausible upper and lower bound conditions: ?O1 is BLEVE1; ?O2 is fire1, is fueled by ?O3, is situated in ?O5; ?O3 is gas propane f1; ?O4 is shutoff valve1, has as operating status ?S1; ?O5 is fill pipe1, has as part ?O4; ?S1 is damaged.]
After the rule was generated, Disciple-VE analyzes it to determine whether it was
learned from an incomplete explanation (Boicu et al., 2005). To illustrate, let us consider
again the process of understanding the meaning of the question/answer pair from
Figure 12.53, in terms of the concepts and features from the agent’s ontology. In the
preceding, we have assumed that this process has led to the uncovering of implicit
explanation pieces. However, this does not always happen. Therefore, let us now assume
that, instead of the more complete explanation pieces considered in the preceding, the
identified explanation pieces of the example are only those from Figure 12.58. In this case,
the learned rule is the one from Figure 12.59.
Figure 12.59. Rule learned from the example in Figure 12.53 and the explanation in
Figure 12.58.
The variables from the IF task of a rule are called input variables because they are
instantiated when the rule is invoked in problem solving. The other variables of the rule are
called output variables.
During the problem-solving process, the output variables are instantiated by the agent
with specific values that satisfy the rule’s applicability condition. In a well-formed rule, the
output variables need to be linked through explanation pieces to some of the input
variables of the rule. Therefore, one rule analysis method consists of determining whether
there is any output variable that is not constrained by the input variables. For instance, in
the case of the rule from Figure 12.59, Disciple-VE determined that the variables ?O3, ?O4,
and ?S1 are not constrained and asks the expert to guide it to identify additional explan-
ation pieces related to their corresponding values (i.e., gas propane f1, shutoff valve1,
and damaged).
If the rule passes the structural analysis test, Disciple-VE determines the number of its
instances in the current knowledge base and considers that the rule is incompletely
learned if this number is greater than a predefined threshold. In such a case, the agent
will attempt to identify which variables are the least constrained and will attempt to
constrain them further by interacting with the expert to find additional explanation pieces.
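This structural analysis amounts to a reachability check from the input variables to the output variables over the links contributed by the explanation pieces, as in the following sketch; the variables and links follow the rule in Figure 12.59 only loosely and are illustrative assumptions.

```python
# Sketch of the structural rule analysis: every output variable should be reachable
# from some input variable through the links contributed by the explanation pieces.
# The variables and links below are illustrative assumptions.

from collections import defaultdict

def unconstrained_outputs(input_vars, output_vars, links):
    """links is a set of (var_a, var_b) pairs, one per explanation piece that relates
    the values of two variables (e.g., ('?O2', '?O3') for 'is fueled by')."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    reachable, stack = set(input_vars), list(input_vars)
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in reachable:
                reachable.add(nxt)
                stack.append(nxt)
    return set(output_vars) - reachable

print(unconstrained_outputs(
    input_vars={"?O1", "?O2"},
    output_vars={"?O3", "?O4", "?O5", "?S1"},
    links={("?O1", "?O2"), ("?O2", "?O5")}))
# {'?O3', '?O4', '?S1'} (in some order): ask the expert for more explanation pieces
```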
Following such a process, Disciple-VE succeeds in learning a reasonably good rule from
only one example and its explanation, a rule that may be used by Disciple-VE in the planning
process. The plausible upper bound condition of the rule allows it to apply to situations that
are analogous to the one from which the rule was learned. If the expert judges this
application to be correct, then this represents a new positive example of the rule, and the
plausible lower bound condition is generalized to cover it, as discussed in Section 10.1.3.1.
Otherwise, the agent will interact with the expert to find an explanation of why the
application is incorrect and will specialize the rule’s conditions appropriately, as discussed
in Section 10.1.4. Rule refinement could lead to a complex task reduction rule, with
Except-When conditions that should not be satisfied in order for the rule to be applicable.
The rules corresponding to a planning reduction step (reduction, concretion, goal mapping, and goal synthesis rules) are not learned all at once, but in the sequence indicated in Figure 12.62.
This sequence corresponds to the sequence of modeling operations for the subtree of
Planning Task 1, as discussed in Section 12.5.6.
First the expert asks himself or herself a question related to how to reduce Planning
Task 1. The answer guides the expert to reduce this task to two abstract tasks. From this
reduction, the agent learns a planning task reduction rule (see Figure 12.62a), by using the
method described in Section 12.5.9.4. Next the expert reduces Abstract Task 2 to Planning
Task 2, and the agent learns a task concretion rule (see Figure 12.62b) by using the method
described in Section 12.5.9.5. After that, the expert continues specifying the reduction tree
corresponding to Planning Task 2, and the agent learns rules from the specified planning
step, as indicated previously. During the development of this planning tree, the agent
may apply the preceding rules, if their conditions are satisfied, and may refine them based
on the expert’s feedback. After the entire subtree corresponding to Planning Task 2 is
developed, the agent can learn the goal mapping rule corresponding to Goal 2, as
described in Section 12.5.9.4. The learning of the concretion rule for Abstract Task
3 and of the goal mapping rule for Goal 3 is done as described previously. After that,
Disciple-VE learns the goal synthesis rule corresponding to Goal 1, as described in
Section 12.5.9.4.

[Figure 12.62 (fragments). The sequence in which the planning task reduction rule, the task concretion rules, the goal mapping rules, and the goal synthesis rule are learned from a planning reduction example.]
The preceding illustration corresponds to a reduction of a planning task into planning
subtasks. However, a planning task can also be reduced to elementary actions, as
illustrated at the bottom part of Figure 12.61. In this case, Disciple-VE will learn more
complex action concretion rules instead of task concretion rules, as discussed in
Section 12.5.9.6. In the following sections, we will present the aforementioned learning
methods.
GIVEN
A sequence of reduction and synthesis steps called SE that indicate how a specific planning task
is reduced to its immediate specific subtasks and/or actions and how its goal is synthesized from
their goals/effects.
A knowledge base that includes an ontology and a set of rules.
A subject matter expert who understands why the given planning steps are correct and may
answer the agent’s questions.
DETERMINE
A set of reduction, concretion, goal, and/or action rules called SR that share a common space of
variables, each rule being a generalization of an example step from SE.
An extended ontology (if needed for example understanding)
Let SE be a sequence of reduction and synthesis steps that indicate how a specific planning task T is
reduced to its immediate specific subtasks and/or actions, and how its goal is synthesized from
their goals/effects.
1. Initialize the set V of shared variables and their values in SE: V ← ∅
2. Learn a planning task reduction rule from the reduction of T to the abstract tasks ATi and
update the set V (by using the method described in Table 12.7 and Section 12.5.9.4).
3. For each abstract task ATi do
If ATi is reduced to a concrete Task Ti
Then 3.1. Learn a planning task concretion rule and update set V (using the
method from Section 12.5.9.5).
3.2. Develop the entire subtree of Ti (this may lead to the learning of new rules by
using the methods from Tables 9.2 and 12.6).
3.3. Learn the goal mapping rule for Ti (using the method from Section 12.5.9.4).
Else if ATi is reduced to an elementary action Ai
Then 3.1. Learn an action concretion rule and update the set V (using the method from
Section 12.5.9.6).
4. Learn the goal synthesis rule for T (by using the method described in Section 12.5.9.4).
As part of example understanding, Disciple-VE will interact with the expert to find the
following explanation pieces, which represent an approximation of the meaning of the
question/answer pair in the current world state:

Gainsville incident has as ICS Gainsville ICS
Gainsville ICS has as ICS unified command Gainsville command

Continuing with the steps from Table 12.7, Disciple-VE will learn the rule from the
left-hand side of the pane in Figure 12.63.
The final list of shared variables is shown in the right-hand side of this pane. The right-
hand side of the pane shows also the goal produced by the goal synthesis rule. This rule
generalizes the expression representing the goal associated with the IF task by replacing its
instances and constants with the corresponding variables from the list of shared variables.
Similarly, the goal mapping rule generalizes the goals of the THEN tasks.
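The generalization of a goal expression can be sketched as a substitution of the shared variables for their specific values; the goal text and the variable bindings below are illustrative assumptions, not the actual content of Figure 12.63.

```python
# Sketch of generalizing a goal expression by replacing its instances and constants
# with the corresponding shared variables, as the goal mapping and goal synthesis
# rules do. The goal text and the variable bindings are illustrative assumptions.

def generalize_goal(goal_text, shared_variables):
    """shared_variables maps each variable to its specific value in the example."""
    # Replace longer values first so that no value is clobbered as a substring of another.
    for var, value in sorted(shared_variables.items(), key=lambda kv: -len(kv[1])):
        goal_text = goal_text.replace(value, var)
    return goal_text

shared_variables = {
    "?O1": "Gainsville incident",
    "?O2": "Gainsville ICS",
    "?O3": "Gainsville command",
}
goal = ("Gainsville ICS, the incident command system for the Gainsville incident, "
        "is created and Gainsville command coordinates the response")
print(generalize_goal(goal, shared_variables))
# ?O2, the incident command system for the ?O1, is created and ?O3 coordinates the response
```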
Table 12.7 The Learning Method for a Correlated Planning Task Reduction Rule
Let E be a reduction of a specific planning task T to one or several abstract tasks ATi, reduction
taking place in state Sk, and let V be the set of shared variables and their values.
(1) Mixed-Initiative Understanding (Explanation Generation)
Determine the meaning of the question/answer pair from the example E, in the context of the
agent’s ontology from the state Sk, through mixed-initiative interaction with the subject matter
expert. This represents a formal explanation EX of why the example E is correct. During this
process, new objects and features may be elicited from the expert and added to the ontology. This
is done in order to better represent the meaning of the question/answer pair in terms of the
objects and features from the ontology.
(2) Example Reformulation
Generate a variable for each instance and each constant (i.e., number, string, or symbolic
probability) that appears in the example E and its explanation EX. Then use these variables to
create an instance I of the concept C representing the applicability condition of the rule R to be
learned. C is the concept to be learned as part of rule learning and refinement. Finally, reformulate
the example as a very specific IF-THEN rule with I as its applicability condition. The elements of the
rule are obtained by replacing each instance or constant from the example E with the
corresponding variable.
(3) Updating of Shared Variables and Values
Add to the set V the new variables and their values from the condition C.
(4) Analogy-based Generalizations
Generate the plausible upper bound condition of the rule R as the maximal generalization of I in
the context of the agent’s ontology.
Generate the plausible lower bound condition of the rule R as the minimal generalization of I that
does not contain any specific instance.
(5) Rule Analysis
If there is any variable from the THEN part of a rule that is not linked to some variable from the IF
part of the rule, or if the rule has too many instances in the knowledge base, then interact with the
expert to extend the explanation of the example and update the rule if new explanation pieces are
found. Otherwise, end the rule-learning process.
The Rule Analysis step takes the value of the set V into account to determine the unlinked
output variables. In particular, an output variable from the concrete task does not need to
be linked to input variables if it is part of the input value of V.
Figure 12.63. Planning reduction rule learned from the reduction in Figure 12.49.
Figure 12.65. The planning concretion rule learned from the example in Figure 12.64.
Figure 12.66. The correlated action concretion rule learned from the example in Figure 12.48.
[Figure 12.67 (fragment). The hierarchical organization of the VE Library: a top-level KB-0, intermediate KBs (KB-12 and KB-34), domain KBs (KB-D1 through KB-D4), and the KBs of the basic (B), intermediate (I), and advanced (A) virtual experts in each domain.]
Figure 12.68. The interface of the Object Browser and Object Viewer.
To allow the KBs from the hierarchy to be updated and extended separately, the
Disciple-VPT system maintains multiple versions for each KB. Let us assume that each
KB from Figure 12.67 has the version 1.0. Let us further assume that the management team
for KB-0 decides to make some changes to this KB that contains units of measure. For
instance, the team decides to include the metric units, to rename “gallon” as “US gallon,”
and to add “UK gallon.” As a result, the team creates version 2.0 of KB-0. However, the
other knowledge bases from the library (e.g., KB-12) still refer to version 1.0 of KB-0. The
management team for KB-12 is informed that a higher version of KB-0 is available. At this
point, the team can decide whether it wants to create a new version of KB-12 that inherits
knowledge from version 2.0 of KB-0. The KB update process uses the KB updating tool of
Disciple-VE. This tool creates version 2.0 of KB-12 by importing the knowledge from
version 1.0 of KB-12, in the context of version 2.0 of KB-0. Even though the version 2.0
of KB-12 has been created, Disciple-VPT still maintains KB-0 version 1.0 and KB-12
version 1.0, because these versions are used by KB-D1 version 1.0 and by other KBs from
the repository. The management team for KB-D1 may now decide whether it wants to
upgrade KB-D1 to the new versions of its upper-level KBs, and so on. Because of the
version system, each KB from the library maintains, in addition to its version, the versions
of the other KBs from which it inherits knowledge.
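The versioning scheme can be sketched as follows; the repository structure and the upgrade operation are illustrative assumptions that mirror the KB-0 / KB-12 / KB-D1 example in the text.

```python
# Sketch of the versioned KB repository: each KB version records the versions of the
# parent KBs it inherits from, and creating a new parent version leaves the children
# untouched until their management teams decide to upgrade. The data structures are
# illustrative assumptions.

versions = {
    ("KB-0", "1.0"):  {"parents": {}},
    ("KB-0", "2.0"):  {"parents": {}},               # metric units; "US gallon", "UK gallon"
    ("KB-12", "1.0"): {"parents": {"KB-0": "1.0"}},
    ("KB-D1", "1.0"): {"parents": {"KB-12": "1.0"}},
}

def upgrade(kb, old_version, new_version, parent, parent_version):
    """Create a new version of kb whose knowledge is imported in the context of a
    newer version of one of its parent KBs."""
    parents = dict(versions[(kb, old_version)]["parents"])
    parents[parent] = parent_version
    versions[(kb, new_version)] = {"parents": parents}

upgrade("KB-12", "1.0", "2.0", "KB-0", "2.0")
print(versions[("KB-12", "2.0")]["parents"])   # {'KB-0': '2.0'}
print(versions[("KB-D1", "1.0")]["parents"])   # {'KB-12': '1.0'}  (still the old chain)
```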
Another important knowledge management functionality offered by Disciple-VPT is
that of splitting a KB into two parts, a more general one and a more specific one. This
allows a KB developer first to build a large KB and then to split it and create a hierarchy
of KBs.
When a virtual expert is extracted from the VE Library and introduced into a VE Team
(see Figure 12.44), all the KBs from which it inherits knowledge are merged into a shared
KB in order to increase the performance of the agent. Let us consider the Intermediate
agent from the domain D3 (see Figure 12.67). In this case, KB-D3, KB-12, KB-34, and KB-0
are all merged into the Shared KB of this agent. As a consequence, the structure of the KBs
of this agent during planning is the one from Figure 12.69. Notice that, in addition to the
Shared KB, there are three other types of KBs, Domain KB, Scenario KB, and State KB, all
hierarchically organized. Domain KB is the KB of this Intermediate agent from the domain
D3, knowledge independent of any particular scenario. Each scenario is represented into a
different KB called Scenario KB. For example, there would be a Scenario KB for the
propane tank fire scenario described in Section 12.5.3, and a different Scenario KB for a
scenario involving red-fuming nitric acid spilling from a truck parked near a residential
area (Tecuci et al., 2008c). Moreover, under each scenario KB there is a hierarchy of State
KBs. KB-S1 represents the state obtained from SKB-M after the execution of an action
which had delete and/or add effects. As additional actions are simulated during planning,
their delete and add effects change the state of the world. KB-S11, KB-S12, and KB-S13 are
the states corresponding to three alternative actions. The entire description of the state
corresponding to KB-S11 is obtained by considering the delete and add effects in the states
KB-S11 and KB-S1, and the facts in the scenario SKB-M.
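The layered state representation can be sketched as a chain of state KBs, each recording only the delete and add effects of the action that produced it; the class and the facts below are illustrative assumptions.

```python
# Sketch of the layered state representation: a fact holds in a state if it was added
# on the path back to the scenario KB and not deleted more recently. The class and
# the facts are illustrative assumptions.

class StateKB:
    def __init__(self, parent=None, added=(), deleted=()):
        self.parent = parent            # the previous state KB, or None for the scenario KB
        self.added = set(added)         # add effects of the action that produced this state
        self.deleted = set(deleted)     # delete effects of that action

    def holds(self, fact):
        if fact in self.added:
            return True
        if fact in self.deleted:
            return False
        return self.parent is not None and self.parent.holds(fact)

skb_m = StateKB(added={"fire1 is situated in fill pipe1"})          # scenario facts (SKB-M)
kb_s1 = StateKB(skb_m, added={"Gainsville ICS is created"})
kb_s11 = StateKB(kb_s1, added={"propane tank1 is being cooled"},
                 deleted={"fire1 is situated in fill pipe1"})

print(kb_s11.holds("Gainsville ICS is created"))        # True  (added in KB-S1)
print(kb_s11.holds("fire1 is situated in fill pipe1"))  # False (deleted in KB-S11)
```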
[Figure 12.69 (fragment). The KBs of the agent during planning: the Shared KB (which merges KB-0 and the other inherited KBs), the Domain KB, the Scenario KBs, and the hierarchy of State KBs (e.g., KB-S1).]
[Figure fragment: overlapping expertise domains D1, D2, and D3, with task Ti lying in the intersection of D1 and D2 and task Tm lying only in D2.]
In this illustration, the planning task Tm belongs only to D2 and can be performed only by a
virtual expert from that domain. Ti is a task common to D1 and D2 and can, in principle, be
performed either by a virtual expert in D1 or by a virtual expert in D2. In general, a virtual
expert will cover only a part of a given expertise domain, depending on its level of
expertise. For instance, the virtual expert library illustrated in Figure 12.67 includes three
virtual experts from the domain D1, a basic one, an intermediate one, and an advanced
one, each covering an increasingly larger portion of the domain. Therefore, whether a
specific virtual expert from the domain D2 can generate a plan for Tm and the quality of the
generated plan depend on the expert’s level of expertise.
A virtual expert has partial knowledge about its ability to generate plans for a given task,
knowledge that is improved through learning. For instance, the virtual expert knows that it
may be able to generate plans for a given task instantiation because that task belongs to its
expertise domain, or because it was able to solve other instantiations of that task in the
past. Similarly, it knows when a task does not belong to its area of expertise. The virtual
experts, however, do not have predefined knowledge about the problem-solving capabil-
ities of the other experts from a VE Team or the VE Library. This is a very important feature
of Disciple-VPT that facilitates the addition of new agents to the library, or the improve-
ment of the existing agents, because this will not require taking into account the know-
ledge of the other agents.
The task reduction paradigm facilitates the development of plans by cooperating virtual
experts, where plans corresponding to different subtasks of a complex task may be
generated by different agents. This multi-agent planning process is driven by an auction
mechanism that may apply several strategies. For instance, the agents can compete for
solving the current task based on their prior knowledge about their ability to solve that
task. Alternatively, the agents may actually attempt to solve the task before they bid on it.
[Fragment of a plan generated by Disciple-VPT for the Gainsville incident (columns: Id, Action, Result, Resources, Start Time, Duration):

Id 1. Action: Suburbane Propane Company facility manager, a person, arrives at the scene of the Gainsville incident. Result: Add: Suburbane Propane Company facility manager arrived at the scene and is available to take required actions. Resources: Suburbane Propane Company facility manager. Start Time: 0.0 s. Duration: 1.0 min 0.0 s.

Id 2. Action: ALS unit M504, an ALS unit, arrives at the scene of the Gainsville incident. Result: Add: ALS unit M504 arrived at the scene and is available to take required actions. Resources: ALS unit M504, paramedic 504a, and paramedic 504b. Start Time: 0.0 s. Duration: 5.0 min 0.0 s.

Id 3. Action: fire engine company E504, a fire engine company, arrives at the scene of the Gainsville incident. Result: Add: fire engine company E504 arrived at the scene and is available to take required actions. Resources: fire engine driver E504, fire fighter E504b, fire engine company E504, fire engine E504, deluge nozzle E504, water hose E504, fire officer E504, and fire fighter E504a. Start Time: 0.0 s. Duration: 5.0 min 0.0 s.

…

Id 48. Action: fire officer E504 assumes the command of the incident command system for the Gainsville incident, as fire department representative in the ICS unified command. Result: Delete: fire officer E504 is no longer available. Add: Gainsville ICS, the incident command system for the Gainsville incident, is created and fire officer E504 assumes ICS command as fire department representative in the Gainsville command. Resources: fire officer E504. Start Time: 5.0 min 0.0 s. Duration: 15.0 s.

Id 49. Action: Gainsville command evaluates the situation created by the Gainsville incident. Result: Add: overpressure situation with danger of BLEVE in propane tank1 is caused by fire1. Resources: Gainsville command. Start Time: 5.0 min 15.0 s. Duration: 30.0 s.

…

Id 63. Action: fire engine company E525 establishes continuous water supply from fire hydrant1 for fire engine company E504. Result: Add: fire hydrant1 is assigned to fire engine company E504. Resources: fire fighter E525a, fire fighter E525b, fire officer E525, fire engine driver E525, and fire engine company E525. Start Time: 14.0 min 30.0 s. Duration: 7.0 min 0.0 s.

…]

[Evaluation questionnaire fragment: statements about the Generated Plan (e.g., "The objectives of the actions are clear from the generated plan.") and about the Usability of Disciple-VPT were rated on a scale from Strongly Agree (SA) to Strongly Disagree (SD).]
The development of Disciple-VPT has led to several research contributions. First, we have extended the Disciple approach to allow the development of complex
HTN planning agents that can be taught their planning knowledge, rather than having it
defined by a knowledge engineer. This is a new and very powerful capability that is not
present in other action planning systems (Tate, 1977; Allen et al., 1990; Nau et al., 2003;
Ghallab et al., 2004). This capability was made possible by several major developments
of the Disciple approach. For instance, we have significantly extended the knowledge
representation and management of a Disciple agent by introducing new types of know-
ledge that are characteristic of planning systems, such as planning tasks and actions
(with preconditions, effects, goal, duration, and resources) and new types of rules
(e.g., planning tasks reduction rules, concretion rules, action rules, goal synthesis rules).
We have introduced state knowledge bases and have developed the ability to manage
the evolution of the states in planning. We have developed a modeling language and
a set of guidelines that help subject matter experts express their planning process.
We have developed an integrated set of learning methods for planning, allowing the
agent to learn general planning knowledge starting from a single planning example
formulated by the expert.
A second result is the development of an integrated approach to planning and infer-
ence, both processes being based on the task reduction paradigm. This improves the
power of the planning systems that can now include complex inference trees. It also
improves the efficiency of the planning process because some of the planning operations
can be performed as part of a much more efficient inference process that does not require
a simulation of the change of the state of the world.
A third result is the development and implementation of the concept of library of virtual
experts. This required the development of methods for the management of a hierarchical
knowledge repository. The hierarchical organization of the knowledge bases of the virtual
experts also serves as a knowledge repository that speeds up the development of new
virtual experts that can reuse the knowledge bases from the upper levels of this hierarchy.
A fourth result is the development of the multidomain architecture of Disciple-VPT,
which extends the applicability of the expert systems to problems whose solutions require
knowledge of more than one domain.
A fifth result is the development of two basic virtual experts, a basic fire expert and a
basic emergency management expert, that can collaborate to develop plans of actions that
are beyond their individual capabilities.
Finally, a sixth result is the development of an approach and system that has high
potential for supporting a wide range of training and planning activities.
13 Design Principles for Cognitive Assistants
This book has presented an advanced approach to developing personal cognitive assist-
ants. Although the emphasis in this book has been on cognitive assistants for evidence-
based hypothesis analysis, the Disciple approach is also applicable to other types of tasks,
as was illustrated by the agents presented in Chapter 12. Moreover, the Disciple approach
illustrates the application of several design principles that are useful in the development of
cognitive assistants in general. In this chapter, we review these principles, which have
been illustrated throughout this book. Each of the following sections starts with the
formulation of a principle and continues with its illustration by referring back to previous
sections of the book.
Employ learning technology to simplify and automate the knowledge engineering process.
the rules in problem solving and the expert critiques the reasoning process which, in turn,
guides the agent in refining the rules.
Use a problem-solving paradigm that is both natural for the human user and appropriate for the
automated agent.
Use a problem-solving paradigm for the agent that facilitates both collaboration between users
assisted by their agents and the solving of problems requiring multidomain expertise.
Many existing or potential applications of cognitive assistants are cross-domain, requiring the
collaboration not only between a user and his or her assistant, but also among several users,
each with his or her own area of expertise, as illustrated by the emergency response planning
domain addressed by Disciple-VPT (see Section 12.5). The problem reduction strategy
employed by the Disciple agents can reduce a multidisciplinary problem to subproblems
that may be solved by different experts and their agents. Then, the domain-specific solutions
found by individual users may be combined to produce the solution of the multidisciplinary
problem. With such an approach, each agent supports its user, not only in problem solving,
but also in collaboration and sharing of information with the other users.
Knowledge base development is a very complex activity, and knowledge reuse can signifi-
cantly simplify it. Moreover, knowledge reuse facilitates the communication with other
agents that share the same knowledge.
Disciple agents facilitate the reuse of knowledge through two types of knowledge struc-
turing. First, the agent’s knowledge is structured into an ontology that defines the concepts
of the application domain and a set of problem-solving rules expressed with these concepts.
The ontology is the more general part, being applicable to an entire domain. Therefore,
when developing a new application, parts of the ontology can be reused from the previously
developed applications in that domain.
A second type of knowledge structuring is the organization of the knowledge repository
of the agent as a three-level hierarchy of knowledge bases, as discussed in Section 3.3.5
(see Figure 3.17, p. 104). The top of the knowledge repository is the Shared KB, which
contains general knowledge for evidence-based reasoning applicable in all the domains.
Under the Shared KB are Domain KBs, each corresponding to a different application
domain. Finally, under each Domain KB are Scenario KBs, each corresponding to a
different scenario. Therefore, when developing the KB for a new scenario, the agent reuses
the corresponding Domain KB and the Shared KB. Similarly, when developing a new
Domain KB, the agent reuses the Shared KB.
Use agent teaching and learning methods where the user helps the agent to learn and the agent
helps the user to teach it.
Learning the elements of the knowledge base is a very complex process. In the Disciple
approach, this process is simplified by a synergistic integration of teaching and learning, as
discussed in Chapters 9 and 10. In particular, the user helps the agent to learn by providing
representative examples of reasoning steps, as well as hints that guide the agent in
understanding these examples. But the agent also helps the user to teach it by presenting
attempted solutions to problems for the user to critique, as well as attempted explanations
of an example, from which the user can select the correct ones.
Use multistrategy learning methods that integrate complementary learning strategies in order to
take advantage of their strengths to compensate for each other’s weaknesses.
No single learning strategy is powerful enough to learn the complex reasoning rules
needed by the agent, but their synergistic combination is, as illustrated by the methods
employed by the Disciple agents. In particular, a Disciple agent employs learning from
explanations and analogy-based generalization to generate a plausible version space rule
from a single problem-solving example and its explanation (see Section 9.4, p. 258, and
Figure 9.9, p. 259). It then employs learning by analogy and experimentation to generate
additional examples of the learned rules that are to be critiqued by the user, and it
employs empirical induction from examples, as well as learning from failure explan-
ations, to refine the rule based on the new examples and explanations (see Section 10.1.2
and Figure 10.1, p. 295). This results in a powerful method that enables the agent to learn
very complex rules from only a few examples and their explanations, obtained in a
natural dialogue with the user.
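A plausible version space rule can be pictured, in much simplified form, as an applicability condition bracketed by a plausible lower bound and a plausible upper bound that the learning strategies gradually bring together. The Python sketch below is only suggestive (the names are invented and the generalization and specialization operations are supplied by the caller; Chapters 9 and 10 present the actual methods):

class PlausibleVersionSpaceRule:
    # The applicability condition is bracketed by a plausible lower bound
    # (specific) and a plausible upper bound (general) condition.
    def __init__(self, lower_bound, upper_bound, generalize, specialize):
        self.lower_bound = lower_bound   # minimal generalization of the initial example
        self.upper_bound = upper_bound   # analogy-based generalization of its explanation
        self._generalize = generalize    # e.g., climbs the generalization hierarchy
        self._specialize = specialize    # e.g., adds an except-when condition

    def refine_with_positive(self, example):
        # Empirical induction: generalize the lower bound just enough to cover the example.
        self.lower_bound = self._generalize(self.lower_bound, example)

    def refine_with_negative(self, example, failure_explanation):
        # Learning from a failure explanation: specialize the upper bound to exclude the example.
        self.upper_bound = self._specialize(self.upper_bound, example, failure_explanation)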
Due to the complexity of the real world and to its dynamic nature, an agent’s knowledge
elements will always be approximate representations of real-world entities. Therefore,
improving and even maintaining the utility of a cognitive assistant depends on its capacity
to adapt its knowledge continuously to better represent the application domain.
The rule-learning methods of the Disciple agents continually improve the rules based
on their failures and successes. But these improvements are done in the context of an
existing ontology, which may itself evolve. Therefore, when the ontology undergoes
significant changes, the previously learned rules need to be relearned. This process can
be done automatically, if each rule maintains minimal generalizations of the examples and
the explanations from which it was learned, as was discussed in Section 10.2.
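For instance (an illustrative sketch only, in which the hypothetical reinterpret operation stands for the mapping of the stored examples and explanations into the changed ontology), automatic relearning might look as follows:

def relearn_rules(rules, new_ontology, learn_rule):
    # Relearn each rule from the minimal generalizations of the examples and
    # explanations stored with it, reinterpreted in the changed ontology.
    relearned = []
    for rule in rules:
        examples = [new_ontology.reinterpret(e) for e in rule.stored_examples]
        explanations = [new_ontology.reinterpret(x) for x in rule.stored_explanations]
        relearned.append(learn_rule(examples, explanations))
    return relearned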
Use mixed-initiative methods where modeling, learning, and problem solving mutually support
each other to capture the expert’s tacit knowledge.
Figure 13.1 illustrates the synergistic integration of modeling, learning, and problem
solving as part of the reasoning of a Disciple agent. These activities mutually support each
other. For example, problem solving generates examples for learning to refine the rules,
and the refined rules lead to better problem solving.
Modeling provides learning with the initial expert example for learning a new rule. But
the previously learned rules or patterns also support the modeling process by suggesting
possible reductions of a problem or hypothesis, as discussed, for instance, in Section 3.3.2.
Finally, modeling advances the problem-solving process with the creative solutions
provided by the user. But problem solving also supports the modeling process by provid-
ing the context for such creative solutions, thus facilitating their definition.
[Figure 13.1. Mixed-initiative reasoning: modeling provides the expert example to learning and the creative solution to problem solving; learning provides rule-based guidance to modeling and the refined rule to problem solving; problem solving provides the generated example to learning and the context for the creative solution to modeling.]
Use reasoning methods that enable the use of partially learned knowledge.
Employ approaches to user tutoring that allow the agent to teach its problem-solving paradigm
easily and rapidly to its users, facilitating their collaboration.
A cognitive assistant collaborates with its user in problem solving. This requires the user
not only to understand the agent’s reasoning but also to contribute to it. A subject matter
expert teaches a Disciple agent similarly to how the expert would teach a student, through
problem-solving examples and explanations. Then, when a user employs a Disciple agent,
he or she can easily learn from its explicit reasoning, as illustrated by the Disciple-COG
agent discussed in Section 12.4. Alternatively, the agent can behave as a tutoring system,
guiding the student through a series of lessons and exercises, as illustrated by TIACRITIS
(Tecuci et al., 2010b).
Personalized learning, which was identified as one of the fourteen Grand Challenges for
Engineering in the Twenty-first Century (NAE, 2008), is a very important application of
cognitive assistants.
Employ learning agent shells that allow rapid agent prototyping, development, and customization.
As discussed in Section 3.2.3, a learning agent shell enables rapid development of an agent
because all the reasoning modules already exist. Thus one only needs to customize some
of the modules and to develop the knowledge base. For example, the customizations
performed to develop Disciple-COG consisted of developing a report generator and a
simplified interface for the problem solver.
As discussed in Section 3.2.4, a learning agent shell for evidence-based reasoning further
speeds up the development of an agent. First, the agent shell is already customized for
evidence-based reasoning. Second, part of the knowledge base is already defined in the
shell, namely the Shared KB for evidence-based reasoning (EBR KB).
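The division of labor suggested by these two paragraphs can be sketched as follows (illustrative Python with invented class names, not the actual shell architecture): the reasoning and learning modules, and for an evidence-based reasoning shell also the Shared EBR KB, come with the shell, so building a new agent amounts to adding a few application-specific modules and developing the remaining knowledge bases.

class Module:
    # Stand-in for a reusable reasoning, learning, or interface module.
    def __init__(self, description):
        self.description = description

class LearningAgentShell:
    def __init__(self, knowledge_repository):
        self.kb = knowledge_repository       # may already include the Shared EBR KB
        self.modules = {"problem solver": Module("reduction/synthesis engine"),
                        "learner": Module("multistrategy rule learner"),
                        "ontology tools": Module("browser and editors")}

class CustomizedAgent(LearningAgentShell):
    # Building a new agent = customizing a few modules + developing the KB.
    def __init__(self, knowledge_repository):
        super().__init__(knowledge_repository)
        self.modules["report generator"] = Module("application-specific reports")
        self.modules["simplified solver interface"] = Module("end-user problem-solving view")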
Design the agent by taking into account its complete life cycle, to ensure its usefulness for as long a
period of time as possible.
This involves the incorporation of methods that support the various stages in the life cycle
of the agent. For example, Figure 13.2 illustrates a possible life cycle of a Disciple agent
that was discussed in Section 3.2.4.
The first stage is shell customization, where, based on the specification of the type of
problems to be solved and of the agent to be built, the developer and the knowledge engineer
may decide that certain customizations or extensions of the Disciple shell are necessary
or useful. The next stage is agent teaching by the subject matter expert and the knowledge
engineer, supported by the agent itself, which simplifies and speeds up the knowledge
base development process. Once an operational agent is developed, it is used for training
of end-users, possibly in a classroom environment. The next stage is field use, where copies
[Figure 13.2. A possible life cycle of a Disciple agent: end-users employ copies of the Disciple cognitive assistant in field use for reasoning assistance, learning, collaboration assistance, and information sharing, supported by search agents that access libraries, knowledge repositories, massive databases, and a global knowledge base.]
of the agent support users in their operational environments. At this stage, each agent
assists its user both in solving problems and in collaborating with other users and their
cognitive assistants. At the same time, the agent continuously learns patterns from this
problem-solving experience by employing a form of nondisruptive learning. However,
because there is no learning assistance from the user, the learned patterns will not include
a formal applicability condition. It is during the next stage of after action review and
learning, when the user and the agent analyze past problem-solving episodes, that the
formal applicability conditions are learned based on the accumulated examples. In time,
each cognitive assistant extends its knowledge with additional expertise acquired from its
user. This creates the opportunity to develop a more competent agent by integrating
the knowledge of all these agents. This can be accomplished by a knowledge engineer,
with assistance from a subject matter expert, in the next stage of knowledge integration and
optimization. The result is an improved agent that may be used in a new iteration of a
spiral process of development and use.
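The stages of this spiral can be summarized programmatically (an illustrative Python sketch; the stage names follow the description above, and the perform hook is a hypothetical entry point into the stage-specific activities):

from enum import Enum, auto

class LifeCycleStage(Enum):
    SHELL_CUSTOMIZATION = auto()
    AGENT_TEACHING = auto()
    TRAINING_OF_END_USERS = auto()
    FIELD_USE = auto()                            # nondisruptive learning of patterns
    AFTER_ACTION_REVIEW_AND_LEARNING = auto()     # formal applicability conditions learned
    KNOWLEDGE_INTEGRATION_AND_OPTIMIZATION = auto()

def spiral_development(agent, iterations=1):
    # One pass through the stages per iteration of the spiral of development and use.
    for _ in range(iterations):
        for stage in LifeCycleStage:
            agent.perform(stage)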
References
Allemang, D., and Hendler, J. (2011). Semantic Web for the Working Ontologist: Effective Modeling in
RDFS and OWL, Morgan Kaufmann, San Mateo, CA.
Allen, J., Hendler, J., and Tate, A., (eds.) (1990). Readings in Planning, Morgan Kaufmann, San
Mateo, CA.
Anderson, T., Schum, D., and Twining, W. (2005). Analysis of Evidence, Cambridge University Press,
Cambridge, UK.
Awad, E. M. (1996). Building Expert Systems: Principles, Procedures, and Applications, West, New
York, NY.
Awad, E. M., and Ghaziri, H. M. (2004). Knowledge Management, Pearson Education International,
Prentice Hall, Upper Saddle River, NJ, pp. 60–65.
Basic Formal Ontology (BFO) (2012). Basic Formal Ontology. www.ifomis.org/bfo (accessed August
31, 2012).
Bentham, J. (1810). An Introductory View of the Rationale of the Law of Evidence for Use by Non-
lawyers as Well as Lawyers (VI Works 1–187, Bowring edition, 1837–43; originally edited by James
Mill circa 1810).
Boicu, C. (2006). An Integrated Approach to Rule Refinement for Instructable Knowledge-Based Agents.
PhD Thesis in Computer Science, Learning Agents Center, Volgenau School of Information
Technology and Engineering, George Mason University, Fairfax, VA.
Boicu, C., Tecuci, G., and Boicu, M. (2005). Improving Agent Learning through Rule Analysis, in
Proceedings of the International Conference on Artificial Intelligence, ICAI-05, Las Vegas, NV, June
27–30. lac.gmu.edu/publications/data/2005/ICAI3196Boicu.pdf (accessed April 12, 2016)
Boicu, M. (2002). Modeling and Learning with Incomplete Knowledge, PhD Dissertation in
Information Technology, Learning Agents Laboratory, School of Information Technology and
Engineering, George Mason University. lac.gmu.edu/publications/2002/BoicuM_PhD_Thesis.pdf
(accessed November 25, 2015)
Boicu, M., Tecuci, G., Bowman, M., Marcu, D., Lee, S. W., and Wright, K. (1999). A Problem-Oriented
Approach to Ontology Creation and Maintenance, in Proceedings of the Sixteenth National Confer-
ence on Artificial Intelligence Workshop on Ontology Management, July 18–19, Orlando, Florida,
AAAI Press, Menlo Park, CA. lac.gmu.edu/publications/data/1999/ontology-1999.pdf (accessed
November 25, 2015)
Boicu, M., Tecuci, G., Marcu, D., Bowman, M., Shyr, P., Ciucu, F., and Levcovici, C. (2000). Disciple-
COA: From Agent Programming to Agent Teaching, in Proceedings of the Seventeenth International
Conference on Machine Learning (ICML), Stanford, CA, Morgan Kaufmann, San Mateo, CA, lac.gmu
.edu/publications/data/2000/2000_il-final.pdf (accessed November 25, 2015)
Bowman, M. (2002). A Methodology for Modeling Expert Knowledge That Supports Teaching Based
Development of Agents, PhD Dissertation in Information Technology, George Mason University,
Fairfax, VA. lac.gmu.edu/publications/data/2002/Michael%20Bowman-Thesis.pdf (accessed
November 25, 2015)
Bresina, J. L., and Morris, P. H. (2007). Mixed-Initiative Planning in Space Mission Operations, AI
Magazine, vol. 28, no. 1, pp. 75–88.
Breuker, J., and Wielinga, B. (1989). Models of Expertise in Knowledge Acquisition, in Guida, G., and
Tasso, C. (eds.), Topics in Expert Systems Design, Methodologies, and Tools, North Holland,
Amsterdam, Netherlands, pp. 265–295.
Buchanan, B. G., and Feigenbaum, E. A. (1978). DENDRAL and META-DENDRAL: Their Applications
Dimensions, Artificial Intelligence, vol. 11, pp. 5–24.
Buchanan, B. G., and Shortliffe, E. H. (eds.) (1984). Rule-Based Expert Systems: The MYCIN Experi-
ments of the Stanford Heuristic Programming Project, Addison-Wesley, Reading, MA.
Buchanan, B. G., and Wilkins, D. C. (eds.) (1993). Readings in Knowledge Acquisition and Learning:
Automating the Construction and Improvement of Expert Systems, Morgan Kaufmann, San
Mateo, CA.
Buchanan, B. G., Barstow, D., Bechtal, R., Bennett, J., Clancey, W., Kulikowski, C., Mitchell, T., and
Waterman, D. A. (1983). Constructing an Expert System, in Hayes-Roth, F., Waterman, D.,
and Lenat, D. (eds.), Building Expert Systems, Addison-Wesley, Reading, MA, pp. 127–168.
Carbonell, J. G. (1983). Learning by Analogy: Formulating and Generalizing Plans from Past Experi-
ence, in Michalski, R. S., Carbonell, J. M., and Mitchell, T. M., Machine Learning: An Artificial
Intelligence Approach, Tioga, Wellsboro, PA, pp. 137–162.
Carbonell, J. G. (1986). Derivational Analogy: A Theory of Reconstructive Problem-Solving and
Expertise Acquisition, in Michalski, R. S., Carbonell, J. G., and Mitchell, T. M. (eds.), Machine
Learning: An Artificial Intelligence Approach, vol. 2, Morgan Kaufmann, San Mateo, CA,
pp. 371–392.
Chaudhri, V. K., Farquhar, A., Fikes, R., Karp, P. D., and Rice, J. P. (1998). OKBC: A Programmatic
Foundation for Knowledge Base Interoperability, in Proceedings of the Fifteenth National Confer-
ence on Artificial Intelligence (AAAI-98), AAAI Press, Menlo Park, CA, pp. 600–607.
Clancey, W. (1985). Heuristic Classification, AI Journal, vol. 27, pp. 289–350.
Clausewitz, C. von (1832 [1976]). On War, translated and edited by Howard, M., and Paret, P.,
Princeton University Press, Princeton, NJ.
Cohen, L. J. (1977). The Probable and the Provable, Clarendon Press, Oxford, UK.
Cohen, L. J. (1989). An Introduction to the Philosophy of Induction and Probability, Clarendon Press,
Oxford, UK.
Cohen, M. R., and Nagel, E. (1934). An Introduction to Logic and Scientific Method, Harcourt, Brace,
New York, NY, pp. 274–275.
Cohen, P., Schrag, R., Jones, E., Pease, A., Lin, A., Starr, B., Gunning, D., and Burke, M. (1998). The
DARPA High-Performance Knowledge Bases Project, AI Magazine, vol. 19, no. 4, pp. 25–49.
Cooper, T., and Wogrin, N. (1988). Rule-based Programming with OPS5, Morgan Kaufmann, San
Mateo, CA.
Cross, S. E., and Walker, E. (1994). DART: Applying Knowledge-based Planning and Scheduling to
Crisis Action Planning, in Zweben, M., and Fox, M. S. (eds.), Intelligent Scheduling, Morgan
Kaufmann, San Mateo, CA, pp. 711–729.
Cyc (2008). OpenCyc Just Got Better – Much Better! www.opencyc.org (accessed August 22, 2008).
Cyc (2016). The Cyc homepage, www.cyc.com (accessed February 3, 2016).
Dale, A. I. (2003). Most Honourable Remembrance: The Life and Work of Thomas Bayes, Springer-
Verlag, New York, NY.
David, F. N. (1962). Games, Gods, and Gambling, Griffin, London, UK.
David, P. A., and Foray, D. (2003). Economic Fundamentals of the Knowledge Society, Policy Futures
in Education. An e-Journal, vol. 1, no. 1, Special Issue: Education and the Knowledge Economy,
January, pp. 20–49.
Davies, T. R., and Russell, S. J. (1990). A Logical Approach to Reasoning by Analogy, in Shavlik, J.,
and Dietterich, T. (eds.), Readings in Machine Learning, Morgan Kaufmann, San Mateo, CA,
pp. 657–663.
DeJong, G., and Mooney, R. (1986). Explanation-based Learning: An Alternative View, Machine
Learning, vol. 1, pp. 145–176.
Department of Homeland Security (DHS) (2004). National Response Plan.
Desai, M. (2009). Persistent Stare Exploitation and Analysis System (PerSEAS), DARPA-BAA-09-55,
https://www.fbo.gov/index?s=opportunity&mode=form&id=eb5dd436ac371ce79d91c84ec4e91341
&tab=core&_cview=1 (accessed April 13, 2016)
DOLCE (2012). Laboratory for Applied Ontology, www.loa-cnr.it/DOLCE.html (accessed August 31,
2012)
Drucker, P. (1993). Post-Capitalist Society, HarperCollins, New York.
Durham, S. (2000). Product-Centered Approach to Information Fusion, AFOSR Forum on Information
Fusion, Arlington, VA, October 18–20.
Durkin, J. (1994). Expert Systems: Design and Development, Prentice Hall, Englewood Cliffs, NJ.
Dybala, T. (1996). Shared Expertise Model for Building Interactive Learning Agents, PhD Dissertation,
School of Information Technology and Engineering, George Mason University, Fairfax, VA.
lac.gmu.edu/publications/data/1996/Dybala-PhD-abs.pdf (accessed April 12, 2016)
Echevarria, A. J. (2003). Reining in the Center of Gravity Concept. Air & Space Power Journal, vol.
XVII, no. 2, pp. 87–96.
Eco, U., and Sebeok, T. (eds.) (1983). The Sign of Three: Dupin, Holmes, Peirce, Indiana University
Press, Bloomington, IN.
Eikmeier, D. C. (2006). Linking Ends, Ways and Means with Center of Gravity Analysis. Carlisle
Barracks, U.S. Army War College, Carlisle, PA.
Einstein, A. (1939). Letter from Albert Einstein to President Franklin D. Roosevelt: 08/02/1939. The
letter itself is in the Franklin D. Roosevelt Library in Hyde Park, NY. See the National Archives copy
in pdf form at media.nara.gov/Public_Vaults/00762_.pdf (accessed November 16, 2014).
EXPECT (2015). The EXPECT homepage, www.isi.edu/ikcap/expect/ (accessed May 25, 2015).
Farquhar, A., Fikes, R., and Rice, J. (1997). The Ontolingua Server: A Tool for Collaborative Ontology
Construction, International Journal of Human–Computer Studies, vol. 46, no. 6, pp. 707–727.
Federal Rules of Evidence (2009). 2009–2010 ed. West Publishing, St. Paul, MN.
Feigenbaum, E. A. (1982). Knowledge Engineering for the 1980s, Research Report, Stanford Univer-
sity, Stanford, CA.
Feigenbaum, E. A. (1993). Tiger in a Cage: The Applications of Knowledge-based Systems, Invited
Talk, AAAI-93 Proceedings, www.aaai.org/Papers/AAAI/1993/AAAI93-127.pdf. (accessed April 13,
2016)
Fellbaum, C. (ed.) (1998). WordNet: An Electronic Lexical Database, MIT Press, Cambridge, MA.
FEMA (Federal Emergency Management Agency) (2007). National Incident Management System.
www.fema.gov/national-incident-management-system (accessed April 13, 2016).
Ferrucci, D., Brown, E., Chu-Carroll, J., Fan, J., Gondek, D., Kalyanpur, A. A., Murdock, J. W., Nyberg,
E., Prager, J., Schlaefer, N., and Welty, C. (2010). Building Watson: An Overview of the DeepQA
Project, AI Magazine, vol. 31, no. 3, pp. 59–79.
Filip, F. G. (1989). Creativity and Decision Support System, Studies and Researches in Computer and
Informatics, vol. 1, no. 1, pp. 41–49.
Filip, F. G. (ed.) (2001). Informational Society–Knowledge Society, Expert, Bucharest.
FM 100–5. (1993). U.S. Army Field Manual 100–5, Operations, Headquarters, Department of the
Army, Washington, DC.
FOAF (2012). The Friend of a Friend (FOAF) project. www.foaf-project.org/ (accessed August 31,
2012)
Forbes (2013). www.forbes.com/profile/michael-chipman/ (accessed September 2, 2013)
Forbus, K. D., Gentner, D., and Law, K. (1994). MAC/FAC: A Model of Similarity-Based Retrieval,
Cognitive Science, vol. 19, pp. 141–205.
Friedman-Hill, E. (2003). Jess in Action, Manning, Shelter Island, NY.
Gammack, J. G. (1987). Different Techniques and Different Aspects on Declarative Knowledge, in
Kidd, A. L. (ed.), Knowledge Acquisition for Expert Systems: A Practical Handbook, Plenum Press,
New York, NY, and London, UK.
Gentner, D. (1983). Structure Mapping: A Theoretical Framework for Analogy, Cognitive Science,
vol. 7, pp. 155–170.
Geonames (2012). GeoNames Ontology–Geo Semantic Web. [Online] www.geonames.org/ontology/
documentation.html (accessed August 31, 2012)
GFO (2012). General Formal Ontology (GFO). [Online] www.onto-med.de/ontologies/gfo/ (accessed
August 31, 2012)
Ghallab, M., Nau, D., and Traverso, P. (2004). Automatic Planning: Theory and Practice, Morgan
Kaufmann, San Mateo, CA.
Giarratano, J., and Riley, G. (1994). Expert Systems: Principles and Programming, PWS, Boston, MA.
Gil, Y., and Paris, C. (1995). Towards Model-Independent Knowledge Acquisition, in Tecuci, G., and
Kodratoff, Y. (eds.), Machine Learning and Knowledge Acquisition: Integrated Approaches,
Academic Press, Boston, MA.
Giles, P. K., and Galvin, T. P. (1996). Center of Gravity: Determination, Analysis and Application.
Carlisle Barracks, U.S. Army War College, Carlisle, PA.
Goodman, D., and Keene, R. (1997). Man versus Machine: Kasparov versus Deep Blue, H3 Publica-
tions, Cambridge, MA.
Gruber, T. R. (1993). A Translation Approach to Portable Ontology Specification. Knowledge Acquisi-
tion, vol. 5, pp. 199–220.
Guizzardi, G., and Wagner, G. (2005a). Some Applications of a Unified Foundational Ontology in
Business, in Rosemann, M., and Green, P. (eds.), Ontologies and Business Systems Analysis, IDEA
Group, Hershey, PA.
Guizzardi, G., and Wagner, G. (2005b). Towards Ontological Foundations for Agent Modeling
Concepts Using UFO, in Agent-Oriented Information Systems (AOIS), selected revised papers of
the Sixth International Bi-Conference Workshop on Agent-Oriented Information Systems.
Springer-Verlag, Berlin and Heidelberg, Germany.
Hieb, M. R. (1996). Training Instructable Agents through Plausible Version Space Learning, PhD
Dissertation, School of Information Technology and Engineering, George Mason University, Fair-
fax, VA.
Hobbs, J. R., and Pan, F. (2004). An Ontology of Time for the Semantic Web, ACM Transactions on
Asian Language Information Processing (TALIP), vol. 3, no. 1 (special issue on temporal information
processing), pp. 66–85.
Horvitz, E. (1999). Principles of Mixed-Initiative User Interfaces, in Proceedings of CHI '99, ACM
SIGCHI Conference on Human Factors in Computing Systems, Pittsburgh, PA, May. ACM Press,
New York, NY. research.microsoft.com/~horvitz/uiact.htm (accessed April 13, 2016)
Humphreys, B. L., and Lindberg, D.A.B. (1993). The UMLS Project: Making the Conceptual Connec-
tion between Users and the Information They Need, Bulletin of the Medical Library Association,
vol. 81, no. 2, p. 170.
Jackson, P. (1999). Introduction to Expert Systems, Addison-Wesley, Essex, UK.
Jena (2012). Jena tutorial. jena.sourceforge.net/tutorial/index.html (accessed August 4, 2012).
JESS (2016). The rule engine for the JAVA platform, JESS webpage: www.jessrules.com/jess/down
load.shtml (accessed February 3, 2016)
Joint Chiefs of Staff (2008). Joint Operations, Joint Pub 3-0, U.S. Joint Chiefs of Staff, Washington, DC.
Jones, E. (1998). HPKB Year 1 End-to-End Battlespace Challenge Problem Specification, Alphatech,
Burlington, MA.
Kant, I. (1781). The Critique of Pure Reason, Project Gutenberg, www.gutenberg.org/ebooks/4280
(accessed, August 19, 2013)
Keeling, H. (1998). A Methodology for Building Verified and Validated Intelligent Educational
Agents – through an Integration of Machine Learning and Knowledge Acquisition, PhD Dissertation,
School of Information Technology and Engineering, George Mason University, Fairfax, VA.
Kent, S. (1994). Words of Estimated Probability, in Steury, D. P. (ed.), Sherman Kent and the Board of
National Estimates: Collected Essays, Center for the Study of Intelligence, CIA, Washington, DC.
Kim, J., and Gil, Y. (1999). Deriving Expectations to Guide Knowledge Base Creation, in Proceedings
of AAAI-99/IAAI-99, AAAI Press, Menlo Park, CA, pp. 235–241.
Kneale, W. (1949). Probability and Induction, Clarendon Press, Oxford, UK, pp. 30–37.
Kodratoff, Y., and Ganascia, J-G. (1986). Improving the Generalization Step in Learning, in Michalski,
R., Carbonell, J., and Mitchell, T. (eds.), Machine Learning: An Artificial Intelligence Approach,
vol. 2. Morgan Kaufmann, San Mateo, CA, pp. 215–244.
Kolmogorov, A. N. (1933 [1956]). Foundations of a Theory of Probability, 2nd English ed., Chelsea,
New York, NY, pp. 3–4.
Kolmogorov, A. N. (1969). The Theory of Probability, in Aleksandrov, A. D., Kolmogorov, A. N., and
Lavrentiev, M. A. (eds.), Mathematics: Its Content, Methods, and Meaning, vol. 2, MIT Press,
Cambridge, MA, pp. 231–264.
Langley, P. W. (2012). The Cognitive Systems Paradigm, Advances in Cognitive Systems, vol. 1, pp. 3–13.
Laplace, P. S. (1814). Théorie Analytique des Probabilités, 2nd édition, Paris, Ve. Courcier, archive.
org/details/thorieanalytiqu01laplgoog (accessed January 28, 2016)
Le, V. (2008). Abstraction of Reasoning for Problem Solving and Tutoring Assistants. PhD Dissertation
in Information Technology. Learning Agents Center, Volgenau School of IT&E, George Mason
University, Fairfax, VA.
Lempert, R. O., Gross, S. R., and Liebman, J. S. (2000). A Modern Approach to Evidence, 3rd ed., West
Publishing, St. Paul, MN, pp. 1146–1148.
Lenat, D. B. (1995). Cyc: A Large-scale Investment in Knowledge Infrastructure, Communications of
the ACM, vol. 38, no. 11, pp. 33–38.
Loom (1999). Retrospective on LOOM. www.isi.edu/isd/LOOM/papers/macgregor/Loom_Retrospec
tive.html (accessed August 4, 2012)
MacGregor, R. (1991). The Evolving Technology of Classification-Based Knowledge Representation
Systems, in Sowa, J. (ed.), Principles of Semantic Networks: Explorations in the Representations of
Knowledge, Morgan Kaufmann, San Francisco, CA, pp. 385–400.
Marcu, D. (2009). Learning of Mixed-Initiative Human-Computer Interaction Models, PhD Disserta-
tion in Computer Science. Learning Agents Center, Volgenau School of IT&E, George Mason
University, Fairfax, VA.
Marcus, S. (1988). SALT: A Knowledge-Acquisition Tool for Propose-and-Revise Systems, in Marcus,
S. (ed.), Automating Knowledge Acquisition for Expert Systems, Kluwer Academic, Norwell, MA,
pp. 81–123.
Masolo, C., Vieu, L., Bottazzi, E., Catenacci, C., Ferrario, R., Gangemi, A., and Guarino, N. (2004).
Social Roles and Their Descriptions, in Dubois, D., Welty, C., and Williams, M-A. (eds.), Principles
of Knowledge Representation and Reasoning: Proceedings of the Ninth International Conference
(KR2004), AAAI Press, Menlo Park, CA, pp. 267–277.
McDermott, J. (1982). R1: A Rule-Based Configurer of Computer Systems, Artificial Intelligence
Journal, vol. 19, no. 1, pp. 39–88.
Meckl, S., Tecuci, G., Boicu, M., and Marcu, D. (2015). Towards an Operational Semantic Theory of
Cyber Defense against Advanced Persistent Threats, in Laskey, K. B., Emmons, I., Costa, P. C. G.,
and Oltramari, A. (eds.), Proceedings of the Tenth International Conference on Semantic Technolo-
gies for Intelligence, Defense, and Security – STIDS 2015, pp. 58–65, Fairfax, VA, November 18–20.
lac.gmu.edu/publications/2015/APT-LAC.pdf (accessed January 12, 2016)
Michalski, R. S. (1986). Understanding the Nature of Learning: Issues and Research Directions, in
Michalski, R. S., Carbonell, J. G., and Mitchell T. (eds.), Machine Learning, vol. 2, Morgan
Kaufmann, Los Altos, CA, pp. 3–25.
Michalski, R. S., and Tecuci, G. (eds.) (1994). Machine Learning: A Multistrategy Approach, vol. IV,
Morgan Kaufmann, San Mateo, CA. store.elsevier.com/Machine-Learning/isbn-9781558602519/
(accessed May 29, 2015)
Minsky, M. (1986). The Society of Mind, Simon and Schuster, New York, NY.
Mitchell, T. M. (1978). Version Spaces: An Approach to Concept Learning. PhD Dissertation, Stanford
University, Stanford, CA.
Mitchell, T. M. (1997). Machine Learning. McGraw-Hill, New York, NY.
Mitchell, T. M., Keller, R. M., and Kedar-Cabelli, S. T. (1986). Explanation-Based Generalization:
A Unifying View, Machine Learning, vol. 1, pp. 47–80.
Murphy, P. (2003). Evidence, Proof, and Facts: A Book of Sources. Oxford University Press, Oxford, UK.
Musen, M. A. (1989). Automated Generation of Model-based Knowledge Acquisition Tools, Morgan
Kaufmann, San Francisco, CA.
NAE (National Academy of Engineering) (2008). Grand Challenges for Engineering. www.engineer
ingchallenges.org/cms/challenges.aspx (accessed April 13, 2016)
Nau, D., Au, T., Ilghami, O., Kuter, U., Murdock, J., Wu, D., and Yaman, F. (2003). SHOP2: An HTN
Planning System, Journal of Artificial Intelligence Research, vol. 20, pp. 379–404.
Negoita, C. V., and Ralescu, D. A. (1975). Applications of Fuzzy Sets to Systems Analysis, Wiley,
New York, NY.
Nilsson, N. J. (1971). Problem Solving Methods in Artificial Intelligence, McGraw-Hill, New York, NY.
Nonaka, I., and Krogh, G. (2009). Tacit Knowledge and Knowledge Conversion: Controversy and
Advancement in Organizational Knowledge Creation Theory, Organization Science, vol. 20, no. 3
Puppe, F. (1993). Problem Classes and Problem Solving Methods, in Systematic Introduction to
Expert Systems: Knowledge Representations and Problem Solving Methods, Springer Verlag, Berlin
and Heidelberg, Germany, pp. 87–112.
Ressler, J., Dean, M., and Kolas, D. (2010). Geospatial Ontology Trade Study, in Janssen, T., Ceuster,
W., and Obrst, L. (eds.), Ontologies and Semantic Technologies for Intelligence, IOS Press, Amster-
dam, Berlin, Tokyo, and Washington, DC, pp. 179–212.
Rooney, D., Hearn, G., and Ninan, A. (2005). Handbook on the Knowledge Economy, Edward Elgar,
Cheltenham, UK.
Russell, S. J., and Norvig, P. (2010). Artificial Intelligence: A Modern Approach, Prentice Hall, Upper
Saddle River, NJ, pp. 34–63.
Schneider, L. (2003). How to Build a Foundational Ontology – the Object-centered High-level
Reference Ontology OCHRE, in Proceedings of the 26th Annual German Conference on AI, KI
2003: Advances in Artificial Intelligence, Springer-Verlag, Heidelberg, Germany, pp. 120–134.
Schreiber, G., Akkermans, H., Anjewierden, A., de Hoog, R., Shadbolt, N., Van de Velde, W., and
Wielinga, B. (2000). Knowledge Engineering and Management: The Common KADS Methodology,
MIT Press, Cambridge, MA.
Schum, D. A. (1987). Evidence and Inference for the Intelligence Analyst (2 vols.), University Press of
America, Lanham, MD.
Schum, D. A. (1989). Knowledge, Probability, and Credibility, Journal of Behavioral Decision Making,
vol. 2, pp. 39–62.
Schum, D. A. (1991). Jonathan Cohen and Thomas Bayes on the Analysis of Chains of Reasoning, in
Eells, E., and Maruszewski, T. (eds.), Probability and Rationality: Studies on L. Jonathan Cohen’s
Philosophy of Science, Editions Rodopi, Amsterdam, Netherlands, pp. 99–145.
Schum, D. A. (1999). Marshaling Thoughts and Evidence during Fact Investigation, South Texas Law
Review, vol. 40, no. 2 (summer), pp. 401–454.
Schum, D. A. (1994 [2001a]). The Evidential Foundations of Probabilistic Reasoning, Northwestern
University Press, Evanston, IL.
Schum, D. A. (2001b). Species of Abductive Reasoning in Fact Investigation in Law, Cardozo Law
Review, vol. 22, nos. 5–6, pp. 1645–1681.
Schum, D. A. (2011). Classifying Forms and Combinations of Evidence: Necessary in a Science of
Evidence, in Dawid, P., Twining, W., and Vasilaki. M. (eds.), Evidence, Inference and Inquiry,
British Academy, Oxford University Press, Oxford, UK, pp. 11–36.
Schum, D. A., and Morris, J. (2007). Assessing the Competence and Credibility of Human Sources
of Evidence: Contributions from Law and Probability, Law, Probability and Risk, vol. 6,
pp. 247–274.
Schum, D. A., Tecuci, G., and Boicu, M. (2009). Analyzing Evidence and Its Chain of Custody:
A Mixed-Initiative Computational Approach, International Journal of Intelligence and Counter-
intelligence, vol. 22, pp. 298–319. lac.gmu.edu/publications/2009/Schum%20et%20al%20-%
20Chain%20of%20Custody.pdf (accessed April 13, 2016)
Shafer, G. (1976). A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ.
Shafer, G. (1988). Combining AI and OR, University of Kansas School of Business Working Paper
No. 195, April.
Simon, H. (1983). Why Should Machines Learn? in Michalski, R. S., Carbonell, J. G., and Mitchell,
T.M. (eds.), Machine Learning, vol. 1, Morgan Kaufmann, Los Altos, CA, pp. 25–38.
Simonite, T. (2013). Bill Gates: Software Assistants Could Help Solve Global Problems, MIT Technol-
ogy Review, July 16. www.technologyreview.com/news/517171/bill-gates-software-assistants-
could-help-solve-global-problems/ (accessed April 13, 2016)
Siri (2011). Apple’s Siri homepage. www.apple.com/ios/siri/ (accessed April 12, 2016)
Strange, J. (1996). Centers of Gravity & Critical Vulnerabilities: Building on the Clausewitzian Foundation
so That We Can All Speak the Same Language, Marine Corps University Foundation, Quantico, VA.
Strange, J., and Iron, R. (2004a). Understanding Centers of Gravity and Critical Vulnerabilities, Part 1:
What Clausewitz (Really) Meant by Center of Gravity. www.au.af.mil/au/awc/awcgate/usmc/cog1.pdf
(accessed May 25, 2015)
Strange, J., and Iron, R. (2004b). Understanding Centers of Gravity and Critical Vulnerabilities, Part 2:
The CG-CC-CR-CV Construct: A Useful Tool to Understand and Analyze the Relationship between