
Introduction to AI Module 3

BAI654D
Artificial Intelligence Module -III

MODULE III

SYMBOLIC REASONING UNDER UNCERTAINTY

In this chapter and the next, we explore and discuss techniques for solving problems with incomplete
and uncertain models.

What is Reasoning?

➢ Reasoning is the act of deriving a conclusion from certain premises using a given
methodology.
➢ Reasoning is a process of thinking, logically arguing and drawing inferences.

➢ UNCERTAINTY IN REASONING

o The world is an uncertain place; knowledge is often imperfect, which causes
uncertainty.
o Therefore, reasoning must be able to operate under uncertainty.
o Uncertainty is a major problem in knowledge elicitation, especially when the
expert's knowledge must be quantified in rules.
o Uncertainty may cause bad treatment in medicine or loss of money in business.
➢ INTRODUCTION TO NONMONOTONIC REASONING

➢ Non-Monotonic Logic/Reasoning: A non-monotonic logic is a formal logic whose
consequence relation is not monotonic.

➢ A logic is non-monotonic if the truth of a proposition may change when new information
(axioms) is added; i.e., non-monotonic logic allows a statement to be retracted (taken
back). It is also used to formalize plausible (believable) reasoning.
Example 1: Birds typically fly.
Tweety is a bird.
Tweety flies (most probably).
- The conclusion of a non-monotonic argument may not be correct.

✓ The techniques that can be used to reason effectively even when a complete, consistent, and
constant model of the world is not available are discussed here. One of the examples, which
we call the ABC Murder story, clearly illustrates many of the main issues these techniques must
deal with.
✓ Let Abbott, Babbitt, and Cabot be suspects in a murder case. Abbott has an alibi (explanation/
defense): the register of a respectable hotel in Albany. Babbitt also has an alibi, for his brother-
in-law testified that Babbitt was visiting him in Brooklyn at the time. Cabot pleads
alibi too, claiming to have been watching a ski meet in the Catskills, but we have only his word
for that. So we believe -
1. That Abbott did not commit the crime.
2. That Babbitt did not commit the crime.
3. That Abbott or Babbitt or Cabot did.
✓ But presently Cabot documents his alibi: he had the good luck to have been caught by
television on the sidelines at the ski meet. A new belief is thus thrust upon us:
4. That Cabot did not.
✓ Our beliefs (1) through (4) are inconsistent, so we must choose one for rejection. Which has the
weakest evidence? The basis for (1), the hotel register, is good, since it is a fine old hotel. The
basis for (2) is weaker, since Babbitt's brother-in-law might be lying. The basis for (3) is perhaps
twofold: that there is no sign of burglary (robbery) and that only Abbott, Babbitt, and Cabot
seem to have stood to gain from the murder apart from burglary. The exclusion of burglary
seems conclusive, but the other consideration does not; there could be some fourth beneficiary.
For (4), finally, the basis is conclusive: the evidence from television. Thus (2) and
(3) are the weak points. To resolve the inconsistency of (1) through (4) we should reject (2) or
(3), thus either incriminating Babbitt or widening our net for some new suspect.
✓ This story illustrates some of the problems posed by uncertain, fuzzy, and often changing
knowledge. A variety of logical frameworks and computational methods have been proposed
for handling such problems.

Two Approaches for Reasoning


➢ Nonmonotonic reasoning, in which the axioms and/or the rules of inference are
extended to make it possible to reason with incomplete information. These systems
preserve the property that, at any given moment, a statement is either believed to
be true, believed to be false, or not believed to be either.

➢ Statistical reasoning, in which the representation is extended to allow some kind
of numeric measure of certainty (not simply TRUE or FALSE) to be associated
with each statement.
▪ Conventional reasoning systems, such as first-order predicate logic, are designed to
work with information that has three important properties:

➢ It is complete with respect to the domain of interest. In other words, all the facts
that are necessary to solve a problem are present in the system or can be derived
from existing facts with rules of first-order logic.

➢ It is consistent.

➢ New facts can be added as they become available. If these new facts are
consistent with all the other facts that have already been asserted, then nothing
will ever be retracted (taken back) from the set of facts that are known to be true.
This property is called monotonicity. In other words, a logic is monotonic if the
truth of a proposition does not change when new information (axioms) is added.

Traditional logic such as FOPL is monotonic.


▪ If any of the above properties is not satisfied, conventional logic-based reasoning
systems become inadequate. Nonmonotonic reasoning systems are designed
to handle problems in which all of these properties may be missing.
In order to do this, we must address several key implementation issues, including the
following:

1. How can the knowledge base be extended to allow inferences to be made
on the basis of lack of knowledge as well as on the presence of it?
2. How can the knowledge base be updated properly when a new fact is added to
the system (or when an old one is removed)?
The usual solution to this problem is to keep track of proofs, which are often called
justifications.
3. How can knowledge be used to help resolve conflicts when there are several
inconsistent nonmonotonic inferences that could be drawn?
➢ LOGICS FOR NONMONOTONIC REASONING

Because monotonicity is fundamental to the definition of first-order predicate logic,
we are forced to find some alternative to support nonmonotonic reasoning. We
examine several, because no single formalism with all the desired properties has yet
emerged. In particular, we would like to find a formalism that does all of the following
things:
- Defines the set of possible worlds that could exist, given the facts that we do have.
- Provides a way to say that we prefer to believe in some models rather than others.
- Provides the basis for a practical implementation of this kind of reasoning.
- Corresponds to our intuitions about how this kind of reasoning works.

- Logics for NMR (Nonmonotonic Reasoning)

Default Reasoning
1. Non-monotonic Logic
2. Default Logic
3. Abduction
4. Inheritance
Minimalist Reasoning
1. The Closed World Assumption
2. Circumscription

- Models and Interpretations

Define an interpretation of a set of wff's as consisting of:
1. A domain (D)
2. A function (f) that assigns:
- to each predicate, a relation
- to each n-ary function, an operator that maps from D^n into D
- to each constant, an element of D

✓ Models, Wff's, and the Importance of Non-monotonic Reasoning

A model of a set of wff's is an interpretation that satisfies them.

• Default Reasoning
✓ This is a very common form of non-monotonic reasoning. Conclusions are
drawn based on what is most likely to be true. There are two logic-based
approaches to default reasoning: Non-monotonic Logic and Default Logic.
1. Nonmonotonic Logic
- Provides a basis for default reasoning.
- It has already been defined: the truth of a proposition may change when new
information (axioms) is added, and the logic may be built to allow the
statement to be retracted.
- Non-monotonic logic is predicate logic with one extension, the modal
operator M, which means "consistent with everything we know". The purpose of M is
to allow consistency; i.e., FOPL is augmented with a modal operator M that
can be read as "is consistent".
- Here the rules are wff's.
- A way to define consistency, in the style of PROLOG's negation as failure, is:
To show that fact P is consistent, we attempt to prove ¬P.
If we fail, we may say that P is consistent, since ¬P cannot be proved.

Example 1:
∀x, y : Related(x, y) ∧ M GetAlong(x, y) → WillDefend(x, y)

should be read as "For all x and y, if x and y are related and if the fact that x
gets along with y is consistent with everything else that is believed, then
conclude that x will defend y."

Example 2:
∀x : plays_instrument(x) ∧ M manage(x) → jazz_musician(x)

states that "For all x, if x plays an instrument and if the fact that x can manage is
consistent with all other knowledge, then we can conclude that x is a jazz musician."

2. Default Logic
- Default logic introduces a new inference rule:

A : B
C

where A is known as the prerequisite, B as the justification, and C as the
consequent. Read the above inference rule as: "If A is provable and it is
consistent to assume B, then conclude C." The rule says that given the
prerequisite, the consequent can be inferred, provided it is consistent with
the rest of the data.
- Example: The rule that "birds typically fly" would be represented as

bird(x) : flies(x)
flies(x)

which says "If x is a bird and the claim that x flies is consistent with what
we know, then infer that x flies."
Since all we know about Tweety is that Tweety is a bird, we therefore infer that
Tweety flies.
- These inferences are used as a basis for computing the possible set of
extensions to the knowledge base.
- Here, rules are not wff's.
- Applying Default Rules:
While applying default rules, it is necessary to check their justifications for
consistency, not only with the initial data, but also with the consequents of any
other default rules that may be applied. The application of one rule may thus
block the application of another. To solve this problem, the concept of a default
theory was extended.

- The idea behind non-monotonic reasoning is to reason with first-order
logic, and if an inference cannot be obtained, then use the set of default rules
available within the first-order formulation.
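
To make this concrete, here is a minimal sketch in Python of applying a default rule with a negation-as-failure consistency check; the same test also illustrates the M operator's "is consistent" reading above. The fact strings and the apply_default helper are hypothetical illustrations, not part of the formal theory.

# Minimal sketch of applying a default rule A : B / C.
# Facts are stored as strings; negation is marked with a "not " prefix.
def negate(fact):
    return fact[4:] if fact.startswith("not ") else "not " + fact

def apply_default(kb, prerequisite, justification, consequent):
    # Fire the rule only if the prerequisite is known and the
    # justification is consistent (its negation is not believed).
    if prerequisite in kb and negate(justification) not in kb:
        kb.add(consequent)

kb = {"bird(Tweety)"}
apply_default(kb, "bird(Tweety)", "flies(Tweety)", "flies(Tweety)")
print(kb)   # {'bird(Tweety)', 'flies(Tweety)'}

# Contrary information blocks the default:
kb2 = {"bird(Tweety)", "not flies(Tweety)"}
apply_default(kb2, "bird(Tweety)", "flies(Tweety)", "flies(Tweety)")
print(kb2)  # 'flies(Tweety)' is not added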

3. Abduction
- Abduction means systematic guessing: "inferring" an assumption from a
conclusion.
- Definition: Given the two wff's A → B and B, for any expressions A and B, if it
is consistent to assume A, do so.
- It refers to deriving conclusions by applying implications in reverse.
- For example, the formula

∀x : RainedOn(x) → Wet(x)

could be used "backwards" with a specific x: if Wet(Tree), then RainedOn(Tree).
This, however, would not be logically justified. We could say:

Wet(Tree) ∧ CONSISTENT(RainedOn(Tree)) → RainedOn(Tree)

We could also attach probabilities, for example like this:

Wet(Tree) → RainedOn(Tree) 70%
Wet(Tree) → MorningDewOn(Tree) 20%
Wet(Tree) → SprinkledOn(Tree) 10%

Example: Given ∀x : Measles(x) → Spots(x) and Spots(Jill), conclude Measles(Jill).

In many domains, abductive reasoning is particularly useful if some measure of
certainty is attached to the resulting expressions.
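
As a rough illustration, the sketch below ranks candidate causes for an observation by their attached certainties; the explanation table simply restates the assumed weights from the example above.

# Sketch: abductive "explanation" of an observation, ranked by weight.
explanations = {
    "Wet(Tree)": [
        ("RainedOn(Tree)", 0.70),
        ("MorningDewOn(Tree)", 0.20),
        ("SprinkledOn(Tree)", 0.10),
    ],
}

def abduce(observation):
    # Return candidate causes, most plausible first.
    return sorted(explanations.get(observation, []),
                  key=lambda pair: pair[1], reverse=True)

print(abduce("Wet(Tree)"))
# [('RainedOn(Tree)', 0.7), ('MorningDewOn(Tree)', 0.2), ('SprinkledOn(Tree)', 0.1)]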
4. Inheritance
- Consider the baseball knowledge base described in Chapter 4.
- The concept is: "An object inherits attribute values from all the classes of
which it is a member, unless doing so leads to a contradiction, in which case a
value from a more restricted class has precedence over a value from a broader
class."
- These logical ideas provide a basis for describing this idea more formally, and
we can write such inheritable knowledge as rules in Default Logic.
- We can write a rule to account for the inheritance of a default value for the height
of a baseball player as:

Baseball-Player(x) : height(x, 6-1)
height(x, 6-1)

- Assert Pitcher(Three-Finger-Brown). This lets us conclude that Three-Finger-Brown
is a baseball player, and the rule then allows us to conclude that his height is 6-1.
- If, on the other hand, we had asserted a conflicting value for Three-Finger-Brown's
height, and had an axiom like

∀x, y, z : height(x, y) ∧ height(x, z) → y = z


- which says that an individual is not allowed to have more than one height, then
we would not be able to apply the default rule. Thus an explicitly stated value
will block the inheritance of a default value, which is exactly what we want.
- But now, let's encode the default rule for the height of adult males in general:

Adult-Male(x) : height(x, 5-10)
height(x, 5-10)

- This rule does not work as intended. If we again assert Pitcher(Three-Finger-Brown),
then the resulting theory contains two extensions: in one, the first rule fires and
Brown's height is 6-1; in the other, this new rule applies and Brown's height is 5-10.
Neither of these extensions is preferred.
- We could rewrite the default rule for adult males in general as:

Adult-Male(x) : ¬Baseball-Player(x) ∧ ¬Midget(x) ∧ ¬Jockey(x) ∧ height(x, 5-10)
height(x, 5-10)

- A clearer approach is to say something like, "Adult males typically have a
height of 5-10 unless they are abnormal in some way." So we could write,
for example:

∀x : Adult-Male(x) ∧ ¬AB(x, aspect1) → height(x, 5-10)

∀x : Baseball-Player(x) → AB(x, aspect1)
∀x : Midget(x) → AB(x, aspect1)
∀x : Jockey(x) → AB(x, aspect1)

Then, if we add the single default rule:

 : ¬AB(x, y)
¬AB(x, y)

we get the desired result.
- This effectively blocks the application of the default knowledge about adult
males in the case that more specific information from the class of baseball
players is available, as the sketch below illustrates.
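
A minimal sketch of this most-specific-class-wins behavior, assuming hypothetical class names and a simple most-specific-first search (an illustration of the idea, not the default-logic formalism itself):

# Sketch: default inheritance where a more specific class overrides
# a more general one. Class defaults are searched most-specific-first.
CLASS_DEFAULTS = [
    ("Baseball-Player", {"height": "6-1"}),   # more restricted class first
    ("Adult-Male", {"height": "5-10"}),       # broader class last
]

def inherited_value(classes, attribute, explicit=None):
    # An explicitly stated value always blocks inherited defaults.
    if explicit is not None:
        return explicit
    for cls, defaults in CLASS_DEFAULTS:
        if cls in classes and attribute in defaults:
            return defaults[attribute]
    return None

brown = {"Adult-Male", "Baseball-Player"}   # Pitcher implies Baseball-Player
print(inherited_value(brown, "height"))                  # 6-1
print(inherited_value(brown, "height", explicit="6-4"))  # 6-4 blocks the default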

• Minimalist Reasoning

✓ The idea behind using minimal models as a basis for nonmonotonic reasoning
about the world is the following: "There are many fewer true statements than false
ones. If something is true and relevant, it makes sense to assume that it has been
entered into the knowledge base. Therefore, assume that the only true statements
are those that necessarily must be true to maintain the consistency of the
knowledge base."

1. The Closed World Assumption

- A simple kind of minimalist reasoning is the Closed World Assumption, or
CWA. The CWA says that the only objects that satisfy any predicate P are
those that must.
- The CWA is particularly powerful as a basis for reasoning with databases,
which are assumed to be complete with respect to the properties they
describe. For example, a personnel database can safely be assumed to list
all of the company's employees. If someone asks whether Smith works for
the company, we should reply "no" unless he is explicitly listed as an employee.
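
A minimal sketch of CWA-style query answering over a closed personnel table (the employee names are hypothetical):

# Sketch: Closed World Assumption over a personnel database.
# Anything not provable from the database is taken to be false.
EMPLOYEES = {"Jones", "Garcia", "Lee"}

def works_for_company(name):
    # Under the CWA, absence from the table means "no", not "unknown".
    return name in EMPLOYEES

print(works_for_company("Jones"))  # True
print(works_for_company("Smith"))  # False -- not listed, so assumed false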
- Some worlds are not closed, and the CWA can fail to produce an appropriate
answer for either of two reasons:
The first is that its assumptions are not always true in the world; some parts
of the world are not realistically "closable".
The second kind of problem arises from the fact that the CWA is a purely
syntactic reasoning process, so its results depend on the form of the
assertions that are provided.
- Let's look at two specific examples of this problem. Consider a
knowledge base that consists of just a single statement:

A(Joe) ∨ B(Joe)

The CWA allows us to conclude both ¬A(Joe) and ¬B(Joe), since neither
A nor B must necessarily be true of Joe. So the resulting extended
knowledge base

A(Joe) ∨ B(Joe)
¬A(Joe)
¬B(Joe)

is inconsistent.
- The problem is that we have assigned a special status to positive instances
of predicates, as opposed to negative ones. Specifically, the CWA forces
completion of the knowledge base by adding the negative assertion ¬P
whenever it is consistent to do so. But the assignment of a real-world
property to some predicate P and its complement to the negation of P may
be arbitrary. For example, suppose we define a predicate Single and create
the following knowledge base:

Single(John)
Single(Mary)

Then, if we ask about Jane, the CWA will yield the answer ¬Single(Jane).
But now suppose we had chosen instead to use the predicate Married rather than
Single. Then the corresponding knowledge base would be

Married(John)
Married(Mary)

If we now ask about Jane, the CWA will yield the result ¬Married(Jane). The two
knowledge bases describe the same world, yet they lead to contradictory
conclusions about Jane.

2. Circumscription
- Circumscription is a nonmonotonic logic used to formalize common sense
assumptions. Circumscription is a formalized rule of conjecture (guess) that
can be used along with the rules of inference of first-order logic.
- Circumscription involves formulating rules of thumb with "abnormality"
predicates and then restricting the extension of these predicates,
circumscribing them, so that they apply only to those things to which they
are currently known to apply.
- Example: Take the case of the bird Tweety.
The rule of thumb that "birds typically fly" is conditional. The predicate
"Abnormal" signifies abnormality with respect to flying ability.
Observe that the rule ∀x : (Bird(x) ∧ ¬Abnormal(x) → Flies(x)) does
not allow us to infer that "Tweety flies", since we do not know that Tweety
is not abnormal with respect to flying ability.
But if we add axioms which circumscribe the abnormality predicate to
those things currently known to be abnormal, and assert "Bird(Tweety)",
then the inference can be drawn. This inference is non-monotonic.

- Circumscription has two advantages over the CWA:
It operates on whole formulas, not individual predicates.
It allows some predicates to be marked as closed and others as open.

➢ IMPLEMENTATION ISSUES

✓ The issues and weaknesses related to implementing nonmonotonic
reasoning in problem solving are:
1. How to derive exactly those nonmonotonic conclusions that are relevant
to solving the problem at hand, while not wasting time on those that are not
necessary.
2. How to update our knowledge incrementally as problem solving progresses.
3. How to overcome the problem where more than one interpretation of the
known facts is qualified or approved by the available inference rules.
4. In general, the theories are not computationally effective; they are neither
decidable nor semi-decidable.

✓ Solutions are offered by dividing the reasoning process into two parts:
- One, a problem solver that uses whatever mechanism it happens to
have to draw conclusions as necessary, and
- Two, a truth maintenance system whose job is to maintain consistency
in the knowledge representation of a knowledge base.

✓ The search controls used are:
- Depth-first search
- Breadth-first search

➢ AUGMENTING A PROBLEM-SOLVER

✓ Problem-solving can be done using either forward or backward reasoning.
Problem-solving using uncertain knowledge is no exception. As a result, there are
two basic approaches to this kind of problem-solving.
- Reason forward from what is known: Treat nonmonotonically derivable
conclusions the same way monotonically derivable ones are handled.
Nonmonotonic reasoning systems that support this kind of reasoning allow
standard forward-chaining rules to be augmented with UNLESS clauses,
which introduce a basis for reasoning by default. Control is handled
in the same way that all other control decisions in the system are made.

- Reason backward to determine whether some expression P is true.
Nonmonotonic reasoning systems that support this kind of reasoning may do
either or both of the following two things:
Allow default clauses in backward rules, and resolve conflicts among defaults
using the same control strategy that is used for other kinds of reasoning.
Support a kind of debate in which an attempt is made to construct arguments
both in favor of P and opposed to it; then some additional knowledge is
applied to the arguments to determine which side has the stronger case.
✓ Let's look at backward reasoning first. We begin with the simple case of backward
reasoning in which we attempt to prove an expression P. Suppose that we have a
knowledge base that consists of the backward rules shown in the figure below.

Figure: Backward Rules Using UNLESS

✓ Assume the knowledge base uses the usual PROLOG-style control structure, in which
rules are matched top to bottom, left to right. Then, if we ask the question ?Suspect(x),
the program will first try Abbott and return Abbott as its answer. If we had also
included the facts

RegisteredHotel(Abbott, Albany)
FarAway(Albany)

then the program would have failed to conclude that Abbott was a suspect,
and it would instead have located Babbitt and then Cabot. A sketch of this
behavior follows the figures below.
✓ The figure below shows how the same knowledge could be represented as
forward rules.

Figure: Forward Rules Using UNLESS
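
A minimal sketch of the backward suspect rule with an UNLESS clause, assuming hypothetical predicate spellings, since the book's actual rule set is only summarized here:

# Sketch: backward chaining where a rule fires UNLESS an exception
# can be proved (negation as failure).
FACTS = {"Beneficiary(Abbott)", "Beneficiary(Babbitt)", "Beneficiary(Cabot)",
         "RegisteredHotel(Abbott, Albany)", "FarAway(Albany)"}

def provable(goal):
    return goal in FACTS

def alibi(person):
    # One of several possible alibi rules: a hotel register far away.
    # (Albany is hard-coded here purely to keep the sketch short.)
    return (provable(f"RegisteredHotel({person}, Albany)")
            and provable("FarAway(Albany)"))

def suspects():
    # Suspect(x) <- Beneficiary(x) UNLESS Alibi(x)
    people = ["Abbott", "Babbitt", "Cabot"]
    return [p for p in people if provable(f"Beneficiary({p})") and not alibi(p)]

print(suspects())  # ['Babbitt', 'Cabot'] -- Abbott's alibi blocks the rule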

➢ IMPLEMENTATION: DEPTH-FIRST SEARCH

• Dependency-Directed Backtracking

✓ Consider a depth-first approach to nonmonotonic reasoning. We need to know a fact F,
which can be derived by making some assumption A, which seems believable. So
we make assumption A, derive F, and then derive some additional facts G and H
from F. We later derive some other facts M and N, but they are completely
independent of A and F. Later, a new fact comes in that invalidates A. We need to
withdraw our proof of F, and our proofs of G and H, since they depended on F.
But what about M and N? They did not depend on F, so there is no logical need to
invalidate them. But if we use a conventional backtracking scheme, we must back
up past conclusions in the order in which we derived them, so we have to back up
past M and N, thus undoing them, in order to get back to F, G, H, and A. To get
around this problem, we need a slightly different notion of backtracking, one that
is based on logical dependencies rather than the chronological order in which
decisions were made. We call this new method dependency-directed
backtracking.

✓ As an example, suppose we want to build a program that solves a
fairly simple problem: finding a time at which three busy people can all attend a
meeting. One way to solve such a problem is first to make an assumption that the
meeting will be held on some particular day, say Wednesday, and add it to the database.
Then proceed to find a time, checking along the way for any inconsistencies in
people's schedules. If a conflict arises, the statement representing the assumption
must be discarded and replaced by another, hopefully non-contradictory, one.
This kind of solution can be handled by a straightforward tree search with chronological
backtracking. All assumptions, as well as the inferences drawn from them, are recorded at
the search node that created them. When a node is determined to represent a contradiction,
simply backtrack to the next node from which there remain unexplored paths. The
assumptions and their inferences will disappear automatically.

• Justification-Based Truth Maintenance Systems

✓ The idea of a Truth Maintenance System (TMS) is to provide the ability to do
dependency-directed backtracking and thereby to support nonmonotonic reasoning.
✓ A Truth Maintenance System is a critical part of a reasoning system. Its
purpose is to assure that inferences made by the reasoning system (RS) are valid.
✓ The RS provides the TMS with information about each inference it performs, and
in return the TMS provides the RS with information about the whole set of
inferences.
✓ The TMS maintains the consistency of a knowledge base as soon as new knowledge
is added. It considers only one state at a time, so it is not possible to manipulate
environments.
✓ Several implementations of the TMS have been proposed for non-monotonic
reasoning. The important ones are:
- Justification-Based Truth Maintenance Systems (JTMS)
- Logic-Based Truth Maintenance Systems (LTMS)
- Assumption-Based Truth Maintenance Systems (ATMS)

✓ The TMS maintains consistency in the knowledge representation of a knowledge base.
The functions of a TMS are to:
- Provide justifications for conclusions
When a problem-solving system gives an answer to a user's query, an
explanation of that answer is required.
Example: Advice to a stockbroker is supported by an explanation of
the reasons for that advice. This is constructed by the Inference Engine
(IE) by tracing the justification of the assertion.
- Recognize inconsistencies

The Inference Engine (IE) may tell the TMS that some sentences are
contradictory. Then the TMS may find that all those sentences are believed
true, and it reports to the IE that it can eliminate the inconsistencies by
determining the assumptions used and changing them appropriately.
Example: A statement that either Abbott, or Babbitt, or Cabot is guilty,
together with statements that Abbott is not guilty, Babbitt is not
guilty, and Cabot is not guilty, forms a contradiction.
- Support default reasoning

In the absence of any firm knowledge, in many situations we want to reason
from default assumptions.
Example: If "Tweety is a bird", then until told otherwise, assume that "Tweety
flies", and for justification use the fact that "Tweety is a bird" and the assumption
that "birds fly".
In the notation of Default Logic, we can state the rule that produced the suspect
assertion as:

Beneficiary(x) : ¬Alibi(x)
Suspect(x)

This corresponds to the backward rules using UNLESS. The assertion Suspect(Abbott)
has an associated TMS justification. Each justification consists of two parts: an IN-list
and an OUT-list. In the figure, the assertions on the IN-list are connected to the
justification by "+" links, those on the OUT-list by "-" links. The justification is
connected by an arrow to the assertion that it supports. In the justification shown,
there is exactly one assertion in each list: Beneficiary(Abbott) is in the IN-list and
Alibi(Abbott) is in the OUT-list. Such a justification says that Abbott should be a
suspect just when it is believed that he is a beneficiary and it is not believed that
he has an alibi.

More generally, assertions (usually called nodes) in a TMS dependency network are
believed when they have a valid justification. A justification is valid if every assertion
in the IN-list is believed and none of those in the OUT-list is.
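
A minimal sketch of this labeling rule, with nodes represented as plain strings:

# Sketch: validity of a JTMS justification with an IN-list and an OUT-list.
def justification_valid(in_list, out_list, believed):
    # Valid iff every IN-list node is believed and no OUT-list node is.
    return all(n in believed for n in in_list) and \
           not any(n in believed for n in out_list)

believed = {"Beneficiary(Abbott)"}
print(justification_valid(["Beneficiary(Abbott)"], ["Alibi(Abbott)"], believed))
# True -- so Suspect(Abbott) is labeled IN

believed.add("Alibi(Abbott)")
print(justification_valid(["Beneficiary(Abbott)"], ["Alibi(Abbott)"], believed))
# False -- the alibi came IN, so Suspect(Abbott) loses its support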
The other major task of a TMS is resolving contradictions. In a TMS, a
contradiction node does not represent a logical contradiction but rather a state of the
database explicitly declared to be undesirable.

At this point, we have described the key reasoning operations that are performed by a
JTMS:
- Consistent labeling
- Contradiction resolution

We have also described a set of important reasoning operations that a JTMS does not
perform, including:
- Applying rules to derive conclusions
- Creating justifications for the results of applying rules
- Choosing among alternative ways of resolving a contradiction
- Detecting contradictions

All of these operations must be performed by the problem-solving program that is using
the JTMS.
• Logic-Based Truth Maintenance Systems

✓ A logic-based truth maintenance system (LTMS) is very similar to a
JTMS. It differs in one important way. In a JTMS, the nodes in the
network are treated as atoms by the TMS, which assumes no
relationships among them except the ones that are explicitly stated in
the justifications. In particular, a JTMS has no problem simultaneously labeling
both P and ¬P IN. For example, we could have represented explicitly both Lies(B-I-L)
and ¬Lies(B-I-L) and labeled both of them IN. No contradiction will be
detected automatically.
✓ In an LTMS, on the other hand, a contradiction would be asserted
automatically in such a case. If we had constructed the ABC example in an
LTMS system, we would not have created an explicit contradiction
corresponding to the assertion that there was no suspect. Instead, we would
replace the contradiction node by one that asserted something like No-Suspect,
and then assert ¬No-Suspect. When No-Suspect came IN, it would cause a
contradiction to be asserted automatically.


• Assumption-Based Truth Maintenance Systems

✓ The Assumption-Based Truth Maintenance System (ATMS) is an alternative way
of implementing nonmonotonic reasoning. In both JTMS and LTMS systems, a
single line of reasoning is pursued at a time, and dependency-directed
backtracking occurs whenever it is necessary to change the system's assumptions.
✓ In an ATMS, alternative paths are maintained in parallel. Backtracking is
avoided at the expense of maintaining multiple contexts, each of which
corresponds to a set of consistent assumptions. As reasoning proceeds in an ATMS-
based system, the universe of consistent contexts is pruned as contradictions are
discovered. The remaining consistent contexts are used to label assertions, thus
indicating the contexts in which each assertion has a valid justification. Assertions
that do not have a valid justification in any consistent context can be pruned from
consideration by the problem solver. As the set of consistent contexts gets smaller,
so too does the set of assertions that can consistently be believed by the problem
solver. Essentially, an ATMS system works breadth-first, considering all possible
contexts at once, while both JTMS and LTMS systems operate depth-first.
✓ The ATMS, like the JTMS, is designed to be used in combination with a separate
problem solver. The problem solver's job is to:
- Create nodes that correspond to assertions
- Associate each node with one or more justifications, each of which
describes the reasoning chain that led to the node
- Inform the ATMS of inconsistent contexts

✓ The role of the ATMS is then to:
- Propagate inconsistencies, thus ruling out contexts that include subcontexts
that are known to be inconsistent
- Label each problem-solver node with the contexts in which it has a valid
justification. This is done by combining contexts that correspond to the
components of a justification. In particular, given a justification of the form

A1 ∧ A2 ∧ … ∧ An → C

assign as a context for the node corresponding to C the intersection of the contexts
corresponding to the nodes A1 through An. It is necessary to think of the set of
contexts that are defined by a set of assumptions as forming a lattice.
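
A minimal sketch of this label-combination step, representing each node's label as a set of context identifiers (a simplification; a real ATMS manipulates environments of assumptions):

# Sketch: label a consequent node with the intersection of the contexts
# in which all of its antecedents hold.
def label_consequent(antecedent_labels):
    # antecedent_labels: list of sets of context ids, one per antecedent
    contexts = antecedent_labels[0].copy()
    for label in antecedent_labels[1:]:
        contexts &= label
    return contexts

a1 = {"c1", "c2", "c3"}   # contexts where A1 has a valid justification
a2 = {"c2", "c3", "c4"}   # contexts where A2 has a valid justification
print(label_consequent([a1, a2]))  # {'c2', 'c3'} -- contexts where C holds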

STATISTICAL REASONING
So far, we have described several representation techniques that can be used to model
belief systems in which, at any given point, a particular fact is believed to be true,
believed to be false, or not considered one way or the other.

Let's consider two classes of such problems:

The first class contains problems in which there is genuine randomness in the world.
Playing card games such as bridge and blackjack is a good example of this class.
Although in these problems it is not possible to predict the world with certainty, some
knowledge about the likelihood of various outcomes is available, and we would like to be
able to exploit it.

The second class contains problems that could be modeled using the techniques we
described in the last chapter. In these problems, the relevant world is not random; it
behaves "normally" unless there is some kind of exception. Many commonsense tasks
fall into this category, as do many expert reasoning tasks such as medical diagnosis. For
problems like this, statistical measures may serve a very useful function as summaries of
the world. We explore several techniques that can be used to augment knowledge
representation techniques with statistical measures that describe levels of evidence and
belief.

➢ PROBABILITY AND BAYES' THEOREM

Bayes' theorem:
• Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian
reasoning; it determines the probability of an event given uncertain knowledge.
• In probability theory, it relates the conditional probabilities and marginal probabilities
of two random events.
• Bayes' theorem was named after the British mathematician Thomas Bayes.
Bayesian inference is an application of Bayes' theorem and is fundamental
to Bayesian statistics.
• It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).
• Bayes' theorem allows updating the probability prediction of an event by observing
new information about the real world.

Example: If the probability of cancer depends on a person's age, then by using Bayes'
theorem we can determine the probability of cancer more accurately with the help of age.

Bayes' theorem can be derived using the product rule and the conditional probability of
event A with known event B.
From the product rule we can write:
1. P(A ∧ B) = P(A|B) P(B)

Similarly, the probability of event B with known event A:

2. P(A ∧ B) = P(B|A) P(A)

Equating the right-hand sides of both equations, we get:

P(A|B) = P(B|A) P(A) / P(B) ........ (a)

The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the
basis of most modern AI systems for probabilistic inference.

It shows the simple relationship between joint and conditional probabilities. Here,

P(A|B) is known as the posterior, which we need to calculate; it is read as the
probability of hypothesis A given that evidence B has occurred.

P(B|A) is called the likelihood: assuming the hypothesis is true, the probability of the
evidence.

P(A) is called the prior probability: the probability of the hypothesis before considering
the evidence.

P(B) is called the marginal probability: the pure probability of the evidence.

In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai); hence Bayes'
rule can be written as:

P(Ai|B) = P(B|Ai) P(Ai) / Σk [P(B|Ak) P(Ak)]

where A1, A2, A3, ..., An is a set of mutually exclusive and exhaustive events.

Applying Bayes' rule:

Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and P(A).
This is very useful in cases where we have good probability estimates for these three terms
and want to determine the fourth one. Suppose we want to perceive the effect of some
unknown cause and want to compute that cause; then Bayes' rule becomes:

P(cause|effect) = P(effect|cause) P(cause) / P(effect)
Example-1:

Question: What is the probability that a patient has meningitis, given a stiff neck?

Given data:

A doctor is aware that the disease meningitis causes a patient to have a stiff neck 80% of the
time. He is also aware of some more facts, which are given as follows:

o The known probability that a patient has meningitis is 1/30,000.

o The known probability that a patient has a stiff neck is 2%.

Let a be the proposition that the patient has a stiff neck and b be the proposition that the
patient has meningitis. Then we can calculate the following:

P(a|b) = 0.8

P(b) = 1/30000

P(a) = 0.02

P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 = 1/750 ≈ 0.00133

Hence, we can assume that 1 patient out of 750 patients with a stiff neck has meningitis.

Example-2:

Question: From a standard deck of playing cards, a single card is drawn. The probability that the
card is a king is 4/52. Calculate the posterior probability P(King|Face), i.e., the probability that a
drawn face card is a king.

Solution:

P(King): probability that the card is a king = 4/52 = 1/13

P(Face): probability that the card is a face card = 12/52 = 3/13

P(Face|King): probability that the card is a face card given that it is a king = 1

Putting all values into Bayes' rule, we get:

P(King|Face) = P(Face|King) P(King) / P(Face) = 1 × (1/13) / (3/13) = 1/3
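
A minimal sketch of these two calculations:

# Sketch: Bayes' rule P(A|B) = P(B|A) * P(A) / P(B) applied to both examples.
def bayes(p_b_given_a, p_a, p_b):
    return p_b_given_a * p_a / p_b

# Example 1: meningitis given a stiff neck
print(bayes(0.8, 1 / 30000, 0.02))   # ~0.00133, i.e. about 1 in 750

# Example 2: king given a face card
print(bayes(1.0, 1 / 13, 3 / 13))    # ~0.3333, i.e. 1/3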

Applications of Bayes' theorem in artificial intelligence:

✓ It is used to calculate the next step of a robot when the already executed step is
given.
✓ Bayes' theorem is helpful in weather forecasting.
✓ It can solve the Monty Hall problem.
✓ Bayesian statistics provide an attractive basis for an uncertain reasoning system.
As a result, several mechanisms for exploiting its power while at the same time
making it tractable have been developed. In the rest of the discussion, we explore
three of these:
- Attaching certainty factors to rules
- Bayesian networks
- Dempster-Shafer theory

➢ CERTAINTY FACTORS AND RULE-BASED SYSTEMS

✓ Certainty factors provide a simple way of updating probabilities given new evidence.
✓ The basic idea is to attach certainty factors to rules and use these to calculate the
measure of belief in some hypothesis. So we might have a rule such as:

IF has-spots(X)
AND has-fever(X)
THEN has-measles(X) CF 0.5

✓ Certainty factors are related to conditional probabilities, but they are not the same. For
one thing, we allow certainty factors of less than zero to represent cases where some
evidence tends to deny some hypothesis. Rich and Knight discuss how certainty
factors consist of two components: a measure of belief and a measure of disbelief.
However, here we will assume that we only have positive evidence and equate certainty
factors with measures of belief.
✓ Suppose we have already concluded has-spots(fred) with certainty 0.3, and has-
fever(fred) with certainty 0.8. To work out the certainty of has-measles(fred) we
need to take account of both the certainties of the evidence and the certainty factor
attached to the rule. The certainty associated with the conjoined premise (has-
spots(fred) AND has-fever(fred)) is taken to be the minimum of the certainties
attached to each (i.e., min(0.3, 0.8) = 0.3). The certainty of the conclusion is the total
certainty of the premises multiplied by the certainty factor of the rule (i.e., 0.3 × 0.5 =
0.15).
✓ If we have another rule drawing the same conclusion (e.g., has-measles(X)), then we
need to update our certainties to reflect this additional evidence. To do this we
calculate the certainties using the individual rules (say CF1 and CF2), then combine
them to get a total certainty of CF1 + CF2 − CF1 × CF2. The result will be a certainty
greater than each individual certainty, but still less than 1.
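
A minimal sketch of this propagation and combination scheme (the second rule's CF of 0.4 is an assumed value for illustration):

# Sketch: MYCIN-style certainty factor propagation for positive evidence.
def rule_cf(premise_cfs, rule_cf_value):
    # Conjoined premise takes the minimum certainty of its parts;
    # the conclusion scales it by the rule's certainty factor.
    return min(premise_cfs) * rule_cf_value

def combine(cf1, cf2):
    # Two independent rules supporting the same conclusion.
    return cf1 + cf2 - cf1 * cf2

cf_a = rule_cf([0.3, 0.8], 0.5)   # 0.15, as in the worked example
cf_b = 0.4                        # assumed CF from a second, hypothetical rule
print(cf_a, combine(cf_a, cf_b))  # 0.15 and 0.49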
✓ A certainty factor CF[h,e] is defined in terms of two components:

1. MB[h,e] → a measure (between 0 and 1) of belief in hypothesis h given the
evidence e.
• MB measures the extent to which the evidence supports the hypothesis.
• It is zero if the evidence fails to support the hypothesis.
2. MD[h,e] → a measure (between 0 and 1) of disbelief in hypothesis h given the
evidence e.
• MD measures the extent to which the evidence supports the
negation of the hypothesis.
• It is zero if the evidence supports the hypothesis.

CF[h,e] = MB[h,e] − MD[h,e]
Example 2:
The approach that we discuss here was used in the MYCIN system, which
attempts to recommend appropriate therapies for patients with bacterial
infections. It interacts with the physician to acquire the clinical data it needs.
MYCIN is an example of an expert system, since it performs a task normally
done by a human expert. Here we concentrate on its use of probabilistic
reasoning.
✓ MYCIN represents most of its diagnostic knowledge as a set of rules. Each rule
has associated with it a certainty factor, which is a measure of the extent to which
the evidence described by the antecedent of the rule supports the conclusion
given in the rule's consequent. A typical MYCIN rule looks like:

If: 1. the stain of the organism is gram-positive, and
2. the morphology of the organism is coccus, and
3. the growth conformation of the organism is clumps (clusters),
then there is suggestive evidence (0.7) that the identity of the organism is
staphylococcus.

This is the form in which the rules are stated to the user.
➢ BAYESIAN NETWORKS

✓ Here we describe an alternative approach known as Bayesian networks. The main
idea is that to describe the real world, it is not necessary to use a huge joint
probability table in which we list the probabilities of all conceivable combinations
of events. Instead, we can use a more local representation in which we describe
clusters of events that interact.
✓ "A Bayesian network is a probabilistic graphical model which represents a set of
variables and their conditional dependencies using a directed acyclic graph."
✓ It is also called a Bayes network, belief network, decision network, or Bayesian
model.
✓ Bayesian networks are probabilistic, because these networks are built from a
probability distribution and also use probability theory for prediction and anomaly
detection.
✓ Real-world applications are probabilistic in nature, and to represent the relationships
between multiple events we need a Bayesian network. It can be used in
various tasks including prediction, anomaly detection, diagnostics, automated
insight, reasoning, time-series prediction, and decision making under
uncertainty.

A Bayesian network can be used for building models from data and expert opinions,
and it consists of two parts:
o A directed acyclic graph
o A table of conditional probabilities
The generalized form of a Bayesian network that represents and solves decision
problems under uncertain knowledge is known as an influence diagram.
A Bayesian network graph is made up of nodes and arcs (directed links), where:

o Each node corresponds to a random variable, and a variable can be continuous or discrete.

o Arcs (directed arrows) represent the causal relationships or conditional
probabilities between random variables. These directed links connect
pairs of nodes in the graph. A link means that one node directly influences
the other node; if there is no directed link, the nodes are independent of
each other.
o In the example diagram (not reproduced here), A, B, C, and D are random
variables represented by the nodes of the network graph.

o If node B is connected to node A by a directed arrow from A to B, then node A
is called the parent of node B.

o A node such as C may be independent of node A.

Note: The Bayesian network graph does not contain any cycles. Hence, it is known
as a directed acyclic graph, or DAG.

The Bayesian network has mainly two components:

o Causal component
o Actual numbers

Each node in the Bayesian network has a conditional probability distribution
P(Xi | Parents(Xi)), which determines the effect of the parents on that node.
A Bayesian network is based on the joint probability distribution and conditional probability.
So let's first understand the joint probability distribution:
Joint probability distribution:

If we have variables x1, x2, x3, ..., xn, then the probability of a particular combination of
x1, x2, x3, ..., xn is given by the joint probability distribution. P[x1, x2, x3, ..., xn] can be
written in terms of conditional probabilities by the chain rule:

P[x1, x2, x3, ..., xn] = P[x1 | x2, x3, ..., xn] P[x2, x3, ..., xn]
                       = P[x1 | x2, x3, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn]

In general, for each variable Xi in a Bayesian network we can write:

P(Xi | Xi-1, ..., X1) = P(Xi | Parents(Xi))


Example 1:

✓ Suppose that there are two events which could cause the grass to be wet: either the sprinkler is
on, or it is raining. Suppose also that the rain has a direct effect on the use of the sprinkler
(namely, that when it rains, the sprinkler is usually not turned on).
✓ For this sprinkler-rain-grass example, we construct a directed acyclic graph (DAG) that
represents the causality relationships among the variables. The variables in such a graph may be
propositional (taking the values TRUE or FALSE) or they may be variables that take on values
of some other type, e.g., a specific disease, a body temperature, or a reading taken by some
diagnostic device.
✓ A Bayes net treats the above problem with a graphical model and a few tables. For this
example, one possible representation is:

Network structure: Rain → Sprinkler; Rain → Grass wet; Sprinkler → Grass wet.

P(Rain):
Rain = T: 0.2    Rain = F: 0.8

P(Sprinkler | Rain):
Rain = F:   Sprinkler = T: 0.4    Sprinkler = F: 0.6
Rain = T:   Sprinkler = T: 0.01   Sprinkler = F: 0.99

P(Grass wet | Sprinkler, Rain):
Sprinkler = F, Rain = F:   Grass wet = T: 0.0    Grass wet = F: 1.0
Sprinkler = F, Rain = T:   Grass wet = T: 0.8    Grass wet = F: 0.2
Sprinkler = T, Rain = F:   Grass wet = T: 0.9    Grass wet = F: 0.1
Sprinkler = T, Rain = T:   Grass wet = T: 0.99   Grass wet = F: 0.01
✓ A DAG illustrates the causality relationships that occur among the nodes it
contains. In order to use it as a basis for probabilistic reasoning, however, we need
more information. In particular, we need to know, for each value of a parent node,
what evidence is provided about the values that the child node can take on. We
state this in tables in which the conditional probabilities are provided, as shown
above. For example, from the tables we see that the prior probability of rain on a
given night is 0.2, and that if it rains, the sprinkler is usually not turned on
(probability 0.01), while without rain the probability is 0.4.
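
A minimal sketch of inference by enumeration over this network, using the conditional probability tables above:

# Sketch: P(Rain=T | GrassWet=T) by brute-force enumeration
# over the sprinkler network defined by the tables above.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {False: {True: 0.4, False: 0.6},    # key: Rain value
               True: {True: 0.01, False: 0.99}}
P_WET = {(False, False): 0.0, (False, True): 0.8,  # key: (Sprinkler, Rain)
         (True, False): 0.9, (True, True): 0.99}

def joint(rain, sprinkler, wet):
    p_wet = P_WET[(sprinkler, rain)]
    return P_RAIN[rain] * P_SPRINKLER[rain][sprinkler] * (p_wet if wet else 1 - p_wet)

num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(num / den)  # ~0.36: rain is the likelier explanation of wet grass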
➢ DEMPSTER-SHAFER THEORY
This theory was developed to address the following limitations:
✓ Bayesian theory is concerned only with single pieces of evidence.
✓ Bayesian probability cannot describe ignorance.
✓ DST is an evidence theory; it combines all possible outcomes of the problem. Hence
it is used to solve problems where there may be a chance that different evidence
will lead to a different result.
✓ DST is a mathematical theory of evidence based on belief functions and plausible
reasoning. It is used to combine separate pieces of information (evidence) to
calculate the probability of an event.
✓ DST offers an alternative to traditional probability theory for the mathematical
representation of uncertainty.
✓ DST can be regarded as a more general approach to representing uncertainty than the
Bayesian approach.
Dempster-Shafer Model

✓ The idea is to allocate a number between 0 and 1 to indicate a degree of
belief in a proposal, as in the probability framework.
✓ However, it is not considered a probability but a belief mass. The
distribution of masses is called the basic belief assignment.
✓ In other words, in this formalism a degree of belief (referred to as mass) is
represented as a belief function rather than a Bayesian probability
distribution.
Belief and Plausibility

➢ Shafer's framework allows belief about propositions to be represented
as intervals, bounded by two values, belief (or support) and plausibility:

belief ≤ plausibility

➢ Belief in a hypothesis is constituted by the sum of the masses of all
sets enclosed by it (i.e., the sum of the masses of all subsets of the hypothesis).
It is the amount of belief that directly supports a given hypothesis at least in
part, forming a lower bound.

➢ Plausibility is 1 minus the sum of the masses of all sets whose intersection with
the hypothesis is empty. It is an upper bound on the possibility that the
hypothesis could happen, because there is only so much evidence that
contradicts that hypothesis.
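
A minimal sketch of computing belief and plausibility from a basic belief assignment; the masses over the frame {a, b, c} are assumed for illustration:

# Sketch: belief and plausibility from a basic belief assignment.
# Masses are assigned to subsets of the frame of discernment {a, b, c}.
MASS = {
    frozenset({"a"}): 0.4,
    frozenset({"a", "b"}): 0.3,
    frozenset({"a", "b", "c"}): 0.3,   # mass on the whole frame = ignorance
}

def belief(hypothesis):
    # Sum of masses of all subsets of the hypothesis.
    return sum(m for s, m in MASS.items() if s <= hypothesis)

def plausibility(hypothesis):
    # Sum of masses of all sets that intersect the hypothesis
    # (equivalently, 1 minus the masses of sets disjoint from it).
    return sum(m for s, m in MASS.items() if s & hypothesis)

h = frozenset({"a", "b"})
print(belief(h), plausibility(h))  # 0.7 and 1.0, so belief <= plausibility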

FUZZY LOGIC
✓ Here we take a different approach and briefly consider what happens if we make fundamental
changes to our idea of set membership and corresponding changes to our definitions of logical
operations.
✓ The motivation for fuzzy sets is provided by the need to represent propositions such as:

John is very tall.
Mary is slightly ill.
Sue and Linda are close friends.
Exceptions to the rule are nearly impossible.
Most Frenchmen are not very tall.

✓ Fuzzy set theory allows us to represent set membership as a possibility
distribution, such as the ones shown in Figure (a) below for the set of tall people
and the set of very tall people. Notice how this contrasts with the standard
Boolean definition for tall people shown in Figure (b) below. In the latter, one is
either tall or not, and there must be a specific height that defines the boundary.
The same is true for very tall. In the former, one's tallness increases with one's
height until the value of 1 is reached.

Figure: Fuzzy versus Conventional Set Membership
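
A minimal sketch of such a membership function; the breakpoint heights (150 cm and 190 cm) are assumptions for illustration:

# Sketch: a fuzzy membership function for "tall", rising linearly
# from 0 at 150 cm to 1 at 190 cm (assumed breakpoints).
def tall(height_cm):
    if height_cm <= 150:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 150) / 40.0

def very_tall(height_cm):
    # A common convention: "very" squares the membership degree.
    return tall(height_cm) ** 2

for h in (150, 170, 185, 195):
    print(h, round(tall(h), 2), round(very_tall(h), 2))
# 150 0.0 0.0 / 170 0.5 0.25 / 185 0.88 0.77 / 195 1.0 1.0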

