Ai Full Notes

Artificial Intelligence (AI) is the science of creating intelligent machines that can think and act like humans, utilizing knowledge from various fields such as computer science and psychology. The goals of AI include creating expert systems, replicating human intelligence, and solving complex tasks, with applications in gaming, natural language processing, and robotics. Despite its advantages like high accuracy and speed, AI also faces challenges such as high costs, lack of emotional understanding, and dependency on technology.

Introduction to Artificial Intelligence

What is Artificial Intelligence?

According to the father of Artificial Intelligence, John McCarthy, it is “The science and engineering
of making intelligent machines, especially intelligent computer programs”.
Artificial Intelligence is a way of making a computer, a computer-controlled robot, or a piece of
software think intelligently, in a manner similar to the way intelligent humans think.
AI is accomplished by studying how the human brain thinks and how humans learn, decide, and work
while trying to solve a problem, and then using the outcomes of this study as a basis for developing
intelligent software and systems.

Philosophy of AI

While exploiting the power of computer systems, human curiosity led people to
wonder, "Can a machine think and behave the way humans do?"
Thus, the development of AI started with the intention of creating in machines the kind of intelligence
that we find and regard highly in humans.

Goals of AI

 To Create Expert Systems − Systems which exhibit intelligent behavior, learn,
demonstrate, explain, and advise their users.
 To Implement Human Intelligence in Machines − Creating systems that understand, think,
learn, and behave like humans.
 Replicate human intelligence.
 Solve knowledge-intensive tasks.
 Make an intelligent connection between perception and action.
 Build a machine which can perform tasks that require human intelligence, such as:
o Proving a theorem
o Playing chess
o Planning a surgical operation
o Driving a car in traffic
 Create a system which can exhibit intelligent behavior, learn new things by itself,
demonstrate, explain, and advise its user.
Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial
means "man-made" and Intelligence means "thinking power"; hence AI means "a man-made
thinking power."

So, we can define AI as:

"It is a branch of computer science by which we can create intelligent machines which can behave
like humans, think like humans, and are able to make decisions."

Artificial Intelligence exists when a machine has human-like skills such as learning, reasoning,
and problem solving.

With Artificial Intelligence you do not need to preprogram a machine to do some work; instead,
you can create a machine with programmed algorithms which can work with its own intelligence, and
that is the awesomeness of AI.

It is believed that AI is not a new idea; some people say that, as per Greek myth, there were
mechanical men in early days which could work and behave like humans.

Why Artificial Intelligence?

Before learning about Artificial Intelligence, we should know what the importance of AI is and
why we should learn it. Following are some main reasons to learn about AI:

o With the help of AI, you can create software or devices which can solve real-world
problems easily and accurately, in areas such as health, marketing, and traffic.
o With the help of AI, you can create a personal virtual assistant, such as Cortana, Google
Assistant, or Siri.
o With the help of AI, you can build robots which can work in environments where
human survival would be at risk.
o AI opens a path for other new technologies, new devices, and new opportunities.
What Contributes to AI?

Artificial intelligence is a science and technology based on disciplines such as Computer Science,
Biology, Psychology, Linguistics, Mathematics, and Engineering. A major thrust of AI is in the
development of computer functions associated with human intelligence, such as reasoning, learning,
and problem solving.
One or more of these areas can contribute to building an intelligent system.

Programming Without and With AI


Programming without and with AI differs in the following ways −

Programming Without AI:
 A computer program without AI can answer only the specific questions it is meant to solve.
 Modification in the program leads to a change in its structure.
 Modification is not quick and easy. It may affect the program adversely.

Programming With AI:
 A computer program with AI can answer the generic questions it is meant to solve.
 AI programs can absorb new modifications by putting highly independent pieces of information together. Hence you can modify even a minute piece of information in the program without affecting its structure.
 Program modification is quick and easy.
What is AI Technique?
In the real world, knowledge has some unwelcome properties −

 Its volume is huge, next to unimaginable.
 It is not well-organized or well-formatted.
 It keeps changing constantly.
An AI technique is a manner of organizing and using knowledge efficiently in such a way that −

 It should be perceivable by the people who provide it.
 It should be easily modifiable to correct errors.
 It should be useful in many situations even though it is incomplete or inaccurate.
AI techniques elevate the speed of execution of the complex programs they are equipped with.

Applications of AI

AI has been dominant in various fields such as −

 Gaming − AI plays a crucial role in strategic games such as chess, poker, and tic-tac-toe, where
the machine can think of a large number of possible positions based on heuristic knowledge.
 Natural Language Processing − It is possible to interact with a computer that understands
natural language spoken by humans.
 Expert Systems − Some applications integrate machine, software, and special
information to impart reasoning and advising. They provide explanations and advice to their
users.
 Vision Systems − These systems understand, interpret, and comprehend visual input on the
computer. For example,
o A spy plane takes photographs, which are used to figure out spatial information
or a map of the area.
o Doctors use clinical expert systems to diagnose patients.
o Police use computer software that can match the face of a criminal against stored
portraits made by forensic artists.
 Speech Recognition − Some intelligent systems are capable of hearing and comprehending
language in terms of sentences and their meanings while a human talks to them. They can handle
different accents, slang words, noise in the background, a change in a human's voice due to a cold,
etc.
 Handwriting Recognition − Handwriting recognition software reads text written on
paper with a pen or on a screen with a stylus. It can recognize the shapes of the letters and convert
them into editable text.
 Intelligent Robots − Robots are able to perform the tasks given by a human. They have
sensors to detect physical data from the real world such as light, heat, temperature, movement,
sound, bumps, and pressure. They have efficient processors, multiple sensors, and huge
memory to exhibit intelligence. In addition, they are capable of learning from their mistakes
and can adapt to a new environment.

History of AI
Here is the history of AI during 20th century −

Year Milestone / Innovation

1923 Karel Čapek's play "Rossum's Universal Robots" (RUR) opens in London; first use
of the word "robot" in English.

1943 Foundations for neural networks laid.

1945 Isaac Asimov, a Columbia University alumnus, coined the term Robotics.

1950 Alan Turing introduced the Turing Test for evaluating intelligence and published Computing
Machinery and Intelligence. Claude Shannon published a detailed analysis of chess
playing as search.

1956 John McCarthy coined the term Artificial Intelligence. Demonstration of the first running
AI program at Carnegie Mellon University.

1958 John McCarthy invented the LISP programming language for AI.

1964 Danny Bobrow's dissertation at MIT showed that computers can understand natural
language well enough to solve algebra word problems correctly.

1965 Joseph Weizenbaum at MIT built ELIZA, an interactive program that carries on a dialogue
in English.

1969 Scientists at Stanford Research Institute developed Shakey, a robot equipped with
locomotion, perception, and problem solving.

1973 The Assembly Robotics group at Edinburgh University built Freddy, the Famous Scottish
Robot, capable of using vision to locate and assemble models.

1979 The first computer-controlled autonomous vehicle, the Stanford Cart, was built.

1985 Harold Cohen created and demonstrated the drawing program Aaron.

1990 Major advances in all areas of AI −
 Significant demonstrations in machine learning
 Case-based reasoning
 Multi-agent planning
 Scheduling
 Data mining, Web Crawler
 Natural language understanding and translation
 Vision, Virtual Reality
 Games

1997 The Deep Blue chess program beat the then world chess champion, Garry Kasparov.

2000 Interactive robot pets became commercially available. MIT displayed Kismet, a robot with a
face that expresses emotions. The robot Nomad explored remote regions of Antarctica and
located meteorites.

Advantages of Artificial Intelligence

Following are some main advantages of Artificial Intelligence:

o High accuracy with fewer errors: AI machines or systems are less prone to errors and achieve
high accuracy, as they take decisions based on prior experience or information.
o High speed: AI systems can be very fast at decision making; because of this,
an AI system can beat a chess champion in the game of chess.
o High reliability: AI machines are highly reliable and can perform the same action multiple
times with high accuracy.
o Useful for risky areas: AI machines can be helpful in situations such as defusing a bomb or
exploring the ocean floor, where employing a human would be risky.
o Digital assistant: AI can be very useful as a digital assistant to users; for example, AI
technology is currently used by various e-commerce websites to show products matching
customer requirements.
o Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars
which can make our journeys safer and hassle-free, facial recognition for security purposes,
and natural language processing for communicating with humans in human language.

Disadvantages of Artificial Intelligence

Every technology has some disadvantages, and the same goes for Artificial Intelligence. Despite
being such an advantageous technology, it still has some disadvantages which we need to keep in mind
while creating an AI system. Following are the disadvantages of AI:

o High cost: The hardware and software requirements of AI are very costly, as AI requires lots
of maintenance to meet current-world requirements.
o Can't think out of the box: Even though we are making smarter machines with AI, they still
cannot work outside the box, as a robot will only do the work for which it is trained or
programmed.
o No feelings and emotions: An AI machine can be an outstanding performer, but it does not
have feelings, so it cannot form any kind of emotional attachment with humans, and may
sometimes be harmful to users if proper care is not taken.
o Increased dependency on machines: With the advance of technology, people are getting
more dependent on devices and hence losing some of their mental capabilities.
o No original creativity: Humans are highly creative and can imagine new ideas, but
AI machines cannot beat this power of human intelligence and cannot be creative and
imaginative.

Task Classification of AI
The domain of AI is classified into Formal tasks, Mundane tasks, and Expert tasks.
Task Domains of Artificial Intelligence

Mundane (Ordinary) Tasks:
 Perception
o Computer Vision
o Speech, Voice
 Natural Language Processing
o Understanding
o Language Generation
o Language Translation
 Common Sense Reasoning
 Planning
 Robotics
o Locomotion

Formal Tasks:
 Mathematics
o Geometry
o Logic
o Integration and Differentiation
 Games
o Go
o Chess (Deep Blue)
o Checkers
 Verification
 Theorem Proving

Expert Tasks:
 Engineering
o Fault Finding
o Manufacturing
o Monitoring
 Scientific Analysis
 Financial Analysis
 Medical Diagnosis
 Creativity

Humans learn mundane (ordinary) tasks from birth. They learn by perceiving, speaking, using
language, and moving around. They learn Formal Tasks and Expert Tasks later, in that order.

For humans, the mundane tasks are easiest to learn. The same was considered true before trying to
implement mundane tasks in machines. Earlier, all work of AI was concentrated in the mundane task
domain.

Later, it turned out that the machine requires more knowledge, more complex knowledge
representation, and more complicated algorithms for handling mundane tasks. This is the reason why
AI work is now flourishing more in the Expert Tasks domain, as the expert task domain needs expert
knowledge without common sense, which is easier to represent and handle.
Types of Artificial Intelligence:

Artificial Intelligence can be divided into various types. There are mainly two
categorizations: one based on the capabilities of AI and one based on its functionality.

AI type-1: Based on Capabilities

1. Weak AI or Narrow AI:


o Narrow AI is a type of AI which is able to perform a dedicated task with intelligence. Narrow AI
is the most common and currently available kind of AI in the world of Artificial Intelligence.
o Narrow AI cannot perform beyond its field or limitations, as it is only trained for one specific
task. Hence it is also termed weak AI. Narrow AI can fail in unpredictable ways if it goes
beyond its limits.
o Apple Siri is a good example of Narrow AI, but it operates with a limited pre-defined range of
functions.
o IBM's Watson supercomputer also comes under Narrow AI, as it uses an expert system
approach combined with machine learning and natural language processing.
o Some examples of Narrow AI are playing chess, purchasing suggestions on e-commerce sites,
self-driving cars, speech recognition, and image recognition.
2. General AI:
o General AI is a type of intelligence which could perform any intellectual task as efficiently
as a human.
o The idea behind general AI is to make a system which could be smart and think like a
human on its own.
o Currently, no system exists which comes under general AI and can perform
any task as well as a human.
o Researchers worldwide are now focused on developing machines with General AI.
o Systems with general AI are still under research, and it will take a lot of effort and time to
develop such systems.

3. Super AI:
o Super AI is a level of intelligence of systems at which machines could surpass human
intelligence and perform any task better than a human, with cognitive properties. It is an
outcome of general AI.
o Some key characteristics of super AI include the ability to think, to
reason, to solve puzzles, to make judgments, to plan, to learn, and to communicate on its own.
o Super AI is still a hypothetical concept of Artificial Intelligence. Developing such systems
in the real world is still a world-changing task.

Artificial Intelligence type-2: Based on functionality


1. Reactive Machines
o Purely reactive machines are the most basic type of Artificial Intelligence.
o Such AI systems do not store memories or past experiences for future actions.
o These machines only focus on current scenarios and react to them with the best possible action.
o IBM's Deep Blue system is an example of a reactive machine.
o Google's AlphaGo is also an example of a reactive machine.

2. Limited Memory
o Limited memory machines can store past experiences or some data for a short period of time.
o These machines can use stored data for a limited time period only.
o Self-driving cars are one of the best examples of Limited Memory systems. These cars can
store recent speed of nearby cars, the distance of other cars, speed limit, and other
information to navigate the road.

3. Theory of Mind
o Theory of Mind AI should understand human emotions, people, and beliefs, and be able to
interact socially like humans.
o This type of AI machine has still not been developed, but researchers are making lots of
effort and progress toward developing such machines.

4. Self-Awareness
o Self-aware AI is the future of Artificial Intelligence. These machines will be super
intelligent and will have their own consciousness, sentiments, and self-awareness.
o These machines will be smarter than the human mind.
o Self-aware AI does not yet exist in reality; it is a hypothetical concept.

AI: Agents and Environment


An AI system is composed of an agent and its environment. The agents act in their
environment. The environment may contain other agents.

What are Agent and Environment?


An agent is anything that can perceive its environment through sensors and act upon that
environment through effectors.
 A human agent has sensory organs such as eyes, ears, nose, tongue, and skin parallel to the
sensors, and other organs such as hands, legs, and mouth for effectors.
 A robotic agent has cameras and infrared range finders in place of the sensors, and various
motors and actuators for effectors.
 A software agent has encoded bit strings as its programs and actions.

Agent Terminology
 Performance Measure of Agent − The criterion which determines how successful an agent
is.
 Behavior of Agent − The action that the agent performs after any given sequence of percepts.
 Percept − The agent's perceptual input at a given instant.
 Percept Sequence − The history of all that the agent has perceived to date.
 Agent Function − A map from the percept sequence to an action.
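The agent function can be sketched directly in code as a lookup from the percept sequence seen so far to an action. The sketch below is a minimal table-driven agent in Python; the percepts, actions, and table entries are made up for illustration.

```python
# A minimal table-driven agent: the agent function maps the percept
# sequence (history) to an action via a lookup table. All names and
# table entries here are illustrative.

def make_table_driven_agent(table):
    percepts = []                      # percept sequence seen so far

    def agent(percept):
        percepts.append(percept)
        # Agent function: percept sequence -> action (default: do nothing)
        return table.get(tuple(percepts), "noop")

    return agent

# Example table for a trivial two-step world.
table = {
    ("clean",): "move",
    ("dirty",): "suck",
    ("clean", "dirty"): "suck",
}

agent = make_table_driven_agent(table)
print(agent("clean"))   # move
print(agent("dirty"))   # suck
```

In practice the table grows exponentially with the length of the percept sequence, which is why the agent programs described below compute actions instead of looking them up.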

Rationality

Rationality is the status of being reasonable, sensible, and having good judgment.
Rationality is concerned with expected actions and results, depending upon what the agent has
perceived. Performing actions with the aim of obtaining useful information is an important part of
rationality.

What is Ideal Rational Agent?


An ideal rational agent is one which is capable of taking the expected actions to maximize its
performance measure, on the basis of −

 Its percept sequence
 Its built-in knowledge base
The rationality of an agent depends on the following −
 The performance measure, which determines the degree of success.
 The agent's percept sequence so far.
 The agent's prior knowledge about the environment.
 The actions that the agent can carry out.
A rational agent always performs the right action, where the right action means the action that causes
the agent to be most successful given the percept sequence. The problem the agent solves is
characterized by its Performance Measure, Environment, Actuators, and Sensors (PEAS).
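A PEAS description can be written down as plain data. The sketch below uses the classic automated-taxi example; the specific entries are illustrative, not exhaustive.

```python
# PEAS description of an automated taxi, written as plain data.
# The entries are illustrative examples, not a complete specification.
peas_taxi = {
    "Performance Measure": ["safety", "speed", "legality", "passenger comfort"],
    "Environment": ["roads", "traffic", "pedestrians", "customers"],
    "Actuators": ["steering", "accelerator", "brake", "signal", "horn"],
    "Sensors": ["cameras", "speedometer", "GPS", "odometer"],
}

print(peas_taxi["Actuators"])
```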

The Structure of Intelligent Agents


Agent’s structure can be viewed as −

 Agent = Architecture + Agent Program


 Architecture = the machinery that an agent executes on.
 Agent Program = an implementation of an agent function.
Simple Reflex Agents

 They choose actions only based on the current percept.
 They are rational only if a correct decision can be made on the basis of the current percept alone.
 Their environment must be completely observable.
Condition-Action Rule − A rule that maps a state (condition) to an action.
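Condition-action rules can be sketched directly in code. Below is a minimal simple reflex agent for a two-location vacuum world; the locations, percepts, and actions are illustrative.

```python
# A simple reflex agent for a two-location vacuum world. The action depends
# only on the current percept (location, status); no history is kept.
# Locations "A"/"B" and the action names are illustrative.

def reflex_vacuum_agent(percept):
    location, status = percept
    # Condition-action rules: condition on the current percept -> action
    if status == "dirty":
        return "suck"
    elif location == "A":
        return "right"
    else:
        return "left"

print(reflex_vacuum_agent(("A", "dirty")))  # suck
print(reflex_vacuum_agent(("A", "clean")))  # right
print(reflex_vacuum_agent(("B", "clean")))  # left
```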

Model Based Reflex Agents


They use a model of the world to choose their actions, and they maintain an internal state.
Model − knowledge about "how things happen in the world".
Internal State − A representation of unobserved aspects of the current state, depending on percept
history.
Updating the state requires the information about −
 How the world evolves.
 How the agent’s actions affect the world.

Goal Based Agents


They choose their actions in order to achieve goals. The goal-based approach is more flexible than a
reflex agent, since the knowledge supporting a decision is explicitly modeled, thereby allowing for
modifications.
Goal − A description of desirable situations.

Utility Based Agents


They choose actions based on a preference (utility) for each state.
Goals alone are inadequate when −
 There are conflicting goals, of which only a few can be achieved.
 Goals have some uncertainty of being achieved, and you need to weigh the likelihood of success
against the importance of a goal.

The Nature of Environments

Some programs operate in an entirely artificial environment confined to keyboard input, a database,
computer file systems, and character output on a screen.
In contrast, some software agents (software robots or softbots) exist in rich, unlimited softbot
domains. The simulator has a very detailed, complex environment, and the software agent needs to
choose from a long array of actions in real time. A softbot designed to scan the online preferences of
a customer and show interesting items to that customer works in a real as well as
an artificial environment.
The most famous artificial environment is the Turing Test environment, in which one real agent and
other artificial agents are tested on equal ground. This is a very challenging environment, as it is highly
difficult for a software agent to perform as well as a human.

Turing Test
The success of the intelligent behavior of a system can be measured with the Turing Test.
Two persons and the machine to be evaluated participate in the test. One of the two persons plays
the role of the tester. Each of them sits in a different room. The tester does not know who is the machine
and who is the human. The tester poses questions by typing and sending them to both intelligences,
and receives typed responses.
The test aims at fooling the tester. If the tester fails to distinguish the machine's responses from the
human's responses, then the machine is said to be intelligent.

Properties of Environment
The environment has multifold properties −
 Discrete / Continuous − If there are a limited number of distinct, clearly defined, states of the
environment, the environment is discrete (For example, chess); otherwise it is continuous (For
example, driving).
 Observable / Partially Observable − If it is possible to determine the complete state of the
environment at each time point from the percepts it is observable; otherwise it is only partially
observable.
 Static / Dynamic − If the environment does not change while an agent is acting, then it is
static; otherwise it is dynamic.
 Single agent / Multiple agents − The environment may contain other agents which may be of
the same or different kind as that of the agent.
 Accessible / Inaccessible − If the agent’s sensory apparatus can have access to the complete
state of the environment, then the environment is accessible to that agent.
 Deterministic / Non-deterministic − If the next state of the environment is completely
determined by the current state and the actions of the agent, then the environment is
deterministic; otherwise it is non-deterministic.
 Episodic / Non-episodic − In an episodic environment, each episode consists of the agent
perceiving and then acting. The quality of its action depends just on the episode itself.
Subsequent episodes do not depend on the actions in the previous episodes. Episodic
environments are much simpler because the agent does not need to think ahead.

Problem Solving

Problem:

A problem can be caused by different factors and, if solvable, can usually
be solved in a number of different ways; it can also be defined in a number of different ways.

To build a system or to solve a particular problem we need to do four things:


 Define the problem precisely. This definition must include a precise specification of what the
initial situation will be, as well as what final situations constitute acceptable solutions to the
problem.
 Analyze the problem.
 Isolate and represent the task knowledge that is necessary to solve the problem.
 Choose the best problem-solving technique and apply it to the particular problem.

Defining the Problem as a State Space Search

Problem solving = searching for a goal state

It is a structured method for solving an unstructured problem. This approach consists of a number of
states. The starting point of the problem is the "Initial State", and the last point is
called the "Goal State" or "Final State" of the problem.

The state space is the set of legal positions reachable from the initial state; we use the set of rules
to move from one state to another, attempting to end up in a goal state.

Searching Algorithm
Search Algorithm Terminologies:

o Search: Searching is a step-by-step procedure to solve a search problem in a given search
space. A search problem can have three main factors:
a. Search Space: The set of possible solutions which a system may
have.
b. Start State: The state from which the agent begins the search.
c. Goal Test: A function which observes the current state and returns whether the goal
state has been achieved.
o Search Tree: A tree representation of the search problem. The root of the
search tree is the root node, which corresponds to the initial state.
o Actions: A description of all the actions available to the agent.
o Transition Model: A description of what each action does.
o Path Cost: A function which assigns a numeric cost to each path.
o Solution: An action sequence which leads from the start node to the goal node.
o Optimal Solution: A solution that has the lowest cost among all solutions.
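The terminology above can be sketched as plain Python data: a transition model with step costs, a start state, a goal test, and a path-cost function. The small route-finding graph below is made up for illustration.

```python
# A search problem as plain data. The graph, state names, and costs are
# illustrative, not from any particular textbook example.

graph = {                      # transition model: state -> {successor: step cost}
    "S": {"A": 1, "B": 4},
    "A": {"B": 2, "G": 5},
    "B": {"G": 1},
    "G": {},
}
start = "S"                    # start state

def goal_test(state):          # goal test: is this the goal state?
    return state == "G"

def actions(state):            # available actions = reachable successors
    return list(graph[state])

def path_cost(path):           # path cost = sum of step costs along the path
    return sum(graph[a][b] for a, b in zip(path, path[1:]))

print(actions("S"))               # ['A', 'B']
print(path_cost(["S", "A", "G"]))  # 6
print(path_cost(["S", "B", "G"]))  # 5  (the optimal solution here)
```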

Properties of Search Algorithms:

 Completeness: A search algorithm is said to be complete if it is guaranteed to return a solution
whenever at least one solution exists for the given input.
 Optimality: If the solution found by an algorithm is guaranteed to be the best solution (lowest
path cost) among all solutions, then it is said to be an optimal solution.
 Time Complexity: A measure of how long the algorithm takes to complete its
task.
 Space Complexity: The maximum storage space required at any point during the search.

Types of search algorithms

Based on the search problem, we can classify search algorithms into uninformed search
(blind search) and informed search (heuristic search) algorithms.

Uninformed/Blind Search:

The uninformed search does not use any domain knowledge, such as closeness or the location of the
goal. It operates in a brute-force way, as it only includes information about how to traverse the tree and
how to identify leaf and goal nodes. Uninformed search traverses the search tree
without any information about the search space, such as the initial state, operators, and test for the
goal, so it is also called blind search. It examines each node of the tree until it reaches the goal node.

It can be divided into five main types:

o Breadth-first search
o Uniform cost search
o Depth-first search
o Iterative deepening depth-first search
o Bidirectional Search

Informed Search

Informed search algorithms use domain knowledge. In an informed search, problem information is
available which can guide the search. Informed search strategies can find a solution more efficiently
than an uninformed search strategy. Informed search is also called heuristic search.

A heuristic is a technique which is not always guaranteed to find the best solution, but is designed to
find a good solution in a reasonable time.

Informed search can solve much more complex problems which could not be solved otherwise.

An example of a problem tackled with informed search algorithms is the traveling salesman problem.

1. Greedy Search
2. A* Search
3. AO* Search

Uninformed Search Algorithms


Uninformed search is a class of general-purpose search algorithms which operate in a brute-force
way. Uninformed search algorithms have no information about the state or search space
other than how to traverse the tree, so they are also called blind search.

Following are the various types of uninformed search algorithms:

1. Breadth-first Search
2. Depth-first Search
3. Depth-limited Search
4. Iterative deepening depth-first search
5. Uniform cost search
6. Bidirectional Search

1. Breadth-first Search:

o Breadth-first search is the most common search strategy for traversing a tree or graph. This
algorithm searches breadthwise in a tree or graph, hence the name breadth-first search.
o The BFS algorithm starts searching from the root node of the tree and expands all successor
nodes at the current level before moving to nodes of the next level.
o The breadth-first search algorithm is an example of a general graph-search algorithm.
o Breadth-first search is implemented using a FIFO queue data structure.

Advantages:

o BFS will provide a solution if any solution exists.

o If there is more than one solution for a given problem, then BFS will provide the
minimal solution, i.e., the one requiring the least number of steps.

Disadvantages:

o It requires lots of memory since each level of the tree must be saved into memory
to expand the next level.

o BFS needs lots of time if the solution is far away from the root node.

Example:

In the below tree structure, we show the traversal of the tree using the BFS algorithm from the
root node S to the goal node K. The BFS algorithm traverses in layers, so it will follow the path
shown by the dotted arrow, and the traversed path will be:

1. S---> A--->B---->C--->D---->G--->H--->E---->F---->I---->K
Time Complexity: The time complexity of the BFS algorithm is given by the number of nodes
traversed in BFS until the shallowest goal node, where d is the depth of the shallowest solution and
b is the branching factor (the number of successors at every state):

T(b) = 1 + b + b^2 + b^3 + ... + b^d = O(b^d)

Space Complexity: The space complexity of the BFS algorithm is given by the memory size of the
frontier, which is O(b^d).

Completeness: BFS is complete, which means if the shallowest goal node is at some finite depth, then
BFS will find a solution.

Optimality: BFS is optimal if path cost is a non-decreasing function of the depth of the node.
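The BFS procedure described above can be sketched in Python using a FIFO queue of paths; the example graph is illustrative.

```python
from collections import deque

# Breadth-first search: expand all nodes at the current level before moving
# to the next level, using a FIFO queue of paths. The graph is illustrative.

def bfs(graph, start, goal):
    frontier = deque([[start]])        # FIFO queue of paths
    visited = {start}
    while frontier:
        path = frontier.popleft()      # shallowest path first
        node = path[-1]
        if node == goal:
            return path
        for successor in graph.get(node, []):
            if successor not in visited:
                visited.add(successor)
                frontier.append(path + [successor])
    return None                        # no solution exists

graph = {
    "S": ["A", "B"],
    "A": ["C", "D"],
    "B": ["E"],
    "E": ["K"],
}
print(bfs(graph, "S", "K"))    # ['S', 'B', 'E', 'K']
```

Because the frontier is a FIFO queue, the first path that reaches the goal is a shallowest one, which is what makes BFS complete and (for unit step costs) optimal.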

2. Depth-first Search:

o Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
o It is called depth-first search because it starts from the root node and follows each path to
its greatest depth before moving to the next path.
o DFS uses a stack data structure for its implementation.
o The process of the DFS algorithm is similar to that of the BFS algorithm.

Note: Backtracking is an algorithm technique for finding all possible solutions using recursion.

Advantage:

o DFS requires much less memory, as it only needs to store the stack of nodes on the path from
the root node to the current node.
o It takes less time to reach to the goal node than BFS algorithm (if it traverses in the right
path).

Disadvantage:

o There is the possibility that many states keep re-occurring, and there is no guarantee of
finding a solution.
o DFS goes deep down into the search space, and sometimes it may descend into an infinite loop.

Example:

In the below search tree, we have shown the flow of depth-first search, and it will follow the order
as:

Root node--->Left node ----> right node.

It will start searching from root node S, and traverse A, then B, then D and E, after traversing E, it
will backtrack the tree as E has no other successor and still goal node is not found. After
backtracking it will traverse node C and then G, and here it will terminate as it found goal node.
Completeness: DFS search algorithm is complete within finite state space as it will expand every
node within a limited search tree.

Time Complexity: The time complexity of DFS is proportional to the number of nodes traversed by
the algorithm. It is given by:

T(n) = 1 + b + b^2 + ... + b^m = O(b^m)

where m is the maximum depth of any node; this can be much larger than d (the shallowest
solution depth).

Space Complexity: DFS needs to store only a single path from the root node (plus the unexpanded
sibling nodes along it), hence the space complexity of DFS is equivalent to the size of the fringe set,
which is O(bm), i.e. linear in the maximum depth.

Optimal: DFS search algorithm is non-optimal, as it may generate a large number of steps or high
cost to reach to the goal node.
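A minimal sketch of DFS with an explicit stack (LIFO), on a hypothetical graph chosen to show the root → left → right order described above:

```python
def dfs(graph, start, goal):
    """Depth-first search using an explicit LIFO stack; returns the first path found."""
    stack = [[start]]
    visited = set()
    while stack:
        path = stack.pop()               # LIFO: deepest path is expanded first
        node = path[-1]
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        # push children in reverse so the leftmost child is popped first
        for child in reversed(graph.get(node, [])):
            stack.append(path + [child])
    return None

# hypothetical graph: the A-branch is explored fully before backtracking to C
graph = {'S': ['A', 'C'], 'A': ['B'], 'B': [], 'C': ['G']}
print(dfs(graph, 'S', 'G'))   # ['S', 'C', 'G']
```

Note that the search dives down the S → A → B branch first, backtracks when B has no successors, and only then explores C, mirroring the backtracking behaviour in the worked example.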

3. Depth-Limited Search Algorithm:

A depth-limited search algorithm is similar to depth-first search with a predetermined limit ℓ. Depth-
limited search overcomes the drawback of the infinite path in depth-first search. In this algorithm,
a node at the depth limit is treated as if it has no further successor nodes.

Depth-limited search can be terminated with two Conditions of failure:


o Standard failure value: It indicates that problem does not have any solution.
o Cutoff failure value: It defines no solution for the problem within a given depth limit.

Advantages:

Depth-limited search is Memory efficient.

Disadvantages:

o Depth-limited search also has a disadvantage of incompleteness.


o It may not be optimal if the problem has more than one solution.

Example:

Completeness: DLS search algorithm is complete if the solution is above the depth-limit.

Time Complexity: Time complexity of DLS algorithm is O(b^ℓ).

Space Complexity: Space complexity of DLS algorithm is O(b×ℓ).

Optimal: Depth-limited search can be viewed as a special case of DFS, and it is also not optimal even
if ℓ>d.
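The two failure modes above (standard failure vs. cutoff) can be made concrete in a short sketch; the graph is hypothetical, with the goal at depth 3:

```python
def dls(graph, node, goal, limit, path=None):
    """Depth-limited DFS; returns a path, 'cutoff', or None (standard failure)."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return 'cutoff'                  # cutoff failure: depth limit reached
    cutoff = False
    for child in graph.get(node, []):
        result = dls(graph, child, goal, limit - 1, path)
        if result == 'cutoff':
            cutoff = True
        elif result is not None:
            return result
    # 'cutoff' if any branch was truncated, else no solution exists below here
    return 'cutoff' if cutoff else None

graph = {'S': ['A', 'B'], 'A': ['C'], 'B': [], 'C': ['G']}
print(dls(graph, 'S', 'G', 2))   # 'cutoff'  (G sits just beyond the limit)
print(dls(graph, 'S', 'G', 3))   # ['S', 'A', 'C', 'G']
```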

4. Uniform-cost Search Algorithm:

Uniform-cost search is a searching algorithm used for traversing a weighted tree or graph. This
algorithm comes into play when a different cost is available for each edge. The primary goal of the
uniform-cost search is to find a path to the goal node which has the lowest cumulative cost. Uniform-
cost search expands nodes according to their path costs from the root node. It can be used to solve any
graph/tree where the optimal cost is in demand. A uniform-cost search algorithm is implemented by
the priority queue. It gives maximum priority to the lowest cumulative cost. Uniform cost search is
equivalent to BFS algorithm if the path cost of all edges is the same.

Advantages:

o Uniform cost search is optimal because at every state the path with the least cost is chosen.

Disadvantages:

o It does not care about the number of steps involved in searching, only about the path
cost, so the algorithm may get stuck in an infinite loop if zero-cost actions exist.

Example:

Completeness:

Uniform-cost search is complete: if a solution exists, UCS will find it.

Time Complexity:

Let C* be the cost of the optimal solution, and ε the minimum cost of a single step toward the goal.
The number of levels to explore is then C*/ε + 1; we add 1 because we start from state 0 and end at
C*/ε.

Hence, the worst-case time complexity of uniform-cost search is O(b^(1 + ⌊C*/ε⌋)).

Space Complexity:
The same logic holds for space, so the worst-case space complexity of uniform-cost search
is O(b^(1 + ⌊C*/ε⌋)).

Optimal:

Uniform-cost search is always optimal as it only selects a path with the lowest path cost.
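UCS is naturally expressed with a priority queue ordered by the cumulative cost g(n). The weighted graph below is invented for illustration; note that the direct edge S→G is ignored in favour of the cheaper three-step path:

```python
import heapq

def ucs(graph, start, goal):
    """Uniform-cost search: priority queue ordered by cumulative path cost g(n)."""
    frontier = [(0, start, [start])]     # (cost so far, node, path)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)   # lowest cumulative cost first
        if node == goal:
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for child, step in graph.get(node, []):
            heapq.heappush(frontier, (cost + step, child, path + [child]))
    return None

# hypothetical weighted graph: (neighbour, edge cost)
graph = {'S': [('A', 1), ('G', 12)], 'A': [('B', 3), ('C', 1)], 'C': [('G', 2)], 'B': []}
print(ucs(graph, 'S', 'G'))   # (4, ['S', 'A', 'C', 'G'])
```

If every edge had the same cost, popping by cumulative cost would reduce to popping by depth, which is why UCS degenerates to BFS in that case.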

5. Iterative deepening depth-first Search:

The iterative deepening algorithm is a combination of DFS and BFS algorithms. This search algorithm
finds out the best depth limit and does it by gradually increasing the limit until a goal is found.

This algorithm performs depth-first search up to a certain "depth limit", and it keeps increasing the
depth limit after each iteration until the goal node is found.

This Search algorithm combines the benefits of Breadth-first search's fast search and depth-first
search's memory efficiency.

The iterative deepening algorithm is the preferred uninformed search method when the search space
is large and the depth of the goal node is unknown.

Advantages:

o It combines the benefits of BFS and DFS search algorithm in terms of fast search and memory
efficiency.

Disadvantages:

o The main drawback of IDDFS is that it repeats all the work of the previous phase.

Example:

Following tree structure is showing the iterative deepening depth-first search. IDDFS algorithm
performs various iterations until it does not find the goal node. The iteration performed by the
algorithm is given as:
1'st Iteration-----> A
2'nd Iteration----> A, B, C
3'rd Iteration------>A, B, D, E, C, F, G
4'th Iteration------>A, B, D, H, I, E, C, F, K, G
In the fourth iteration, the algorithm will find the goal node.

Completeness:

This algorithm is complete if the branching factor is finite.

Time Complexity:

Let's suppose b is the branching factor and d the depth; then the worst-case time complexity is O(b^d).

Space Complexity:

The space complexity of IDDFS is O(bd), i.e. linear in the depth (b × d).

Optimal:

IDDFS algorithm is optimal if path cost is a non- decreasing function of the depth of the node.
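The iterations above can be reproduced with a small sketch that runs depth-limited DFS with an increasing limit; the tree mirrors the A…K example, with the goal K at depth 3:

```python
def iddfs(graph, start, goal, max_depth=20):
    """Iterative deepening: depth-limited DFS with limit 0, 1, 2, ..."""
    def dls(node, limit, path):
        if node == goal:
            return path
        if limit == 0:
            return None
        for child in graph.get(node, []):
            found = dls(child, limit - 1, path + [child])
            if found:
                return found
        return None

    # each iteration deliberately repeats the shallower levels' work
    for limit in range(max_depth + 1):
        result = dls(start, limit, [start])
        if result:
            return result, limit
    return None

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F', 'G'], 'D': ['H', 'I'], 'F': ['K']}
print(iddfs(graph, 'A', 'K'))   # (['A', 'C', 'F', 'K'], 3)
```

The goal is found on the iteration with limit 3, matching the fourth iteration in the worked trace (limits 0 through 3).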
6. Bidirectional Search Algorithm:

Bidirectional search runs two simultaneous searches: one from the initial state, called forward
search, and the other from the goal node, called backward search. Bidirectional search
replaces one single search graph with two small subgraphs, one starting the search from the initial
vertex and the other from the goal vertex. The search stops when the two graphs intersect.

Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.

Advantages:

o Bidirectional search is fast.


o Bidirectional search requires less memory

Disadvantages:

o Implementation of the bidirectional search tree is difficult.


o In bidirectional search, one should know the goal state in advance.

Example:

In the below search tree, bidirectional search algorithm is applied. This algorithm divides one
graph/tree into two sub-graphs. It starts traversing from node 1 in the forward direction and starts
from goal node 16 in the backward direction.

The algorithm terminates at node 9 where two searches meet.

Completeness: Bidirectional Search is complete if we use BFS in both searches.

Time Complexity: Time complexity of bidirectional search using BFS is O(b^(d/2)), since each of
the two searches only needs to reach half the solution depth.

Space Complexity: Space complexity of bidirectional search is O(b^(d/2)).

Optimal: Bidirectional search is Optimal.
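A sketch of bidirectional BFS on a hypothetical undirected graph from node 1 to node 16 (the figure's exact edges are not reproduced); two frontiers alternate until they touch, and the two half-paths are stitched together at the meeting node:

```python
from collections import deque

def bidirectional_bfs(graph, start, goal):
    """Two simultaneous BFS frontiers; stop when they intersect."""
    if start == goal:
        return [start]
    parents_f, parents_b = {start: None}, {goal: None}
    qf, qb = deque([start]), deque([goal])

    def expand(queue, parents, other):
        node = queue.popleft()
        for nb in graph.get(node, []):
            if nb not in parents:
                parents[nb] = node
                queue.append(nb)
                if nb in other:          # the two frontiers meet here
                    return nb
        return None

    while qf and qb:
        meet = expand(qf, parents_f, parents_b) or expand(qb, parents_b, parents_f)
        if meet:
            # walk back from the meeting node toward each end, then join
            path = []
            n = meet
            while n is not None:
                path.append(n)
                n = parents_f[n]
            path.reverse()
            n = parents_b[meet]
            while n is not None:
                path.append(n)
                n = parents_b[n]
            return path
    return None

# undirected graph: each edge listed in both directions
graph = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2, 9], 9: [4, 15], 15: [9, 16], 16: [15]}
print(bidirectional_bfs(graph, 1, 16))   # [1, 2, 4, 9, 15, 16]
```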

Informed (Heuristic) Search Algorithms

To solve large problems with large number of possible states, problem-specific knowledge needs to
be added to increase the efficiency of search algorithms.

Uninformed search algorithms look through the search space for all possible solutions of the
problem without any additional knowledge about the search space. An informed search algorithm, in
contrast, uses knowledge such as how far we are from the goal, the path cost, and how to reach the
goal node. This knowledge helps agents explore less of the search space and find the goal node more
efficiently.

The informed search algorithm is more useful for large search space. Informed search algorithm uses
the idea of heuristic, so it is also called Heuristic search.

Heuristic function: A heuristic is a function used in informed search to find the most
promising path. It takes the current state of the agent as its input and produces an estimate of how
close the agent is to the goal. The heuristic method might not always give the best solution,
but it is guaranteed to find a good solution in reasonable time. The heuristic function estimates how
close a state is to the goal. It is represented by h(n) and estimates the cost of an optimal path from
state n to the goal. Its value is always non-negative.

Pure Heuristic Search


It expands nodes in the order of their heuristic values. It creates two lists, a closed list for the already
expanded nodes and an open list for the created but unexpanded nodes.
In each iteration, the node with the minimum heuristic value is expanded and placed in the closed
list; all its child nodes are generated, the heuristic function is applied to them, and they are placed in
the open list according to their heuristic values. The shorter paths are saved and the longer ones
are disposed of.

In the informed search we will discuss three main algorithms which are given below:

o Best First Search Algorithm(Greedy search)


o A* Search Algorithm
o AO* Search Algorithm
1.) Best-first Search Algorithm (Greedy Search):

Greedy best-first search always selects the path which appears best at the moment. It combines
aspects of depth-first and breadth-first search: with the help of best-first search, at each step we can
choose the most promising node. In the greedy best-first search algorithm, we expand the node which
appears closest to the goal node, where closeness is estimated by the heuristic function, i.e.

f(n) = h(n)

where h(n) = estimated cost from node n to the goal.

The greedy best-first algorithm is implemented with a priority queue.

Best first search algorithm:


o Step 1: Place the starting node into the OPEN list.
o Step 2: If the OPEN list is empty, Stop and return failure.
o Step 3: Remove the node n, from the OPEN list which has the lowest value of h(n), and
places it in the CLOSED list.
o Step 4: Expand the node n, and generate the successors of node n.
o Step 5: Check each successor of node n, and find whether any node is a goal node or not. If
any successor node is goal node, then return success and terminate the search, else proceed to
Step 6.
o Step 6: For each successor node, algorithm checks for evaluation function f(n), and then
check if the node has been in either OPEN or CLOSED list. If the node has not been in both
list, then add it to the OPEN list.
o Step 7: Return to Step 2.

Advantages:
o Best first search can switch between BFS and DFS by gaining the advantages of both the
algorithms.
o This algorithm is more efficient than BFS and DFS algorithms.

Disadvantages:
o It can behave as an unguided depth-first search in the worst case scenario.
o It can get stuck in a loop as DFS.
o This algorithm is not optimal.
Example:

Consider the below search problem, and we will traverse it using greedy best-first search. At each
iteration, each node is expanded using evaluation function f(n)=h(n) , which is given in the below
table.

In this search example, we are using two lists which are OPEN and CLOSED Lists. Following are
the iteration for traversing the above example.
Expand the nodes of S and put in the CLOSED list

 Initialization: Open [A, B], Closed [S]


 Iteration 1: Open [A], Closed [S, B]
 Iteration 2: Open [E, F, A], Closed [S, B]
: Open [E, A], Closed [S, B, F]
 Iteration 3: Open [I, G, E, A], Closed [S, B, F]
: Open [I, E, A], Closed [S, B, F, G]

Hence the final solution path will be: S----> B----->F----> G

Time Complexity: The worst-case time complexity of greedy best-first search is O(b^m).

Space Complexity: The worst-case space complexity of greedy best-first search is O(b^m), where m
is the maximum depth of the search space.

Complete: Greedy best-first search is also incomplete, even if the given state space is finite.

Optimal: Greedy best first search algorithm is not optimal.
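The OPEN/CLOSED bookkeeping above can be sketched with OPEN as a priority queue keyed on h(n). The graph and heuristic values are hypothetical, chosen so the search follows the S → B → F → G route of the worked example:

```python
import heapq

def greedy_bfs(graph, h, start, goal):
    """Greedy best-first: always expand the OPEN node with the lowest h(n)."""
    open_list = [(h[start], start, [start])]
    closed = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)   # lowest heuristic first
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for child in graph.get(node, []):
            if child not in closed:
                heapq.heappush(open_list, (h[child], child, path + [child]))
    return None

# hypothetical graph and heuristic values (smaller h = seemingly closer to goal)
graph = {'S': ['A', 'B'], 'B': ['E', 'F'], 'F': ['I', 'G'], 'A': [], 'E': [], 'I': []}
h = {'S': 13, 'A': 12, 'B': 4, 'E': 8, 'F': 2, 'G': 0, 'I': 9}
print(greedy_bfs(graph, h, 'S', 'G'))   # ['S', 'B', 'F', 'G']
```

Since path cost g(n) plays no role, the returned path is the greedily promising one, not necessarily the cheapest, which is why the algorithm is not optimal.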

2.) A* Search Algorithm:

A* search is the most commonly known form of best-first search. It uses the heuristic function h(n)
and the cost to reach node n from the start state, g(n). It combines the features of UCS and greedy
best-first search, which lets it solve problems efficiently. The A* search algorithm finds the shortest
path through the search space using the heuristic function. It expands a smaller search tree and
provides an optimal result faster. A* is similar to UCS except that it uses g(n) + h(n) instead of g(n).

In the A* search algorithm, we use the search heuristic as well as the cost to reach the node. Hence we
can combine both costs as follows, and this sum is called the fitness number:

f(n) = g(n) + h(n)

At each point in the search space, only those nodes are expanded which have the lowest value of f(n),
and the algorithm terminates when the goal node is found.
Algorithm of A* search:

Step1: Place the starting node in the OPEN list.

Step 2: Check if the OPEN list is empty or not, if the list is empty then return failure and stops.

Step 3: Select the node from the OPEN list which has the smallest value of evaluation function (g+h),
if node n is goal node then return success and stop, otherwise

Step 4: Expand node n and generate all of its successors, and put n into the closed list. For each
successor n', check whether n' is already in the OPEN or CLOSED list, if not then compute evaluation
function for n' and place into Open list.

Step 5: Otherwise, if node n' is already in OPEN or CLOSED, attach it to the back pointer
which reflects the lowest g(n') value.

Step 6: Return to Step 2.

Advantages:
o The A* search algorithm performs better than the other search algorithms discussed here.
o A* search algorithm is optimal and complete.
o This algorithm can solve very complex problems.

Disadvantages:
o It does not always produce the shortest path, as it is mostly based on heuristics and approximation.
o A* search algorithm has some complexity issues.
o The main drawback of A* is memory requirement as it keeps all generated nodes in the
memory, so it is not practical for various large-scale problems.

Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value of all
states is given in the below table so we will calculate the f(n) of each state using the formula f(n)= g(n)
+ h(n), where g(n) is the cost to reach any node from start state.
Here we will use OPEN and CLOSED list.
Solution:

 Initialization: {(S, 5)}


 Iteration1: {(S--> A, 4), (S-->G, 10)}
 Iteration2: {(S--> A-->C, 4), (S--> A-->B, 7), (S-->G, 10)}
 Iteration3: {(S--> A-->C--->G, 6), (S--> A-->C--->D, 11), (S--> A-->B, 7), (S-->G, 10)}
 Iteration 4 will give the final result, as S--->A--->C--->G it provides the optimal path with
cost 6.

Points to remember:

o A* algorithm returns the path which occurred first, and it does not search for all remaining
paths.
o The efficiency of A* algorithm depends on the quality of the heuristic.
o A* algorithm expands all nodes which satisfy the condition f(n) < C*, where C* is the cost of
the optimal solution.

Complete: A* algorithm is complete as long as:

o Branching factor is finite.


o Cost at every action is fixed.

Optimal: A* search algorithm is optimal if it follows below two conditions:

o Admissible: the first condition requires for optimality is that h(n) should be an admissible
heuristic for A* tree search. An admissible heuristic is optimistic in nature.
o Consistency: Second required condition is consistency for only A* graph-search.

If the heuristic function is admissible, then A* tree search will always find the least cost path.

Time Complexity: The time complexity of A* search algorithm depends on heuristic function, and
the number of nodes expanded is exponential to the depth of solution d. So the time complexity is
O(b^d), where b is the branching factor.

Space Complexity: The space complexity of A* search algorithm is O(b^d)
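A sketch of A* with OPEN as a priority queue ordered by f(n) = g(n) + h(n). The edge costs below are assumed (the figure with the original costs is not reproduced), picked so the search reproduces the worked example's optimal path S → A → C → G with cost 6:

```python
import heapq

def astar(graph, h, start, goal):
    """A*: expand the node with the lowest f(n) = g(n) + h(n)."""
    open_list = [(h[start], 0, start, [start])]   # (f, g, node, path)
    best_g = {}
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return g, path
        if node in best_g and best_g[node] <= g:
            continue                      # already reached this node more cheaply
        best_g[node] = g
        for child, step in graph.get(node, []):
            g2 = g + step
            heapq.heappush(open_list, (g2 + h[child], g2, child, path + [child]))
    return None

# assumed edge costs and an admissible heuristic (never overestimates)
graph = {'S': [('A', 1), ('G', 10)], 'A': [('B', 2), ('C', 1)], 'C': [('D', 3), ('G', 4)]}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 6, 'G': 0}
print(astar(graph, h, 'S', 'G'))   # (6, ['S', 'A', 'C', 'G'])
```

The direct edge S → G (f = 10) stays in OPEN while the cheaper route through A and C is expanded, illustrating how an admissible heuristic steers A* to the least-cost path.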

3.) AO* Search Algorithm:

Real-life situations can't be exactly decomposed into either an AND tree or an OR tree; they are
always a combination of both. So we need the AO* algorithm, which searches AND-OR graphs. The
AO* algorithm represents the part of the search graph that has been explicitly generated so far.
AO* algorithm is given as follows:

 Step-1: Create an initial graph with a single node (start node).


 Step-2: Traverse the graph following the current path, accumulating nodes that have not yet
been expanded or solved.
 Step-3: Select any of these nodes and explore it. If it has no successors then call this value-
FUTILITY else calculate f'(n) for each of the successors.
 Step-4: If f'(n)=0, then mark the node as SOLVED.
 Step-5: Change the value of f'(n) for the newly created node to reflect its successors by
backpropagation.
 Step-6: Whenever possible use the most promising routes, If a node is marked as SOLVED
then mark the parent node as SOLVED.
 Step-7: If the starting node is SOLVED or value is greater than FUTILITY then stop else repeat
from Step-2.

Hill Climbing Algorithm


o Hill climbing algorithm is a local search algorithm which continuously moves in the direction
of increasing elevation/value to find the peak of the mountain or best solution to the problem.
It terminates when it reaches a peak value where no neighbor has a higher value.
o Hill climbing algorithm is a technique which is used for optimizing the mathematical problems.
One of the widely discussed examples of Hill climbing algorithm is Traveling-salesman
Problem in which we need to minimize the distance traveled by the salesman.
o It is also called greedy local search as it only looks to its good immediate neighbor state and
not beyond that.
o A node of hill climbing algorithm has two components which are state and value.
o Hill Climbing is mostly used when a good heuristic is available.
o In this algorithm, we don't need to maintain and handle the search tree or graph as it only keeps
a single current state.

Features of Hill Climbing:

Following are some main features of Hill Climbing Algorithm:

o Generate and Test variant: Hill Climbing is the variant of Generate and Test method. The
Generate and Test method produce feedback which helps to decide which direction to move in
the search space.
o Greedy approach: Hill-climbing algorithm search moves in the direction which optimizes the
cost.
o No backtracking: It does not backtrack the search space, as it does not remember the previous
states.

State-space Diagram for Hill Climbing:

The state-space landscape is a graphical representation of the hill-climbing algorithm which is showing
a graph between various states of algorithm and Objective function/Cost.

On Y-axis we have taken the function which can be an objective function or cost function, and state-
space on the x-axis. If the function on Y-axis is cost then, the goal of search is to find the global
minimum and local minimum. If the function of Y-axis is Objective function, then the goal of the
search is to find the global maximum and local maximum.
Different regions in the state space landscape:

Local Maximum: Local maximum is a state which is better than its neighbor states, but there is also
another state which is higher than it.

Global Maximum: Global maximum is the best possible state of state space landscape. It has the
highest value of objective function.

Current state: It is a state in a landscape diagram where an agent is currently present.

Flat local maximum: It is a flat space in the landscape where all the neighbor states of current states
have the same value.

Shoulder: It is a plateau region which has an uphill edge.

Types of Hill Climbing Algorithm:

o Simple hill Climbing:


o Steepest-Ascent hill-climbing:
o Stochastic hill Climbing:

1. Simple Hill Climbing:

Simple hill climbing is the simplest way to implement a hill climbing algorithm. It evaluates one
neighbor node state at a time and selects the first one which improves the current cost, setting it as
the current state. It checks only one successor state, and if that successor is better than the current
state, it moves; otherwise it stays in the same state. This algorithm has the following features:

o Less time consuming


o Less optimal solution and the solution is not guaranteed
Algorithm for Simple Hill Climbing:
o Step 1: Evaluate the initial state, if it is goal state then return success and Stop.
o Step 2: Loop Until a solution is found or there is no new operator left to apply.
o Step 3: Select and apply an operator to the current state.
o Step 4: Check new state:
a. If it is goal state, then return success and quit.
b. Else if it is better than the current state then assign new state as a current state.
c. Else if not better than the current state, then return to step2.
o Step 5: Exit.

2. Steepest-Ascent hill climbing:

The steepest-ascent algorithm is a variation of the simple hill climbing algorithm. It
examines all the neighboring nodes of the current state and selects the neighbor node which is closest
to the goal state. This algorithm consumes more time as it evaluates multiple neighbors.

Algorithm for Steepest-Ascent hill climbing:


o Step 1: Evaluate the initial state, if it is goal state then return success and stop, else make
current state as initial state.
o Step 2: Loop until a solution is found or the current state does not change.
a. Let SUCC be a variable holding the best successor found so far.
b. For each operator that applies to the current state:
a. Apply the new operator and generate a new state.
b. Evaluate the new state.
c. If it is goal state, then return it and quit, else compare it to the SUCC.
d. If it is better than SUCC, then set new state as SUCC.
e. If the SUCC is better than the current state, then set current state to SUCC.
o Step 3: Exit.
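The steepest-ascent loop above can be condensed into a few lines; the objective and neighbour functions below are invented for illustration, a one-dimensional landscape over integer states 0..10 with a single peak at x = 7:

```python
def steepest_ascent(value, neighbors, state):
    """Move to the best neighbor; stop when no neighbor improves the current state."""
    while True:
        # examine ALL neighbors and keep the best one (SUCC in the notes)
        succ = max(neighbors(state), key=value, default=state)
        if value(succ) <= value(state):
            return state                 # local (here also global) maximum reached
        state = succ

# toy objective with its peak at x = 7
value = lambda x: -(x - 7) ** 2
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n <= 10]
print(steepest_ascent(value, neighbors, 2))   # 7
```

On a landscape with several peaks the same loop would halt at whichever local maximum it climbs into first, which is exactly the local-maximum problem discussed below.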

3. Stochastic hill climbing:

Stochastic hill climbing does not examine all of its neighbors before moving. Instead, this search
algorithm selects one neighbor node at random and decides whether to move to it as the current state
or examine another state.

Problems in Hill Climbing Algorithm:

1. Local Maximum: A local maximum is a peak state in the landscape which is better than each of its
neighboring states, but there is another state also present which is higher than the local maximum.
Solution: Backtracking technique can be a solution of the local maximum in state space landscape.
Create a list of the promising path so that the algorithm can backtrack the search space and explore
other paths as well.

2. Plateau: A plateau is a flat area of the search space in which all the neighbor states of the current
state contain the same value; because of this, the algorithm cannot find the best direction to move. A
hill-climbing search might get lost in the plateau area.

Solution: The solution for the plateau is to take big steps or very little steps while searching.
Randomly select a state which is far away from the current state, so it is possible that the
algorithm will find a non-plateau region.

3. Ridges: A ridge is a special form of the local maximum. It has an area which is higher than its
surrounding areas, but itself has a slope, and cannot be reached in a single move.

Solution: With the use of bidirectional search, or by moving in different directions, we can improve
this problem.
Simulated Annealing:

A hill-climbing algorithm which never makes a move towards a lower value is guaranteed to be
incomplete, because it can get stuck on a local maximum. If the algorithm instead applies a random
walk, moving to random successors, it may be complete but is not efficient. Simulated annealing is an
algorithm which yields both efficiency and completeness.

In metallurgy, annealing is the process of heating a metal or glass to a high temperature and then
cooling it gradually, which allows the material to reach a low-energy crystalline state. The same idea
is used in simulated annealing, in which the algorithm picks a random move instead of picking the
best move. If the random move improves the state, the algorithm follows that path. Otherwise, it
accepts the downhill move with a probability less than 1, one that shrinks as the temperature cools.
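A minimal sketch of simulated annealing; the objective (a small landscape with a local peak at x = 2 and the global peak at x = 8), the geometric cooling schedule, and all parameters are invented for illustration:

```python
import math
import random

def simulated_annealing(value, neighbors, state, t0=10.0, cooling=0.95, steps=500):
    """Accept downhill moves with probability e^(ΔE/T); T cools each step."""
    random.seed(0)                       # fixed seed, deterministic for illustration
    t = t0
    for _ in range(steps):
        nxt = random.choice(neighbors(state))
        delta = value(nxt) - value(state)
        # always take an uphill move; take a downhill move with prob e^(delta/t)
        if delta > 0 or random.random() < math.exp(delta / t):
            state = nxt
        t *= cooling                     # gradual cooling toward pure hill climbing
    return state

# toy landscape: local peak at x = 2 (value 5), global peak at x = 8 (value 9)
value = lambda x: [0, 3, 5, 2, 1, 0, 4, 7, 9, 6, 0][x]
neighbors = lambda x: [n for n in (x - 1, x + 1) if 0 <= n <= 10]
print(simulated_annealing(value, neighbors, 0))
```

Early on, the high temperature lets the walk escape the local peak at x = 2; as T shrinks, downhill moves become vanishingly likely and the search settles onto a peak.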

Constraint Satisfaction Problems


Previous search problems:

 Problems can be solved by searching in a space of states


 state is a “black box” – any data structure that supports successor function, heuristic function,
and goal test – problem-specific

Constraint satisfaction problem

 states and goal test conform to a standard, structured and simple representation
 general-purpose heuristic

A constraint satisfaction problem (or CSP) is defined by a set of variables X1, X2, . . . , Xn, and a set
of constraints, C1, C2, . . . , Cm. Each variable Xi has a nonempty domain Di of possible values. Each
constraint Ci involves some subset of the variables and specifies the allowable combinations of values
for that subset. A state of the problem is defined by an assignment of values to some or all of the
variables, {Xi = vi, Xj = vj, . . .}. An assignment that does not violate any constraints is called a
consistent or legal assignment. A complete assignment is one in which every variable is mentioned,
and a solution to a CSP is a complete assignment that satisfies all the constraints. Some CSPs also
require a solution that maximizes an objective function.

CSP is defined by 3 components (X, D, C):

 a set of variables X, where each Xi takes values from a domain Di
 a set of constraints C, where each Ci involves some subset of the variables and specifies
the allowable combinations of values for that subset
 each constraint consists of a pair <scope, rel>, where scope is a tuple of variables and rel
is the relation, represented either explicitly or abstractly; e.g. if X1 and X2 both have the
domain {A, B}, the constraint that they must differ can be written explicitly as
<(X1, X2), {(A, B), (B, A)}> or abstractly as X1 ≠ X2

Solution:
 Each state in a CSP is defined by an assignment of values to some or all of the variables
 An assignment that does not violate any constraints is called a consistent or legal assignment
 A complete assignment is one in which every variable is assigned
 A solution to a CSP is consistent and complete assignment
 Allows useful general-purpose algorithms with more power than standard search algorithms

Example: Map Colouring


 Constraint graph: nodes are variables, arcs are constraints
 Binary CSP: each constraint relates two variables
 CSP conforms to a standard pattern
o a set of variables with assigned values
o generic successor function and goal test
o generic heuristics
o reduce complexity
CSP as a Search Problem
Initial state:
 {} – all variables are unassigned
Successor function:
 a value is assigned to one of the unassigned variables with
no conflict
Goal test:
 a complete assignment
Path cost:
 a constant cost for each step
 Solution appears at depth n if there are n variables
 Depth-first or local search methods work well

CSP Solvers Can be Faster

 CSP solver can quickly eliminate large part of search space


 If {SA = blue}, then 3^5 = 243 assignments can be reduced to 2^5 = 32 assignments, a
reduction of 87%
 In a CSP, if a partial assignment is not a solution, we can immediately discard
further refinements of it
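The "discard refinements of an inconsistent partial assignment" idea is exactly what backtracking search does. Here is a minimal sketch on a hypothetical three-region map-colouring instance (region names borrowed from the usual Australia example):

```python
def backtrack(assignment, variables, domains, constraints):
    """Classic CSP backtracking: assign one variable at a time, checking consistency."""
    if len(assignment) == len(variables):
        return assignment                # complete and consistent => solution
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if all(c(assignment) for c in constraints):   # consistent so far?
            result = backtrack(assignment, variables, domains, constraints)
            if result:
                return result
        del assignment[var]              # inconsistent: undo and try the next value

    return None

# 3-colour a toy map: WA, NT, SA, with borders WA-NT, WA-SA, NT-SA
variables = ['WA', 'NT', 'SA']
domains = {v: ['red', 'green', 'blue'] for v in variables}

def differ(a, b):
    # binary constraint: a and b must take different values (trivially true
    # while either variable is still unassigned)
    return lambda asg: a not in asg or b not in asg or asg[a] != asg[b]

constraints = [differ('WA', 'NT'), differ('WA', 'SA'), differ('NT', 'SA')]
print(backtrack({}, variables, domains, constraints))
# {'WA': 'red', 'NT': 'green', 'SA': 'blue'}
```

As soon as a partial assignment violates a constraint, the whole subtree of refinements below it is pruned, which is the source of the speed-up described above.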
Knowledge and Reasoning

Building a Knowledge Base

1. Propositional Logic:

Propositional logic (PL) is the simplest form of logic where all the statements are made by
propositions. A proposition is a declarative statement which is either true or false. It is a
technique of knowledge representation in logical and mathematical form.

Example:

a) It is Sunday.
b) The Sun rises from West (False proposition)
c) 3+3= 7(False proposition)
d) 5 is a prime number.

Following are some basic facts about propositional logic:

o Propositional logic is also called Boolean logic as it works on 0 and 1.


o In propositional logic, we use symbolic variables to represent the logic, and we can use
any symbol for representing a proposition, such as A, B, C, P, Q, R, etc.
o Propositions can be either true or false, but it cannot be both.
o Propositional logic consists of an object, relations or function, and logical connectives.
o These connectives are also called logical operators.
o The propositions and connectives are the basic elements of the propositional logic.
o Connectives can be said as a logical operator which connects two sentences.
o A proposition formula which is always true is called tautology, and it is also called a
valid sentence.
o Statements which are questions, commands, or opinions are not propositions such as
"Where is Rohini", "How are you", "What is your name", are not propositions.
Syntax of propositional logic:

The syntax of propositional logic defines the allowable sentences for the knowledge
representation. There are two types of Propositions:

a) Atomic Propositions
b) Compound propositions

o Atomic Proposition: Atomic propositions are the simple propositions. It consists of a


single proposition symbol. These are the sentences which must be either true or false.

Example:

a) 2+2 is 4, it is an atomic proposition as it is a true fact.


b) "The Sun is cold" is also an atomic proposition, as it is a false fact.

o Compound proposition: Compound propositions are constructed by combining simpler or


atomic propositions, using parenthesis and logical connectives.

Example:

a) "It is raining today, and street is wet."


b) "Ankit is a doctor, and his clinic is in Mumbai."

Following is the summarized table for Propositional Logic Connectives:

Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible scenarios. We
can combine all the possible combination with logical connectives, and the representation of these
combinations in a tabular format is called Truth table. Following are the truth table for all logical
connectives:
Truth table with three propositions:

We can build a proposition composing three propositions P, Q, and R. This truth table is made
up of 8 (= 2^3) rows, as we have taken three proposition symbols.
Precedence of connectives:

Just like arithmetic operators, there is a precedence order for propositional connectors or logical
operators. This order should be followed while evaluating a propositional problem. Following is
the list of the precedence order for operators:

Precedence Operators

First Precedence Parenthesis

Second Precedence Negation

Third Precedence Conjunction(AND)

Fourth Precedence Disjunction(OR)

Fifth Precedence Implication

Six Precedence Biconditional

Logical equivalence:

Logical equivalence is one of the features of propositional logic. Two propositions are said to be
logically equivalent if and only if the columns in the truth table are identical to each other.

Let's take two propositions A and B; for logical equivalence we write A ⇔ B. In the
truth table below we can see that the columns for ¬A ∨ B and A → B are identical, hence
¬A ∨ B is equivalent to A → B.
Properties of Operators:

o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ True= True.
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o DE Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
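The equivalence test (identical truth-table columns) and two of the properties above can be checked mechanically by enumerating all truth assignments; the helper below is a small illustrative sketch:

```python
from itertools import product

def equivalent(f, g, nvars):
    """Two formulas are logically equivalent iff their truth-table columns match."""
    return all(f(*v) == g(*v) for v in product([True, False], repeat=nvars))

# De Morgan's law: ¬(P ∧ Q) ≡ ¬P ∨ ¬Q
print(equivalent(lambda p, q: not (p and q),
                 lambda p, q: (not p) or (not q), 2))          # True

# Distributivity: P ∧ (Q ∨ R) ≡ (P ∧ Q) ∨ (P ∧ R)
print(equivalent(lambda p, q, r: p and (q or r),
                 lambda p, q, r: (p and q) or (p and r), 3))   # True

# A non-equivalence for contrast: P ∧ Q vs. P ∨ Q differ on some row
print(equivalent(lambda p, q: p and q,
                 lambda p, q: p or q, 2))                      # False
```

Because a formula over n proposition symbols has only 2^n truth assignments, exhaustive checking like this always terminates, which is one practical payoff of propositional logic's limited expressive power.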

Limitations of Propositional logic:

o We cannot represent relations like ALL, some, or none with propositional logic.
Example:
a. All the girls are intelligent.
b. Some apples are sweet.
o Propositional logic has limited expressive power.
o In propositional logic, we cannot describe statements in terms of their properties or
logical relationships.
First-Order logic:
o First-order logic is another way of knowledge representation in artificial intelligence. It is
an extension to propositional logic.
o FOL is sufficiently expressive to represent the natural language statements in a concise
way.
o First-order logic is also known as Predicate logic or First-order predicate logic. First-
order logic is a powerful language that expresses information about objects in a natural
way and can also express the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains
facts like propositional logic but also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus,
......
o Relations: It can be a unary relation such as: red, round, is adjacent; or an n-ary
relation such as: the sister of, brother of, has color, comes between
o Function: Father of, best friend, third inning of, end of, ......
o As a natural language, first-order logic also has two main parts:
a. Syntax
b. Semantics

Syntax of First-Order logic:

The syntax of FOL determines which collections of symbols constitute legal logical expressions in
first-order logic. The basic syntactic elements of first-order logic are symbols. We write
statements in shorthand notation in FOL.

Basic Elements of First-order logic:

Following are the basic elements of FOL syntax:

Constants      1, 2, A, John, Mumbai, cat, ...

Variables      x, y, z, a, b, ...

Predicates     Brother, Father, >, ...

Functions      sqrt, LeftLegOf, ...

Connectives    ∧, ∨, ¬, ⇒, ⇔

Equality       ==

Quantifiers    ∀, ∃

Atomic sentences:

o Atomic sentences are the most basic sentences of first-order logic. These sentences are
formed from a predicate symbol followed by a parenthesis with a sequence of terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).

Example: Ravi and Ajay are brothers: => Brothers(Ravi, Ajay).


Chinky is a cat: => cat (Chinky).

Complex Sentences:

o Complex sentences are made by combining atomic sentences using connectives.

First-order logic statements can be divided into two parts:

o Subject: Subject is the main part of the statement.


o Predicate: A predicate can be defined as a relation, which binds two atoms together in a
statement.

Consider the statement "x is an integer." It consists of two parts: the first part, x, is the subject
of the statement, and the second part, "is an integer," is known as the predicate.

Quantifiers in First-order logic:


o A quantifier is a language element which generates quantification, and quantification
specifies the quantity of specimens in the universe of discourse.
o These are the symbols that permit us to determine or identify the range and scope of a
variable in a logical expression. There are two types of quantifier:
a. Universal Quantifier, (for all, everyone, everything)
b. Existential quantifier, (for some, at least one).

Universal Quantifier:

Universal quantifier is a symbol of logical representation, which specifies that the statement
within its range is true for everything or every instance of a particular thing.

The Universal quantifier is represented by a symbol ∀, which resembles an inverted A.

If x is a variable, then ∀x is read as:

o For all x
o For each x
o For every x.

Example:

All men drink coffee.

Let x be a variable that refers to a man; then the statement can be represented in the universe of
discourse (UOD) as below:

∀x man(x) → drink (x, coffee).

It will be read as: For all x, if x is a man, then x drinks coffee.
Existential Quantifier:

Existential quantifiers are the type of quantifiers, which express that the statement within its
scope is true for at least one instance of something.

It is denoted by the logical operator ∃, which resembles a reversed E. When it is used with a
predicate variable, it is called an existential quantifier.

If x is a variable, then existential quantifier will be ∃x or ∃(x). And it will be read as:

o There exists a 'x.'


o For some 'x.'
o For at least one 'x.'

Example:

Some boys are intelligent.

∃x: boys(x) ∧ intelligent(x)

It will be read as: There exists an x such that x is a boy and x is intelligent.

Points to remember:
o The main connective for universal quantifier ∀ is implication →.
o The main connective for existential quantifier ∃ is and ∧.
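On a finite domain these quantifiers reduce to Python's `all()` and `any()`, which also makes the "points to remember" concrete. A sketch with an invented three-person universe of discourse:

```python
# Tiny universe of discourse: hypothetical people with attributes.
domain = [
    {"name": "Ravi", "man": True,  "drinks_coffee": True},
    {"name": "Sita", "man": False, "drinks_coffee": True},
    {"name": "Ajay", "man": True,  "drinks_coffee": True},
]

implies = lambda a, b: (not a) or b

# "All men drink coffee": forall x, man(x) -> drinks_coffee(x)
all_men_drink = all(implies(x["man"], x["drinks_coffee"]) for x in domain)

# Using AND with forall would wrongly require EVERYONE to be a man:
wrong_forall = all(x["man"] and x["drinks_coffee"] for x in domain)

# Existential pattern: exists x, man(x) AND drinks_coffee(x)
some_man_drinks = any(x["man"] and x["drinks_coffee"] for x in domain)

print(all_men_drink, wrong_forall, some_man_drinks)  # True False True
```

The second line shows why ∀ pairs with → rather than ∧: with ∧ the statement fails as soon as the domain contains anyone who is not a man, even though "all men drink coffee" should still hold.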

Properties of Quantifiers:
o In universal quantifier, ∀x∀y is similar to ∀y∀x.
o In Existential quantifier, ∃x∃y is similar to ∃y∃x.
o ∃x∀y is not similar to ∀y∃x.

Some Examples of FOL using quantifier:

1. All birds fly. In this question the predicate is "fly(bird)."


And since there are all birds who fly so it will be represented as follows.
∀x bird(x) →fly(x).

2. Every man respects his parent. In this question, the predicate is "respect(x, y)," where
x=man, and y= parent. Since there is every man so will use ∀, and it will be represented as
follows:
∀x man(x) → respects (x, parent).

3. Some boys play cricket. In this question, the predicate is "play(x, y)," where x = boys and y =
game. Since there are some boys, we will use ∃, and since the main connective for ∃ is ∧, it will
be represented as:
∃x boys(x) ∧ play(x, cricket).

4. Not all students like both Mathematics and Science. In this question, the predicate is
"like(x, y)," where x = student and y = subject. Since not all students do, we will use ∀ with
negation, giving the following representation:
¬∀x [student(x) → (like(x, Mathematics) ∧ like(x, Science))].

5. Only one student failed in Mathematics. In this question, the predicate is "failed(x, y),"
where x = student and y = subject. Since there is exactly one student who failed in Mathematics,
we will use the following representation:
∃x [student(x) ∧ failed(x, Mathematics) ∧ ∀y ((student(y) ∧ failed(y, Mathematics)) → y == x)].

Free and Bound Variables:

Quantifiers interact with the variables that appear in a formula. There are two types of
variables in first-order logic, which are given below:

Free Variable: A variable is said to be a free variable in a formula if it occurs outside the scope
of the quantifier.
Example: ∀x ∃(y)[P (x, y, z)], where z is a free variable.

Bound Variable: A variable is said to be a bound variable in a formula if it occurs within the
scope of the quantifier.

Example: ∀x ∃y [A(x) ∧ B(y)], here x and y are the bound variables.

Situation Calculus:

The idea behind situation calculus is that (reachable) states are definable in terms of the actions
required to reach them. These reachable states are called situations. What is true in a situation
can be defined in terms of relations with the situation as an argument. Situation calculus can be
seen as a relational version of the feature-based representation of actions.
Here we only consider single agents, a fully observable environment, and deterministic actions.
Situation calculus is defined in terms of situations. A situation is either
 init, the initial situation, or
 do(A,S), the situation resulting from doing action A in situation S, if it is possible to do
action A in situation S.

Example 14.1: Consider the domain of Figure 3.1. Suppose in the initial situation, init, the robot,
Rob, is at location o109 and there is a key k1 at the mail room and a package at storage.
do(move(rob,o109,o103), init)
is the situation resulting from Rob moving from position o109 in situation init to position o103.
In this situation, Rob is at o103, the key k1 is still at mail, and the package is at storage.
The situation

do(move(rob,o103,mail),

do(move(rob,o109,o103),

init))
is one in which the robot has moved from position o109 to o103 to mail and is currently at mail.
Suppose Rob then picks up the key, k1. The resulting situation is

do(pickup(rob,k1),
do(move(rob,o103,mail),

do(move(rob,o109,o103),

init))).
In this situation, Rob is at position mail carrying the key k1.

A situation can be associated with a state. There are two main differences between situations and
states:
 Multiple situations may refer to the same state if multiple sequences of actions lead to the
same state. That is, equality between situations is not the same as equality between states.
 Not all states have corresponding situations. A state is reachable if a sequence of actions
exists that can reach that state from the initial state. States that are not reachable do not
have a corresponding situation.
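Situations such as do(move(rob,o109,o103), init) are just nested terms, and can be modeled directly. A minimal Python sketch using tuples, following the running robot example:

```python
# A situation is either the atom "init" or do(action, situation).
INIT = "init"

def do(action, situation):
    return ("do", action, situation)

s1 = do(("move", "rob", "o109", "o103"), INIT)
s2 = do(("move", "rob", "o103", "mail"), s1)
s3 = do(("pickup", "rob", "k1"), s2)

def action_sequence(situation):
    """Unwind a situation term into the list of actions that produced it."""
    actions = []
    while situation != INIT:
        _, action, situation = situation
        actions.append(action)
    return list(reversed(actions))

print(action_sequence(s3))
# [('move', 'rob', 'o109', 'o103'), ('move', 'rob', 'o103', 'mail'), ('pickup', 'rob', 'k1')]
```

Note that two distinct situation terms can denote the same state (e.g. moving away and back), which is exactly the situations-versus-states distinction made above.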

Some do(A,S) terms do not correspond to any state. However, sometimes an agent must reason
about such a (potential) situation without knowing if A is possible in state S, or if S is possible.
Example 14.2: The term do(unlock(rob,door1),init) does not denote a state at all, because it is
not possible for Rob to unlock the door when Rob is not at the door and does not have the key.

A static relation is a relation for which the truth value does not depend on the situation; that is,
its truth value is unchanging through time. A dynamic relation is a relation for which the truth
value depends on the situation. To represent what is true in a situation, predicate symbols
denoting dynamic relations have a situation argument so that the truth can depend on the
situation. A predicate symbol with a situation argument is called a fluent.
Example 14.3: The relation at(O,L,S) is true when object O is at location L in situation S.
Thus, at is a fluent.
The atom
at(rob,o109,init)

is true if the robot rob is at position o109 in the initial situation. The atom
at(rob,o103,do(move(rob,o109,o103), init))

is true if robot rob is at position o103 in the situation resulting from rob moving from
position o109 to position o103 from the initial situation. The atom
at(k1,mail,do(move(rob,o109,o103), init))

is true if k1 is at position mail in the situation resulting from rob moving from position o109 to
position o103 from the initial situation.

A dynamic relation is axiomatized by specifying the situations in which it is true. Typically, this
is done inductively in terms of the structure of situations.
 Axioms with init as the situation parameter are used to specify what is true in the initial
situation.
 A primitive relation is defined by specifying when it is true in situations of the
form do(A,S) in terms of what is true in situation S. That is, primitive relations are defined
in terms of what is true at the previous situation.
 A derived relation is defined using clauses with a variable in the situation argument. The
truth of a derived relation in a situation depends on what else is true in the same situation.
 Static relations are defined without reference to the situation.

Example 14.4: Suppose the delivery robot, Rob, is in the domain depicted in Figure 3.1. Rob is
at location o109, the parcel is in the storage room, and the key is in the mail room. The following
axioms describe this initial situation:
at(rob,o109,init).
at(parcel,storage,init).
at(k1,mail,init).

The adjacent relation is a dynamic, derived relation defined as follows:


adjacent(o109,o103,S).
adjacent(o103,o109,S).
adjacent(o109,storage,S).
adjacent(storage,o109,S).
adjacent(o109,o111,S).
adjacent(o111,o109,S).
adjacent(o103,mail,S).
adjacent(mail,o103,S).
adjacent(lab2,o109,S).
adjacent(P1,P2,S)←
between(Door,P1,P2)∧
unlocked(Door,S).

Notice the free S variable; these clauses are true for all situations. We cannot omit the S because
which rooms are adjacent depends on whether a door is unlocked. This can change from situation
to situation.
The between relation is static and does not require a situation variable:
between(door1,o103,lab2).

We also distinguish whether or not an agent is being carried. If an object is not being carried, we
say that the object is sitting at its location. We distinguish this case because an object being
carried moves with the object carrying it. An object is at a location if it is sitting at that location
or is being carried by an object at that location. Thus, at is a derived relation:
at(Ob,P,S)←
sitting_at(Ob,P,S).
at(Ob,P,S)←
carrying(Ob1,Ob,S)∧
at(Ob1,P,S).

Note that this definition allows for Rob to be carrying a bag, which, in turn, is carrying a book.

The precondition of an action specifies when it is possible to carry out the action. The
relation poss(A,S) is true when action A is possible in situation S. This is typically a derived
relation.
Example 14.5: An agent can always put down an object it is carrying:
poss(putdown(Ag,Obj),S) ←
carrying(Ag,Obj,S).

For the move action, an autonomous agent can move from its current position to an adjacent
position:
poss(move(Ag,P1,P2),S) ←
autonomous(Ag) ∧
adjacent(P1,P2,S)∧
sitting_at(Ag,P1,S) .

The precondition for the unlock action is more complicated. The agent must be at the correct side
of the door and carrying the appropriate key:
poss(unlock(Ag,Door),S)←
autonomous(Ag)∧
between(Door,P1,P2)∧
at(Ag,P1,S)∧
opens(Key,Door)∧
carrying(Ag,Key,S).

We do not assume that the between relation is symmetric. Some doors can only open one way.

We define what is true in each situation recursively in terms of the previous situation and of what
action occurred between the situations. As in the feature-based representation of actions, causal
rules specify when a relation becomes true and frame rules specify when a relation remains
true.
Example 14.6: The primitive unlocked relation can be defined by specifying how different
actions can affect its being true. The door is unlocked in the situation resulting from an unlock
action, as long as the unlock action was possible. This is represented using the following causal
rule:
unlocked(Door,do(unlock(Ag,Door),S)) ←
poss(unlock(Ag,Door),S).
Suppose the only action to make the door locked is to lock the door. Thus, unlocked is true in a
situation following an action if it was true before, if the action was not to lock the door, and if the
action was possible:
unlocked(Door,do(A,S))←
unlocked(Door,S)∧
A≠lock(Door) ∧
poss(A,S).

This is a frame rule.
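The causal rule and frame rule for unlocked can be evaluated recursively over the structure of a situation. A minimal Python sketch (it assumes every action is possible, sidestepping the poss check, and uses simplified action terms without the agent argument):

```python
INIT = "init"

def do(action, situation):
    return ("do", action, situation)

def unlocked(door, situation, initially_unlocked=frozenset()):
    """True if `door` is unlocked in `situation` (possibility checks omitted)."""
    if situation == INIT:
        return door in initially_unlocked            # initial-situation axioms
    _, action, prev = situation
    if action == ("unlock", door):
        return True                                  # causal rule
    if action == ("lock", door):
        return False                                 # locking undoes it
    return unlocked(door, prev, initially_unlocked)  # frame rule: unchanged

after_unlock = do(("unlock", "door1"), INIT)
after_relock = do(("lock", "door1"), after_unlock)
print(unlocked("door1", after_unlock))  # True
print(unlocked("door1", after_relock))  # False
```

The final `return` line is the frame rule in miniature: any action other than locking or unlocking this door leaves the fluent's value as it was in the previous situation.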

Example 14.7: The carrying predicate can be defined as follows.


An agent is carrying an object after picking up the object:
carrying(Ag,Obj,do(pickup(Ag,Obj),S)) ←
poss(pickup(Ag,Obj),S).

The only action that undoes the carrying predicate is the putdown action. Thus, carrying is true
after an action if it was true before the action, and the action was not to put down the object. This
is represented in the frame rule:
carrying(Ag,Obj,do(A,S)) ←
carrying(Ag,Obj,S)∧
poss(A,S)∧
A ≠putdown(Ag,Obj).

Example 14.8: The atom sitting_at(Obj,Pos,S1) is true in a situation S1 resulting from
object Obj moving to Pos, as long as the action was possible:
sitting_at(Obj,Pos,do(move(Obj,Pos0,Pos),S)) ←
poss(move(Obj,Pos0,Pos),S).

The other action that makes sitting_at true is the putdown action. An object is sitting at the
location where the agent who put it down was located:
sitting_at(Obj,Pos,do(putdown(Ag,Obj),S)) ←
poss(putdown(Ag,Obj),S)∧
at(Ag,Pos,S).

The only other time that sitting_at is true in a (non-initial) situation is when it was true in the
previous situation and it was not undone by an action. The only actions that undo sitting_at is
a move action or a pickup action. This can be specified by the following frame axiom:
sitting_at(Obj,Pos,do(A,S) ) ←
poss(A,S) ∧
sitting_at(Obj,Pos,S) ∧
∀Pos1 A≠move(Obj,Pos,Pos1) ∧
∀Ag A≠pickup(Ag,Obj) .
Note that the quantification in the body is not the standard quantification for rules. This can be
represented using negation as failure:
sitting_at(Obj,Pos,do(A,S) ) ←
poss(A,S) ∧
sitting_at(Obj,Pos,S) ∧
∼move_action(A,Obj,Pos) ∧
∼pickup_action(A,Obj) .
move_action(move(Obj,Pos,Pos1),Obj,Pos).
pickup_action(pickup(Ag,Obj),Obj).

These clauses are designed not to have a free variable in the scope of the negation.

Example 14.9: Situation calculus can represent more complicated actions than can be
represented with simple addition and deletion of propositions in the state description.
Consider the drop_everything action in which an agent drops everything it is carrying. In
situation calculus, the following axiom can be added to the definition of sitting_at to say that
everything the agent was carrying is now on the ground:
sitting_at(Obj,Pos,do(drop_everything(Ag) ,S) ) ←
poss(drop_everything(Ag),S) ∧
at(Ag,Pos,S) ∧
carrying(Ag,Obj,S) .

A frame axiom for carrying specifies that an agent is not carrying an object after
a drop_everything action.
carrying(Ag,Obj,do(A ,S) ) ←
poss(A,S) ∧
carrying(Ag,Obj,S)∧
A ≠drop_everything(Ag)∧
A ≠putdown(Ag,Obj).

The drop_everything action thus affects an unbounded number of objects.

Situation calculus is used for planning by asking for a situation in which a goal is true. Answer
extraction is used to find a situation in which the goal is true. This situation can be interpreted as
a sequence of actions for the agent to perform.
Example 14.10: Suppose the goal is for the robot to have the key k1. The following query asks
for a situation where this is true:
? carrying(rob,k1,S).
This query has the following answer:
S=do(pickup(rob,k1),

do(move(rob,o103,mail),

do(move(rob,o109,o103),

init))).
The preceding answer can be interpreted as a way for Rob to get the key: it moves
from o109 to o103, then to mail, where it picks up the key.
The goal of delivering the parcel (which is, initially, in the lounge, lng) to o111 can be asked
with the query
? at(parcel,o111,S).

This query has the following answer:

S=do(move(rob, o109, o111),

do(move(rob, lng, o109),

do(pickup(rob, parcel),

do(move(rob, o109, lng), init)))).


Therefore, Rob should go to the lounge, pick up the parcel, go back to o109, and then go to o111.

Using the top-down proof procedure on the situation calculus definitions is very inefficient,
because a frame axiom is almost always applicable. A complete proof procedure, such as
iterative deepening, searches through all permutations of actions even if they are not relevant to
the goal.

Theorem Proving in First-Order Logic

Resolution

Resolution is a theorem-proving technique that proceeds by building refutation proofs, i.e.,
proofs by contradiction. It was invented by the mathematician John Alan Robinson in 1965.

Resolution is used when several statements are given and we need to prove a conclusion from
those statements. Unification is a key concept in proofs by resolution. Resolution is a single
inference rule which can efficiently operate on the conjunctive normal form or clausal form.
Clause: A disjunction of literals (atomic sentences) is called a clause. A clause containing a
single literal is known as a unit clause.

Conjunctive Normal Form: A sentence represented as a conjunction of clauses is said to be in
conjunctive normal form or CNF.

The resolution inference rule:

The resolution rule for first-order logic is simply a lifted version of the propositional rule.
Resolution can resolve two clauses if they contain complementary literals, which are assumed to
be standardized apart so that they share no variables.

If the two clauses are l1 ∨ ... ∨ lk and m1 ∨ ... ∨ mn, and some literal li unifies with the
negation of some literal mj with unifier θ, the rule produces the resolvent

SUBST(θ, l1 ∨ ... ∨ li−1 ∨ li+1 ∨ ... ∨ lk ∨ m1 ∨ ... ∨ mj−1 ∨ mj+1 ∨ ... ∨ mn)

where li and mj are complementary literals.

This rule is also called the binary resolution rule because it only resolves exactly two literals.

Example:

We can resolve the two clauses given below:

[Animal(g(x)) V Loves(f(x), x)] and [¬ Loves(a, b) V ¬ Kills(a, b)]

The two complementary literals are: Loves(f(x), x) and ¬ Loves(a, b)

These literals can be unified with the unifier θ = {a/f(x), b/x}, and resolution generates the
resolvent clause:

[Animal(g(x)) V ¬ Kills(f(x), x)].
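The unifier θ computed above can be produced by a standard unification routine. A minimal sketch (compound terms are tuples; by the convention of this sketch, variables are strings starting with "?"):

```python
def is_var(t):
    # Convention for this sketch: variables are strings beginning with "?".
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(t1, t2, s=None):
    """Return a most general unifier (dict) of two terms, or None on failure."""
    if s is None:
        s = {}
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if is_var(t1):
        return {**s, t1: t2}
    if is_var(t2):
        return {**s, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):      # unify argument lists element-wise
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None  # clash between distinct constants/functors

# Unify Loves(a, b) with Loves(f(x), x)  ->  {a/f(x), b/x}
theta = unify(("Loves", "?a", "?b"), ("Loves", ("f", "?x"), "?x"))
print(theta)  # {'?a': ('f', '?x'), '?b': '?x'}
```

This omits the occurs check that a production unifier would include, but it is enough to reproduce the substitution used in the example.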

Steps for Resolution:


1. Conversion of facts into first-order logic.
2. Convert FOL statements into CNF
3. Negate the statement which needs to prove (proof by contradiction)
4. Draw resolution graph (unification).
To better understand all the above steps, we will take an example in which we will apply
resolution.

Example:

a. John likes all kinds of food.
b. Apples and vegetables are food.
c. Anything anyone eats and is not killed by is food.
d. Anil eats peanuts and is still alive.
e. Harry eats everything that Anil eats.
Prove by resolution that:
f. John likes peanuts.

Step-1: Conversion of Facts into FOL

In the first step we will convert all the given statements into first-order logic.

Step-2: Conversion of FOL into CNF

In first-order logic resolution, it is required to convert the FOL statements into CNF, as the
CNF form makes resolution proofs easier.

o Eliminate all implications (→) and rewrite

a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y ¬ [eats(x, y) Λ ¬ killed(x)] V food(y)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀x ¬ eats(Anil, x) V eats(Harry, x)
f. ∀x ¬(¬ killed(x)) V alive(x)
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
o Move negation (¬) inwards and rewrite
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y ¬ eats(x, y) V killed(x) V food(y)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀x ¬ eats(Anil, x) V eats(Harry, x)
f. ∀x killed(x) V alive(x)
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
o Rename variables or standardize variables
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀y ∀z ¬ eats(y, z) V killed(y) V food(z)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀w ¬ eats(Anil, w) V eats(Harry, w)
f. ∀g killed(g) V alive(g)
g. ∀k ¬ alive(k) V ¬ killed(k)
h. likes(John, Peanuts).
o Eliminate existential quantifiers. In this step, we would eliminate the existential
quantifier ∃; this process is known as Skolemization. In this example problem there is no
existential quantifier, so all the statements remain the same in this step.
o Drop universal quantifiers. In this step we will drop all universal quantifiers, since all
the statements are implicitly universally quantified, so we do not need to write them.
a. ¬ food(x) V likes(John, x)
b. food(Apple)
c. food(vegetables)
d. ¬ eats(y, z) V killed(y) V food(z)
e. eats (Anil, Peanuts)
f. alive(Anil)
g. ¬ eats(Anil, w) V eats(Harry, w)
h. killed(g) V alive(g)
i. ¬ alive(k) V ¬ killed(k)
j. likes(John, Peanuts).
o Distribute conjunction ∧ over disjunction ∨. This step will not make any change in this
problem.

Step-3: Negate the statement to be proved

In this statement, we will apply negation to the conclusion statements, which will be written as
¬likes(John, Peanuts)

Step-4: Draw Resolution graph:

Now in this step, we will solve the problem by building a resolution tree using substitution; the
steps for the above problem are explained below.

Since the empty clause is derived, the negation of the conclusion contradicts the given set of
statements, which proves the original conclusion.
Explanation of Resolution graph:
o In the first step of resolution graph, ¬likes(John, Peanuts) , and likes(John, x) get
resolved(canceled) by substitution of {Peanuts/x}, and we are left with ¬ food(Peanuts)
o In the second step of the resolution graph, ¬ food(Peanuts) , and food(z) get resolved
(canceled) by substitution of { Peanuts/z}, and we are left with ¬ eats(y, Peanuts) V
killed(y) .
o In the third step of the resolution graph, ¬ eats(y, Peanuts) and eats (Anil, Peanuts) get
resolved by substitution {Anil/y}, and we are left with Killed(Anil) .
o In the fourth step of the resolution graph, Killed(Anil) and ¬ killed(k) get resolved by
substitution {Anil/k}, and we are left with ¬ alive(Anil) .
o In the last step of the resolution graph ¬ alive(Anil) and alive(Anil) get resolved.
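The chain of resolutions just described can be replayed concretely. A minimal sketch over ground clauses, i.e. after the substitutions {Peanuts/x}, {Peanuts/z}, {Anil/y}, {Anil/k} from the graph have already been applied:

```python
def resolve(c1, c2):
    """Resolve two clauses (frozensets of literal strings; negation is a
    leading '~'). Returns one resolvent per complementary pair found."""
    out = []
    for lit in c1:
        comp = lit[1:] if lit.startswith("~") else "~" + lit
        if comp in c2:
            out.append((c1 - {lit}) | (c2 - {comp}))
    return out

# Ground instances of the example clauses, in the order used by the graph:
steps = [
    frozenset({"~likes(John,Peanuts)"}),                              # negated goal
    frozenset({"~food(Peanuts)", "likes(John,Peanuts)"}),
    frozenset({"~eats(Anil,Peanuts)", "killed(Anil)", "food(Peanuts)"}),
    frozenset({"eats(Anil,Peanuts)"}),
    frozenset({"~alive(Anil)", "~killed(Anil)"}),
    frozenset({"alive(Anil)"}),
]

clause = steps[0]
for nxt in steps[1:]:
    clause = resolve(clause, nxt)[0]
print(clause)  # frozenset() -- the empty clause: contradiction derived
```

Deriving the empty clause is exactly the contradiction the refutation proof is looking for.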

Planning is the task of coming up with a sequence of actions that will achieve a goal. A
search-based problem-solving agent or a logical planning agent can be used, but complex,
large-scale problems call for dedicated planning techniques. For this discussion, we consider
classical planning environments that are fully observable, deterministic, finite, static, and
discrete (in time, actions, objects, and effects).

Partial Order Planning

The forward and regression planners enforce a total ordering on actions at all stages of the
planning process. The CSP planner commits to the particular time that the action will be carried
out. This means that those planners have to commit to an ordering of actions that cannot occur
concurrently when adding them to a partial plan, even if there is no particular reason to put one
action before another.

The idea of a partial-order planner is to have a partial ordering between actions and only
commit to an ordering between actions when forced. This is sometimes also called a non-linear
planner, which is a misnomer because such planners often produce a linear plan.

A partial ordering is a less-than relation that is transitive and asymmetric. A partial-order


plan is a set of actions together with a partial ordering, representing a "before" relation on
actions, such that any total ordering of the actions, consistent with the partial ordering, will solve
the goal from the initial state. Write act0 < act1 if action act0 is before action act1 in the partial
order. This means that action act0 must occur before action act1.

An action, other than start or finish, will be in a partial-order plan to achieve a precondition of an
action in the plan. Each precondition of an action in the plan is either true in the initial state, and
so achieved by start, or there will be an action in the plan that achieves it.
We must ensure that the actions achieve the conditions they were assigned to achieve. Each
precondition P of an action act1 in a plan will have an action act0 associated with it such
that act0 achieves precondition P for act1. The triple ⟨act0,P,act1⟩ is a causal link. The partial
order specifies that action act0 occurs before action act1, which is written as act0 < act1. Any
other action A that makes P false must either be before act0 or after act1.

Informally, a partial-order planner works as follows: Begin with the actions start and finish and
the partial order start < finish. The planner maintains an agenda that is a set of ⟨P,A⟩ pairs,
where A is an action in the plan and P is an atom that is a precondition of A that must be
achieved. Initially the agenda contains pairs ⟨G,finish⟩, where G is an atom that must be true in
the goal state.

At each stage in the planning process, a pair ⟨G,act1⟩ is selected from the agenda, where G is a
precondition for action act1. Then an action, act0, is chosen to achieve G. That action is either
already in the plan - it could be the start action, for example - or it is a new action that is added
to the plan. Action act0 must happen before act1 in the partial order. A causal link is added that
records that act0 achieves G for action act1. Any action in the plan that deletes G must happen
either before act0 or after act1. If act0 is a new action, its preconditions are added to the agenda,
and the process continues until the agenda is empty.
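The data structures behind this loop are simple: a set of actions, a set of "before" constraints, and causal links. A minimal sketch with a toy two-step plan (the action and precondition names are invented for illustration) showing how a partial order is checked against its linearizations:

```python
from itertools import permutations

# A partial-order plan: actions, "before" constraints, and causal links.
actions = ["start", "pickup_key", "unlock_door", "finish"]
before = {("start", "pickup_key"),
          ("pickup_key", "unlock_door"),
          ("unlock_door", "finish"),
          ("start", "finish")}
causal_links = [("pickup_key", "has_key", "unlock_door")]  # act0 achieves P for act1

def consistent(order, before):
    """A total order is consistent if it respects every 'before' constraint."""
    pos = {a: i for i, a in enumerate(order)}
    return all(pos[a] < pos[b] for a, b in before)

# Every consistent linearization is a valid total-order plan.
linearizations = [p for p in permutations(actions) if consistent(p, before)]
print(linearizations)
# [('start', 'pickup_key', 'unlock_door', 'finish')]
```

Here the constraints happen to force a single linearization; with genuinely unordered actions (e.g. two independent pickups) several linearizations would be returned, which is exactly the commitment the partial-order planner avoids making.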

This is a non-deterministic procedure. The "choose" and the "either ... or ..." form choices that
must be searched over. There are two choices that require search:
 which action is selected to achieve G and
 whether an action that deletes G happens before act0 or after act1.

Uncertain Knowledge and Reasoning

Uncertainty:

Till now, we have learned knowledge representation using first-order logic and propositional
logic with certainty, which means we were sure about the predicates. With this knowledge
representation, we might write A→B, which means if A is true then B is true, but consider a
situation where we are not sure about whether A is true or not then we cannot express this
statement, this situation is called uncertainty.

So to represent uncertain knowledge, where we are not sure about the predicates, we need
uncertain reasoning or probabilistic reasoning.
Following are some leading causes of uncertainty in the real world:

1. Information occurred from unreliable sources.


2. Experimental Errors
3. Equipment fault
4. Temperature variation
5. Climate change.

Probabilistic reasoning:

Probabilistic reasoning is a way of knowledge representation where we apply the concept of


probability to indicate the uncertainty in knowledge. In probabilistic reasoning, we combine
probability theory with logic to handle the uncertainty.

We use probability in probabilistic reasoning because it provides a way to handle the uncertainty
that is the result of someone's laziness and ignorance.

In the real world, there are lots of scenarios, where the certainty of something is not confirmed,
such as "It will rain today," "behavior of someone for some situations," "A match between two
teams or two players." These are probable sentences for which we can assume that it will happen
but not sure about it, so here we use probabilistic reasoning.

Need of probabilistic reasoning in AI:

o When there are unpredictable outcomes.


o When specifications or possibilities of predicates become too large to handle.
o When an unknown error occurs during an experiment.

In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge:

o Bayes' rule
o Bayesian statistics

As probabilistic reasoning uses probability and related terms, let's understand some common
terms before going further:

Probability: Probability can be defined as the chance that an uncertain event will occur. It is
the numerical measure of the likelihood that an event will occur. The value of a probability
always lies between 0 and 1.

0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.

P(A) = 0 indicates that event A is impossible.
P(A) = 1 indicates that event A is certain.

We can find the probability of an uncertain event by using the below formula:

P(A) = Number of favourable outcomes / Total number of outcomes

o P(¬A) = probability of event A not happening.
o P(¬A) + P(A) = 1.

Event: Each possible outcome of a variable is called an event.

Sample space: The collection of all possible events is called sample space.

Random variables: Random variables are used to represent the events and objects in the real
world.

Prior probability: The prior probability of an event is probability computed before observing
new information.

Posterior Probability: The probability that is calculated after all evidence or information has
been taken into account. It is a combination of the prior probability and new information.

Conditional probability:

Conditional probability is the probability of an event occurring given that another event has
already happened.

Let's suppose we want to calculate the probability of event A when event B has already occurred,
"the probability of A under the condition B". It can be written as:

P(A|B) = P(A⋀B) / P(B)

Where P(A⋀B) = joint probability of A and B

P(B) = marginal probability of B.

If the probability of A is given and we need to find the probability of B, it is given as:

P(B|A) = P(A⋀B) / P(A)

It can be explained using a Venn diagram: if B has occurred, the sample space is reduced to set
B, and we can calculate event A given B by dividing the probability P(A⋀B) by P(B).

Example:

In a class, 70% of the students like English and 40% of the students like both English and
Mathematics. What percent of the students who like English also like Mathematics?

Solution:
Let A be the event that a student likes Mathematics, and B the event that a student likes English.

P(A|B) = P(A⋀B) / P(B) = 0.40 / 0.70 ≈ 0.57

Hence, 57% of the students who like English also like Mathematics.
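The arithmetic of this worked example is a single division of the joint probability by the marginal. As a sketch:

```python
p_english = 0.70  # P(B): student likes English
p_both = 0.40     # P(A AND B): student likes both English and Mathematics

# Conditional probability: P(A|B) = P(A AND B) / P(B)
p_math_given_english = p_both / p_english
print(round(p_math_given_english, 2))  # 0.57
```

The same two-line pattern applies to any conditional probability question once the joint and marginal probabilities are known.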

Bayesian Networks
A Bayesian belief network is a key technology for dealing with probabilistic events and for
solving problems that involve uncertainty. We can define a Bayesian network as:

"A Bayesian network is a probabilistic graphical model which represents a set of variables and
their conditional dependencies using a directed acyclic graph."
It is also called a Bayes network, belief network, decision network, or Bayesian model.

Bayesian networks are probabilistic, because these networks are built from a probability
distribution, and also use probability theory for prediction and anomaly detection.

Real world applications are probabilistic in nature, and to represent the relationship between
multiple events, we need a Bayesian network. It can also be used in various tasks
including prediction, anomaly detection, diagnostics, automated insight, reasoning, time
series prediction, and decision making under uncertainty.

A Bayesian network can be used for building models from data and expert opinions, and it
consists of two parts:

o Directed Acyclic Graph


o Table of conditional probabilities.

The generalized form of a Bayesian network that represents and solves decision problems under
uncertain knowledge is known as an influence diagram.

A Bayesian network graph is made up of nodes and Arcs (directed links), where:

o Each node corresponds to a random variable, which can


be continuous or discrete.
o Arcs, or directed arrows, represent causal relationships or conditional probabilities
between random variables. These directed links connect pairs of nodes in the
graph.
A link indicates that one node directly influences the other; if there is no
directed link between two nodes, they are independent of each other.
o In the above diagram, A, B, C, and D are random variables represented by
the nodes of the network graph.
o If we are considering node B, which is connected with node A by a directed
arrow, then node A is called the parent of Node B.
o Node C is independent of node A.

The Bayesian network has mainly two components:

o Causal Component
o Actual numbers

Each node in the Bayesian network has a conditional probability distribution P(Xi | Parent(Xi)),
which determines the effect of the parents on that node.

A Bayesian network is based on the joint probability distribution and conditional probability, so
let's first understand the joint probability distribution:

Joint probability distribution:

If we have variables x1, x2, x3,....., xn, then the probabilities of the different combinations of x1,
x2, x3,.. xn are known as the joint probability distribution.

P[x1, x2, x3,....., xn] can be written as follows in terms of conditional probabilities (the chain
rule):

= P[x1| x2, x3,....., xn]P[x2, x3,....., xn]

= P[x1| x2, x3,....., xn]P[x2|x3,....., xn]....P[xn-1|xn]P[xn].

In general for each variable Xi, we can write the equation as:

P(Xi|Xi-1,........., X1) = P(Xi |Parents(Xi ))
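As a quick numerical check of this chain-rule factorization, consider two binary variables with an assumed (purely illustrative) joint distribution; factoring into a conditional times a marginal reproduces every joint entry:

```python
# Minimal check of the chain rule P(x1, x2) = P(x1 | x2) * P(x2)
# on an illustrative joint distribution over two binary variables.
joint = {  # P(x1, x2); the numbers are assumed for illustration only
    (0, 0): 0.30, (0, 1): 0.20,
    (1, 0): 0.10, (1, 1): 0.40,
}

def marginal_x2(x2):
    # P(x2) obtained by summing the joint over x1
    return sum(p for (x1, b), p in joint.items() if b == x2)

def conditional_x1(x1, x2):
    # P(x1 | x2) = P(x1, x2) / P(x2)
    return joint[(x1, x2)] / marginal_x2(x2)

# For every assignment, the factored form reproduces the joint entry.
for (x1, x2), p in joint.items():
    assert abs(conditional_x1(x1, x2) * marginal_x2(x2) - p) < 1e-12
print("chain rule verified")
```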

Explanation of Bayesian network:

Let's understand the Bayesian network through an example by creating a directed acyclic graph:

Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm responds
reliably to a burglary but also responds to minor earthquakes. Harry has two neighbors, David
and Sophia, who have taken the responsibility of informing Harry at work when they hear the
alarm. David always calls Harry when he hears the alarm, but sometimes he confuses the phone
ringing with the alarm and calls then too. On the other hand, Sophia likes to listen to loud music,
so she sometimes misses the alarm. Here we would like to compute the probability of the burglar
alarm sounding.

Problem:

Calculate the probability that the alarm has sounded but neither a burglary nor an earthquake has
occurred, and both David and Sophia called Harry.

Solution:

o The Bayesian network for the above problem is given below. The network structure
shows that burglary and earthquake are the parent nodes of the alarm and directly
affect the probability of the alarm going off, whereas David's and Sophia's calls
depend only on the alarm.
o The network thus represents our assumptions that the neighbors do not directly perceive
the burglary, do not notice the minor earthquake, and do not confer before calling.
o The conditional distribution for each node is given as a conditional probabilities table, or
CPT.
o Each row in a CPT must sum to 1 because the entries in the row represent an
exhaustive set of cases for the variable.
o In a CPT, a Boolean variable with k Boolean parents has 2^k rows. Hence, if
there are two parents, the CPT contains four rows of probability values.

List of all events occurring in this network:

o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)

We can write the events of the problem statement in the form of a probability: P[D, S, A, B, E]. We
can rewrite this probability statement using the joint probability distribution:

P[D, S, A, B, E]= P[D | S, A, B, E]. P[S, A, B, E]

=P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]

= P [D| A]. P [ S| A, B, E]. P[ A, B, E]


= P[D | A]. P[ S | A]. P[A| B, E]. P[B, E]

= P[D | A ]. P[S | A]. P[A| B, E]. P[B |E]. P[E]

Let's take the observed probabilities for the Burglary and Earthquake components:
P(B= True) = 0.002, the probability of a burglary.
P(B= False) = 0.998, the probability of no burglary.
P(E= True) = 0.001, the probability of a minor earthquake.
P(E= False) = 0.999, the probability that no earthquake occurred.
We can provide the conditional probabilities as per the tables below:

Conditional probability table for Alarm A:


The conditional probability of Alarm A depends on Burglary and Earthquake:

B E P(A= True) P(A= False)

True True 0.94 0.06

True False 0.95 0.05

False True 0.31 0.69

False False 0.001 0.999


Conditional probability table for David Calls:

The conditional probability that David calls depends on the state of the Alarm.

A P(D= True) P(D= False)

True 0.91 0.09

False 0.05 0.95

Conditional probability table for Sophia Calls:

The conditional probability that Sophia calls depends on its parent node, the Alarm.

A P(S= True) P(S= False)

True 0.75 0.25

False 0.02 0.98

From the formula of joint distribution, we can write the problem statement in the form of
probability distribution:

P(S, D, A, ¬B, ¬E) = P (S|A) *P (D|A)*P (A|¬B ^ ¬E) *P (¬B) *P (¬E).

= 0.75* 0.91* 0.001* 0.998*0.999

= 0.00068045.

Hence, a Bayesian network can answer any query about the domain by using Joint
distribution.
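As a sketch, the tables above can be encoded as plain Python dictionaries and the joint-distribution formula applied directly. The node names and dictionary layout are just one possible choice, not part of the example itself:

```python
# The burglar-alarm network encoded as plain dictionaries.
# Probabilities are the ones given in the tables above.
p_b = {True: 0.002, False: 0.998}   # P(Burglary)
p_e = {True: 0.001, False: 0.999}   # P(Earthquake)
p_a = {                             # P(Alarm=True | Burglary, Earthquake)
    (True, True): 0.94, (True, False): 0.95,
    (False, True): 0.31, (False, False): 0.001,
}
p_d = {True: 0.91, False: 0.05}     # P(David calls=True | Alarm)
p_s = {True: 0.75, False: 0.02}     # P(Sophia calls=True | Alarm)

def joint(d, s, a, b, e):
    """P(D, S, A, B, E) = P(D|A) P(S|A) P(A|B,E) P(B) P(E)."""
    pa = p_a[(b, e)] if a else 1 - p_a[(b, e)]
    pd = p_d[a] if d else 1 - p_d[a]
    ps = p_s[a] if s else 1 - p_s[a]
    return pd * ps * pa * p_b[b] * p_e[e]

# P(S=True, D=True, A=True, B=False, E=False) from the worked example:
print(joint(d=True, s=True, a=True, b=False, e=False))  # ≈ 0.00068045
```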

The semantics of Bayesian Network:

There are two ways to understand the semantics of a Bayesian network, as given below:

1. To understand the network as a representation of the joint probability distribution.

This view is helpful for understanding how to construct the network.

2. To understand the network as an encoding of a collection of conditional independence
statements.

This view is helpful for designing inference procedures.
Natural Language Processing

Natural Language Processing (NLP) refers to the AI method of communicating with intelligent
systems using a natural language such as English.
Processing of natural language is required when you want an intelligent system like a robot to
perform as per your instructions, when you want to hear a decision from a dialogue-based clinical
expert system, etc.
The field of NLP involves making computers perform useful tasks with the natural languages
humans use. The input and output of an NLP system can be −

 Speech
 Written Text

Components of NLP:

There are two components of NLP as given −


1. Natural Language Understanding (NLU)
Understanding involves the following tasks −

 Mapping the given input in natural language into useful representations.


 Analysing different aspects of the language.
2. Natural Language Generation (NLG)
It is the process of producing meaningful phrases and sentences in the form of natural language from
some internal representation.
It involves −
 Text planning − It includes retrieving the relevant content from knowledge base.
 Sentence planning − It includes choosing required words, forming meaningful phrases,
setting tone of the sentence.
 Text Realization − It is mapping sentence plan into sentence structure.
NLU is harder than NLG.

Difficulties in NLU:

NL has an extremely rich form and structure.


It is very ambiguous. There can be different levels of ambiguity −
 Lexical ambiguity − It is at a very primitive level, such as the word level.
 For example, should the word “board” be treated as a noun or a verb?
 Syntax-level ambiguity − A sentence can be parsed in different ways.
 For example, “He lifted the beetle with red cap.” − Did he use the cap to lift the beetle, or did
he lift a beetle that had a red cap?
 Referential ambiguity − Referring to something using pronouns. For example: Rima went to
Gauri. She said, “I am tired.” − Exactly who is tired?
 One input can mean different meanings.
 Many inputs can mean the same thing.

NLP Terminology:

 Phonology − It is the study of organizing sounds systematically.


 Morphology − It is the study of the construction of words from primitive meaningful units.
 Morpheme − It is the primitive unit of meaning in a language.
 Syntax − It refers to arranging words to make a sentence. It also involves determining the
structural role of words in the sentence and in phrases.
 Semantics − It is concerned with the meaning of words and how to combine words into
meaningful phrases and sentences.
 Pragmatics − It deals with using and understanding sentences in different situations and how
the interpretation of the sentence is affected.
 Discourse − It deals with how the immediately preceding sentence can affect the interpretation
of the next sentence.
 World Knowledge − It includes the general knowledge about the world.

Steps in NLP:

There are general five steps −


 Lexical Analysis − It involves identifying and analyzing the structure of words. The lexicon of
a language is the collection of words and phrases in that language. Lexical analysis divides
the whole chunk of text into paragraphs, sentences, and words.
 Syntactic Analysis (Parsing) − It involves analyzing the words in the sentence for grammar and
arranging the words in a manner that shows the relationships among them. A sentence such
as “The school goes to boy” is rejected by an English syntactic analyzer.
 Semantic Analysis − It draws the exact meaning or the dictionary meaning from the text. The
text is checked for meaningfulness. This is done by mapping syntactic structures to objects in
the task domain. The semantic analyzer disregards sentences such as “hot ice-cream”.
 Discourse Integration − The meaning of any sentence depends upon the meaning of the
sentence just before it. In addition, it also influences the meaning of the immediately succeeding
sentence.
 Pragmatic Analysis − During this step, what was said is re-interpreted based on what it actually
meant. It involves deriving those aspects of language which require real-world knowledge.
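As a rough sketch of the first step, lexical analysis, the following uses Python's standard `re` module to split raw text into sentences and words. A real NLP system would use a proper tokenizer; the regular expressions here are an illustrative simplification:

```python
import re

# Illustrative lexical-analysis step: divide a chunk of text into
# sentences and then into words.
text = "The bird pecks the grains. The school goes to boy."

# Split after sentence-ending punctuation followed by whitespace.
sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

# Extract the words of each sentence.
words = [re.findall(r"[A-Za-z]+", s) for s in sentences]

print(sentences)  # two sentences
print(words[0])   # ['The', 'bird', 'pecks', 'the', 'grains']
```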

Implementation Aspects of Syntactic Analysis

There are a number of algorithms researchers have developed for syntactic analysis, but we consider
only the following simple methods −

 Context-Free Grammar
 Top-Down Parser
Let us see them in detail –

 Context-Free Grammar
It is the grammar that consists of rules with a single symbol on the left-hand side of the rewrite rules.
Let us create grammar to parse a sentence −
“The bird pecks the grains”
Articles (DET) − a | an | the
Nouns − bird | birds | grain | grains
Noun Phrase (NP) − Article + Noun | Article + Adjective + Noun
= DET N | DET ADJ N
Verbs − pecks | pecking | pecked
Verb Phrase (VP) − NP V | V NP
Adjectives (ADJ) − beautiful | small | chirping
The parse tree breaks down the sentence into structured parts so that the computer can easily
understand and process it. In order for the parsing algorithm to construct this parse tree, a set of rewrite
rules, which describe what tree structures are legal, need to be constructed.
These rules say that a certain symbol may be expanded in the tree by a sequence of other symbols.
According to the first rewrite rule, if there are two strings, a Noun Phrase (NP) and a Verb Phrase
(VP), then the string formed by NP followed by VP is a sentence. The rewrite rules for the sentence
are as follows −
S → NP VP
NP → DET N | DET ADJ N
VP → V NP
Lexicon −
DET → a | the
ADJ → beautiful | perching
N → bird | birds | grain | grains
V → peck | pecks | pecking
The parse tree can be created as shown −
Now consider the above rewrite rules. Since V can be replaced by either "peck" or "pecks", sentences
such as "The bird peck the grains" are wrongly permitted; i.e., the subject-verb agreement error is
accepted as correct.
Merit − It is the simplest style of grammar and is therefore widely used.
Demerits −
 They are not highly precise. For example, “The grains peck the bird” is syntactically correct
according to the parser, and even though it makes no sense, the parser takes it as a correct
sentence.
 To achieve high precision, multiple sets of grammar rules need to be prepared. It may require
completely different sets of rules for parsing singular and plural variations, passive sentences,
etc., which can lead to the creation of a huge, unmanageable set of rules.

 Top-Down Parser
Here, the parser starts with the S symbol and attempts to rewrite it into a sequence of terminal
symbols that matches the classes of the words in the input sentence until it consists entirely of terminal
symbols.
These are then checked with the input sentence to see if it matched. If not, the process is started over
again with a different set of rules. This is repeated until a specific rule is found which describes the
structure of the sentence.
Merit − It is simple to implement.
Demerits −

 It is inefficient, as the search process has to be repeated if an error occurs.


 It works slowly.
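The top-down strategy above can be sketched as a small recursive Python parser over the rewrite rules and lexicon given earlier. This is a minimal illustration, not a full backtracking parser library; symbols that appear as dictionary keys are non-terminals, and anything else is treated as a terminal word:

```python
# Minimal top-down parser for the rewrite rules above.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["DET", "N"], ["DET", "ADJ", "N"]],
    "VP":  [["V", "NP"]],
    "DET": [["a"], ["the"]],
    "ADJ": [["beautiful"], ["perching"]],
    "N":   [["bird"], ["birds"], ["grain"], ["grains"]],
    "V":   [["peck"], ["pecks"], ["pecking"]],
}

def parse(symbol, words, pos):
    """Try to expand `symbol` starting at words[pos]; yield end positions."""
    for production in GRAMMAR.get(symbol, []):   # non-terminal: try each rule
        yield from match_seq(production, words, pos)
    if symbol not in GRAMMAR:                    # terminal: match the word
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1

def match_seq(seq, words, pos):
    """Match a sequence of symbols, trying alternatives via backtracking."""
    if not seq:
        yield pos
        return
    for mid in parse(seq[0], words, pos):
        yield from match_seq(seq[1:], words, mid)

def accepts(sentence):
    words = sentence.lower().split()
    return any(end == len(words) for end in parse("S", words, 0))

print(accepts("the bird pecks the grains"))  # True
print(accepts("the grains peck the bird"))   # True  (valid syntax, no sense)
print(accepts("pecks bird the the grains"))  # False
```

Note that, exactly as the demerits above describe, the nonsensical "the grains peck the bird" is accepted, because the grammar checks only syntax.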

Expert Systems

Expert systems (ES) are one of the prominent research domains of AI. They were introduced by
researchers in the Computer Science Department at Stanford University.

What are Expert Systems?

Expert systems are computer applications developed to solve complex problems in a particular
domain, at a level of extraordinary human intelligence and expertise.
Characteristics of Expert Systems:

 High performance
 Understandable
 Reliable
 Highly responsive

Capabilities of Expert Systems:

The expert systems are capable of −

 Advising
 Instructing and assisting humans in decision making
 Demonstrating
 Deriving a solution
 Diagnosing
 Explaining
 Interpreting input
 Predicting results
 Justifying the conclusion
 Suggesting alternative options to a problem
They are incapable of −

 Substituting human decision makers


 Possessing human capabilities
 Producing accurate output for inadequate knowledge base
 Refining their own knowledge

Components of Expert Systems:

The components of ES include −

 Knowledge Base
 Inference Engine
 User Interface
Let us see them one by one briefly –
 Knowledge Base

It contains domain-specific and high-quality knowledge.


Knowledge is required to exhibit intelligence. The success of any ES majorly depends upon the
collection of highly accurate and precise knowledge.
What is Knowledge?
Data is a collection of facts. Information is data organized as facts about the task
domain. Data, information, and past experience combined together are termed knowledge.
Components of Knowledge Base:
The knowledge base of an ES is a store of both, factual and heuristic knowledge.
 Factual Knowledge − It is the information widely accepted by the Knowledge Engineers and
scholars in the task domain.
 Heuristic Knowledge − It is about practice, accurate judgement, and one’s ability to evaluate
and guess.
Knowledge representation:
It is the method used to organize and formalize the knowledge in the knowledge base. It is in the form
of IF-THEN-ELSE rules.
Knowledge Acquisition:
The success of any expert system majorly depends on the quality, completeness, and accuracy of the
information stored in the knowledge base.
The knowledge base is formed by readings from various experts, scholars, and the Knowledge
Engineers. The knowledge engineer is a person with the qualities of empathy, quick learning, and
case analyzing skills.
He acquires information from the subject expert by recording, interviewing, and observing him at
work, etc. He then categorizes and organizes the information in a meaningful way, in the form of
IF-THEN-ELSE rules, to be used by the inference engine. The knowledge engineer also monitors
the development of the ES.

 Inference Engine

The use of efficient procedures and rules by the Inference Engine is essential for deducing a correct,
flawless solution.
In case of knowledge-based ES, the Inference Engine acquires and manipulates the knowledge from
the knowledge base to arrive at a particular solution.
In case of rule based ES, it −
 Applies rules repeatedly to the facts, which are obtained from earlier rule application.
 Adds new knowledge into the knowledge base if required.
 Resolves rules conflict when multiple rules are applicable to a particular case.
To recommend a solution, the Inference Engine uses the following strategies −

 Forward Chaining
 Backward Chaining
Forward Chaining
It is a strategy of an expert system to answer the question, “What can happen next?”
Here, the Inference Engine follows the chain of conditions and derivations and finally deduces the
outcome. It considers all the facts and rules and sorts them before arriving at a solution.
This strategy is followed for working on conclusion, result, or effect. For example, prediction of share
market status as an effect of changes in interest rates.
Backward Chaining
With this strategy, an expert system finds out the answer to the question, “Why this happened?”
On the basis of what has already happened, the Inference Engine tries to find out which conditions
could have happened in the past for this result. This strategy is followed for finding out cause or
reason. For example, diagnosis of blood cancer in humans.
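The forward-chaining strategy can be sketched in a few lines of Python. The rules and facts below are illustrative, loosely modeled on the share-market example above; they are not taken from any real expert system:

```python
# Sketch of rule-based forward chaining.  Each rule is a pair:
# (set of condition facts, fact concluded when all conditions hold).
RULES = [
    ({"interest_rates_fall"}, "borrowing_increases"),
    ({"borrowing_increases"}, "company_investment_rises"),
    ({"company_investment_rises"}, "share_prices_rise"),
]

def forward_chain(facts):
    facts = set(facts)
    changed = True
    while changed:                  # keep applying rules until nothing new
        changed = False
        for conditions, conclusion in RULES:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)   # add new knowledge to the fact base
                changed = True
    return facts

print(forward_chain({"interest_rates_fall"}))
# derives borrowing_increases, company_investment_rises, share_prices_rise
```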

 User Interface

The user interface provides interaction between the user of the ES and the ES itself. It generally
uses Natural Language Processing so that it can be used by a user who is well-versed in the task
domain. The user of the ES need not necessarily be an expert in Artificial Intelligence.
It explains how the ES has arrived at a particular recommendation. The explanation may appear in
the following forms −

 Natural language displayed on screen.


 Verbal narrations in natural language.
 Listing of rule numbers displayed on the screen.
The user interface makes it easy to trace the credibility of the deductions.
Requirements of Efficient ES User Interface:
 It should help users to accomplish their goals in shortest possible way.
 It should be designed to work for user’s existing or desired work practices.
 Its technology should be adaptable to user’s requirements; not the other way round.
 It should make efficient use of user input.

Expert Systems Limitations:

No technology can offer an easy and complete solution. Large systems are costly and require
significant development time and computer resources. ESs have their limitations, which include −
 Limitations of the technology
 Difficult knowledge acquisition
 ES are difficult to maintain
 High development costs

Applications of Expert System:

The following table shows where ES can be applied.

Application               Description

Design Domain             Camera lens design, automobile design.

Medical Domain            Diagnosis systems to deduce the cause of disease from observed
                          data; conducting medical operations on humans.

Monitoring Systems        Comparing data continuously with an observed system or with
                          prescribed behavior, such as leakage monitoring in a long
                          petroleum pipeline.

Process Control Systems   Controlling a physical process based on monitoring.

Knowledge Domain          Finding out faults in vehicles, computers.

Finance/Commerce          Detection of possible fraud, suspicious transactions, stock
                          market trading, airline scheduling, cargo scheduling.

Expert System Technology:

There are several levels of ES technologies available. Expert systems technologies include −
 Expert System Development Environment − The ES development environment includes
hardware and tools. They are −
o Workstations, minicomputers, mainframes.
o High level Symbolic Programming Languages such as LISt Programming (LISP)
and PROgrammation en LOGique (PROLOG).
o Large databases.
 Tools − They reduce the effort and cost involved in developing an expert system to large
extent.
o Powerful editors and debugging tools with multi-windows.
o They provide rapid prototyping
o Have Inbuilt definitions of model, knowledge representation, and inference design.
 Shells − A shell is nothing but an expert system without knowledge base. A shell provides the
developers with knowledge acquisition, inference engine, user interface, and explanation
facility. For example, few shells are given below −
o Java Expert System Shell (JESS) that provides fully developed Java API for creating
an expert system.
o Vidwan, a shell developed at the National Centre for Software Technology, Mumbai in
1993. It enables knowledge encoding in the form of IF-THEN rules.

Development of Expert Systems: General Steps:

The process of ES development is iterative. Steps in developing the ES include −


Identify Problem Domain

 The problem must be suitable for an expert system to solve it.


 Find the experts in task domain for the ES project.
 Establish cost-effectiveness of the system.
Design the System
 Identify the ES Technology
 Know and establish the degree of integration with the other systems and databases.
 Realize how the concepts can represent the domain knowledge best.
Develop the Prototype
From Knowledge Base: The knowledge engineer works to −

 Acquire domain knowledge from the expert.


 Represent it in the form of If-THEN-ELSE rules.
Test and Refine the Prototype
 The knowledge engineer uses sample cases to test the prototype for any deficiencies in
performance.
 End users test the prototypes of the ES.
Develop and Complete the ES
 Test and ensure the interaction of the ES with all elements of its environment, including end
users, databases, and other information systems.
 Document the ES project well.
 Train the user to use ES.
Maintain the System
 Keep the knowledge base up-to-date by regular review and update.
 Cater for new interfaces with other information systems, as those systems evolve.

Benefits of Expert Systems:

 Availability − They are easily available due to mass production of software.


 Less Production Cost − Production cost is reasonable. This makes them affordable.
 Speed − They offer great speed. They reduce the amount of work an individual puts in.
 Less Error Rate − Error rate is low as compared to human errors.
 Reducing Risk − They can work in the environment dangerous to humans.
 Steady response − They work steadily without getting emotional, tense, or fatigued.

Robotics

Robotics is a domain in artificial intelligence that deals with the study of creating intelligent and
efficient robots.

What are Robots?

Robots are the artificial agents acting in real world environment.


Objective:
Robots are aimed at manipulating objects by perceiving, picking, moving, or modifying the physical
properties of an object, or destroying it, thereby freeing manpower from repetitive functions, which
robots can perform without getting bored, distracted, or exhausted.

What is Robotics?

Robotics is a branch of AI, which is composed of Electrical Engineering, Mechanical Engineering,


and Computer Science for designing, construction, and application of robots.

Aspects of Robotics:
 The robots have mechanical construction, form, or shape designed to accomplish a particular
task.
 They have electrical components which power and control the machinery.
 They contain some level of computer program that determines what, when and how a robot
does something.

Difference in Robot System and Other AI Program:

Here is the difference between the two −

AI Programs                                    Robots

They usually operate in computer-simulated     They operate in the real physical world.
worlds.

The input to an AI program is symbols and      The input to robots is analog signals, such as
rules.                                         speech waveforms or images.

They need general-purpose computers to         They need special hardware with sensors and
operate on.                                    effectors.
Robot Locomotion:

Locomotion is the mechanism that makes a robot capable of moving in its environment. There are
various types of locomotions −

 Legged
 Wheeled
 Combination of Legged and Wheeled Locomotion
 Tracked slip/skid

 Legged Locomotion
 This type of locomotion consumes more power while demonstrating walking, jumping,
trotting, hopping, climbing up or down, etc.
 It requires a greater number of motors to accomplish a movement. It is suited for rough as well
as smooth terrain, where an irregular or too-smooth surface would make wheeled locomotion
consume more power. It is a little difficult to implement because of stability issues.
 It comes in varieties of one, two, four, and six legs. If a robot has multiple legs, then leg
coordination is necessary for locomotion.
The total number of possible gait events (a gait is a periodic sequence of lift and release events for
each of the legs) that a robot can use depends upon the number of its legs.
If a robot has k legs, then the number of possible events is N = (2k-1)!.
In case of a two-legged robot (k=2), the number of possible events is N = (2k-1)! = (2*2-1)! = 3! = 6.
Hence there are six possible different events −

 Lifting the Left leg


 Releasing the Left leg
 Lifting the Right leg
 Releasing the Right leg
 Lifting both the legs together
 Releasing both the legs together

In the case of k = 6 legs, there are (2*6-1)! = 11! = 39916800 possible events. Hence the complexity
of a robot grows very quickly with the number of legs.
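The gait-event count above can be checked directly with Python's standard library:

```python
from math import factorial

# Number of possible gait events for a robot with k legs: N = (2k - 1)!
def gait_events(k):
    return factorial(2 * k - 1)

print(gait_events(2))  # 6         (two-legged robot)
print(gait_events(6))  # 39916800  (six-legged robot)
```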
 Wheeled Locomotion
It requires fewer motors to accomplish a movement. It is easier to implement, as there are fewer
stability issues with a larger number of wheels. It is power-efficient compared to legged
locomotion.
 Standard wheel − Rotates around the wheel axle and around the contact point.
 Castor wheel − Rotates around the wheel axle and the offset steering joint.
 Swedish 45° and Swedish 90° wheels − Omni-wheels; they rotate around the contact point,
around the wheel axle, and around the rollers.
 Ball or spherical wheel − Omnidirectional wheel, technically difficult to implement.

 Slip/Skid Locomotion
In this type, the vehicles use tracks as in a tank. The robot is steered by moving the tracks with
different speeds in the same or opposite direction. It offers stability because of large contact area of
track and ground.
Components of a Robot:

Robots are constructed with the following −


 Power Supply − The robots are powered by batteries, solar power, hydraulic, or pneumatic
power sources.
 Actuators − They convert energy into movement.
 Electric motors (AC/DC) − They are required for rotational movement.
 Pneumatic Air Muscles − They contract almost 40% when air is sucked in them.
 Muscle Wires − They contract by 5% when electric current is passed through them.
 Piezo Motors and Ultrasonic Motors − Best for industrial robots.
 Sensors − They provide real-time knowledge of the task environment. Robots
are equipped with vision sensors to compute the depth of the environment. A tactile
sensor imitates the mechanical properties of the touch receptors of human fingertips.

Computer Vision:

This is a technology of AI with which the robots can see. The computer vision plays vital role in the
domains of safety, security, health, access, and entertainment.
Computer vision automatically extracts, analyzes, and comprehends useful information from a single
image or an array of images. This process involves development of algorithms to accomplish
automatic visual comprehension.
Hardware of Computer Vision System
This involves −

 Power supply
 Image acquisition device such as camera
 A processor
 A software
 A display device for monitoring the system
 Accessories such as camera stands, cables, and connectors

Tasks of Computer Vision

 OCR − Optical Character Recognition, software that converts scanned documents into
editable text; it typically accompanies a scanner.
 Face Detection − Many state-of-the-art cameras come with this feature, which enables them
to read a face and take the picture at the perfect expression. It is also used to let a user access
software on a correct match.
 Object Recognition − They are installed in supermarkets, cameras, high-end cars such as
BMW, GM, and Volvo.
 Estimating Position − It is estimating position of an object with respect to camera as in
position of tumor in human’s body.

Application Domains of Computer Vision

 Agriculture
 Autonomous vehicles
 Biometrics
 Character recognition
 Forensics, security, and surveillance
 Industrial quality inspection
 Face recognition
 Gesture analysis
 Geoscience
 Medical imagery
 Pollution monitoring
 Process control
 Remote sensing
 Robotics
 Transport
Applications of Robotics:

The robotics has been instrumental in the various domains such as −


 Industries − Robots are used for handling material, cutting, welding, color coating, drilling,
polishing, etc.
 Military − Autonomous robots can reach inaccessible and hazardous zones during war. A
robot named Daksh, developed by Defense Research and Development Organization
(DRDO), is in function to destroy life-threatening objects safely.
 Medicine − The robots are capable of carrying out hundreds of clinical tests simultaneously,
rehabilitating permanently disabled people, and performing complex surgeries such as brain
tumors.
 Exploration − The robot rock climbers used for space exploration, underwater drones used
for ocean exploration are to name a few.
 Entertainment − Disney’s engineers have created hundreds of robots for movie making.
UNIT-2 Game Playing in Artificial Intelligence
Game Playing is an important domain of artificial intelligence. Games don’t require much
knowledge; the only knowledge we need to provide is the rules, legal moves and the conditions
of winning or losing the game.

Both players try to win the game, so both try to make the best move possible at each turn.
Searching techniques like BFS (Breadth-First Search) are not practical here, as the branching
factor is very high and searching would take a lot of time. So, we need other search
procedures that improve −

• Generate procedure so that only good moves are generated.


• Test procedure so that the best move can be explored first.

The most common search technique in game playing is the minimax search procedure. It is a
depth-first, depth-limited search procedure, used for games like chess and tic-tac-toe.

2.1 MIN-MAX Search

Games have always been an important application area for heuristic algorithms. In playing
games whose state space may be exhaustively delineated, the primary difficulty is in accounting
for the actions of the opponent. This can be handled easily by assuming that the opponent uses
the same knowledge of the state space as we do and applies that knowledge in a consistent effort to
win the game. Minimax implements game search between two adversarial players, referred to as
MIN and MAX.

The minimax search procedure is a depth-first, depth-limited search procedure. The idea is to
start at the current position and use the plausible-move generator to generate the set of possible
successor positions. To decide on one move, it explores the possibilities of winning by looking
ahead more than one step. Each level of look-ahead is called a ply. Thus, in a two-ply search, to
decide the current move, the game tree is explored two levels farther.

Consider the below example

Figure Tree showing two ply search


In this tree, node A represents the current state of the game, and nodes B, C, and D represent three
possible valid moves from state A. Similarly, E, F, G represent possible moves from B; H, I
from C; and J, K, L from D. To decide which move to take from A, the different possibilities
are explored two steps ahead. The values 0, -3, 3, 4, 5, 6, -5, 0 are the utility values of the respective
moves; they indicate the goodness of a move. Each utility value is backed up to its ancestor node,
depending on whether it is a max ply or a min ply. As it is a two-player game, the utility
value is alternately maximized and minimized. Here, as

the second player's ply is maximizing, the maximum value of all children of a node is backed up
to that node. Thus, the nodes B, C, D get the values 4, 5, 6 respectively. Again, as ply 1 is
minimizing, the minimum of these values, i.e. 4, is propagated to A. From A, the move will then
be taken to B.

MINMAX is a straightforward recursive procedure that relies on two auxiliary
procedures that are specific to the game being played.

1. MOVEGEN (position, player): the move generator, which returns a list of nodes representing
the moves that can be made by player in position. In a chess problem, the two players may be named PLAYER-ONE and PLAYER-TWO.

2. STATIC (position, player): the static evaluation function, which returns a number representing
the goodness of position from the standpoint of player.

We assume that MINMAX returns a structure containing both results and that we have two
functions, VALUE and PATH, that extract the separate components. A function LAST-PLY is
assumed to evaluate all of the relevant factors and to return TRUE if the search should be
stopped at the current level and FALSE otherwise.

MINMAX takes three parameters: a board position, the current depth of the search,
and the player to move. So the initial call to compute the best move from the position
CURRENT should be

MINMAX (CURRENT, 0, PLAYER-ONE)

(if player one is to move)

or

MINMAX (CURRENT, 0, PLAYER-TWO)

(if player two is to move)
Let us follow the algorithm of MINMAX.

Algorithm: MINMAX (position, depth, player)

1. If LAST-PLY (position, depth),
   THEN RETURN VALUE = STATIC (position, player), PATH = nil.
2. Else, generate one more ply of the tree by calling the function MOVEGEN (position, player) and set SUCCESSORS to the list it returns.
3. If SUCCESSORS is empty,
   THEN there are no moves to be made; RETURN the same structure that would have been returned if LAST-PLY had returned TRUE.
4. If SUCCESSORS is not empty,
   THEN examine each element in turn and keep track of the best one.
5. After examining all the nodes,
   RETURN VALUE = BEST-SCORE, PATH = BEST-PATH.

When the initial call to MIN MAX returns, the best move from CURRENT is the first
element in the PATH.
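The procedure above can be sketched as a short Python function. This is a negamax-style formulation of the same idea: MOVEGEN, STATIC, and LAST-PLY from the text are passed in as plain functions, and the toy game tree used in the demonstration is purely illustrative, not the figure's tree.

```python
# A minimal sketch of the MINMAX procedure, written as a negamax-style
# recursion.  movegen, static and last_ply are the game-specific helpers
# described in the text; player is +1 or -1 so that -player flips sides.

def minimax(position, depth, player, movegen, static, last_ply):
    """Return (value, path): the backed-up score and best line from position."""
    if last_ply(position, depth):
        return static(position, player), []
    successors = movegen(position, player)
    if not successors:                     # no legal moves: treat as terminal
        return static(position, player), []
    best_score, best_path = None, None
    for succ in successors:
        # Score the successor from the opponent's standpoint, then negate it
        # back to the current player's standpoint (MAX and MIN alternate).
        value, path = minimax(succ, depth + 1, -player, movegen, static, last_ply)
        score = -value
        if best_score is None or score > best_score:
            best_score, best_path = score, [succ] + path
    return best_score, best_path

# Illustrative two-ply game: it is MAX's turn at A; leaves hold static values.
TREE = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F', 'G']}
LEAVES = {'D': 3, 'E': 5, 'F': 2, 'G': 9}

value, path = minimax(
    'A', 0, 1,
    movegen=lambda pos, player: TREE.get(pos, []),
    static=lambda pos, player: LEAVES[pos] * player,
    last_ply=lambda pos, depth: pos in LEAVES,
)
# MIN limits B to 3 and C to 2, so MAX moves to B: value 3, path ['B', 'D'].
```

As in the text, the initial call fixes the root position, depth 0, and the player to move; VALUE and PATH here are simply the two components of the returned tuple.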
2.2 Alpha-Beta (α-β) Pruning
When the number of states of a game increases and cannot all be examined, we
can use pruning. Pruning is a method used to reduce the number of states explored in a
game. Alpha-beta is one such pruning technique. The problem with minimax search is that the
number of game states it has to examine is exponential in the number of moves. Unfortunately
we cannot eliminate the exponent, but we can effectively cut it in half. Alpha-beta pruning is
one solution to this problem of the minimax search tree. When α-β pruning is applied to a
standard minimax tree, it returns the same move as minimax would, but prunes away branches that
cannot possibly influence the final decision.

The idea of alpha-beta pruning is very simple. Alpha-beta search proceeds in a depth-first fashion
rather than searching the entire space. Two values, called alpha and beta, are maintained
during the search. The alpha value is associated with MAX nodes and the beta value with MIN
nodes. The value of alpha can never decrease; on the other hand, the value of beta never
increases. Suppose the alpha value of a MAX node is 5. The MAX node then need not
consider any transmitted value less than or equal to 5 that is associated with any MIN node
below it. Alpha is the worst that MAX can score given that MIN will also do its best. Similarly,
if a MIN node has a beta value of 5, it need not further consider any MAX node below it that has a
value of 5 or more.

The general principle is this: consider a node n somewhere in the search tree, such that a player
has a choice of moving to that node. If the player has a better choice m, either at the parent node
of n or at any choice point further up, then n will never be reached in actual play. So once we
have found out enough about n (by examining some of its descendants) to reach this conclusion,
we can prune it.

We can also say that “α” is the value of the best choice we have found so far at any choice point
along the path for MAX. Similarly “β” is the value of the best choice we have found so far at
any choice point along the path for MIN. Consider the following example

Figure
Here at the MIN ply, the best values of the three nodes are -4, 5 and 0. These are backed up
towards the root and the maximizing move 5 is taken. Now node E has the value 8, which is more
than can be accepted at a minimizing ply, so node E is not explored further. When
more plies are considered, the whole subtree below E is pruned. Similarly, if
α = 0 and β = 7, all nodes and their subtrees having a value less than 0 at a maximizing ply, or
more than 7 at a minimizing ply, will be pruned.
Alpha-beta search updates the values of α and β as it goes along, and prunes the remaining
branches at a node as soon as the value of the current node is known to be worse than the current
α or β value for MAX or MIN respectively. The effectiveness of alpha-beta pruning is highly
dependent on the order in which the successors are examined. Suppose a search tree has
branching factor x and depth d; with good move ordering, α-β search needs to examine only about x^(d/2) nodes to pick the
best move, instead of the x^d examined by minimax.
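The pruning rule can be sketched in Python. The game tree, its leaf values, and the `visited` list below are illustrative additions (not the text's figure); `visited` records which leaves are actually evaluated, to make the cut-off visible.

```python
# A sketch of alpha-beta search over an explicit game tree.  alpha tracks
# the best score MAX is assured of so far, beta the best MIN can hold the
# game to; a branch is cut as soon as alpha >= beta.

def alphabeta(node, alpha, beta, maximizing, children, value_of):
    kids = children(node)
    if not kids:                      # leaf: return its static value
        return value_of(node)
    if maximizing:
        best = float('-inf')
        for k in kids:
            best = max(best, alphabeta(k, alpha, beta, False, children, value_of))
            alpha = max(alpha, best)
            if alpha >= beta:         # beta cut-off: MIN will avoid this line
                break
        return best
    best = float('inf')
    for k in kids:
        best = min(best, alphabeta(k, alpha, beta, True, children, value_of))
        beta = min(beta, best)
        if beta <= alpha:             # alpha cut-off: MAX has a better choice
            break
    return best

# Illustrative tree: MAX to move at A, MIN at B and C, leaves D..G.
TREE = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F', 'G']}
LEAVES = {'D': 3, 'E': 5, 'F': 2, 'G': 9}
visited = []

def value_of(node):
    visited.append(node)              # record each leaf actually evaluated
    return LEAVES[node]

best = alphabeta('A', float('-inf'), float('inf'), True,
                 lambda n: TREE.get(n, []), value_of)
# best == 3; leaf G is never evaluated: once F scores 2, MIN node C can do
# no better than 2, which MAX already beats with B's backed-up value 3.
```

The same call with pruning disabled would visit all four leaves, which is exactly the exponential cost the text describes.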

2.3 The Water Jug Problem


There are two jugs, called four and three; four holds a maximum of four
gallons and three a maximum of three gallons. How can we get 2 gallons into jug four? The
state space is the set of ordered pairs giving the number of gallons in the pair of jugs at any time, i.e.
(four, three) where four = 0, 1, 2, 3, 4 and three = 0, 1, 2, 3. The start state is (0, 0) and the goal
state is (2, n), where n is a don't-care value, limited to the range 0 to 3. The major
production rules for solving this problem are shown below:

Rule  Initial condition                 Goal             Comment

1     (four, three) if four < 4         (4, three)       fill four from tap
2     (four, three) if three < 3        (four, 3)        fill three from tap
3     (four, three) if four > 0         (0, three)       empty four into drain
4     (four, three) if three > 0        (four, 0)        empty three into drain
5     (four, three) if four+three < 4   (four+three, 0)  empty three into four
6     (four, three) if four+three < 3   (0, four+three)  empty four into three
7     (0, three) if three > 0           (three, 0)       empty three into four
8     (four, 0) if four > 0             (0, four)        empty four into three
9     (0, 2)                            (2, 0)           empty three into four
10    (2, 0)                            (0, 2)           empty four into three
11    (four, three) if four < 4         (4, three-diff)  pour diff, 4-four, into four from three
12    (three, four) if three < 3        (four-diff, 3)   pour diff, 3-three, into three from four

A solution is given below:
Jug four   Jug three   Rule applied
0          0
0          3           2
3          0           7
3          3           2
4          2           11
0          2           3
2          0           10
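A solution like the one above can also be found mechanically. The following is a small breadth-first sketch of the search, not the production-rule notation itself: the successor function merges the fill, empty, and pour rules into one move generator, and the rule numbers are not tracked.

```python
# Breadth-first search for the (4, 3) water jug problem: reach any state
# with 2 gallons in the four-gallon jug, starting from (0, 0).

from collections import deque

def successors(four, three):
    """All states reachable in one move: fill, empty, or pour between jugs."""
    states = {
        (4, three), (four, 3),          # fill a jug from the tap
        (0, three), (four, 0),          # empty a jug into the drain
    }
    pour = min(three, 4 - four)         # pour three into four
    states.add((four + pour, three - pour))
    pour = min(four, 3 - three)         # pour four into three
    states.add((four - pour, three + pour))
    states.discard((four, three))       # drop no-op moves
    return states

def solve(start=(0, 0), goal_four=2):
    """Return a shortest list of states from start to a goal state."""
    frontier, seen = deque([[start]]), {start}
    while frontier:
        path = frontier.popleft()
        four, three = path[-1]
        if four == goal_four:
            return path
        for nxt in successors(four, three):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None
```

Because the search is breadth-first, the returned path uses the minimum number of moves; the particular shortest solution found may differ from the trace above, since several six-move solutions exist.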
2.4 Chess Problem
Definition:
It is a normal chess game. In a chess problem, the start state is the initial configuration of the chessboard.
The final state is any board configuration that is a winning position for either player. There
may be multiple final positions, and each board configuration can be thought of as representing a
state of the game. Whenever any player moves any piece, it leads to a different state of the game.
Procedure:

Figure

The above figure shows a 3x3 chessboard with each square labeled with an integer from 1 to 9. Because
of the reduced size of the problem, we simply enumerate the alternative moves rather than developing a general move operator.
Using a predicate called move in predicate calculus, whose
parameters are the starting and ending squares, we describe the legal moves on the board.
For example, move (1, 8) takes the knight from the upper left-hand corner to the middle of the
bottom row. In chess, a knight moves two squares either horizontally or
vertically, followed by one square in an orthogonal direction, as long as it does not move off the
board.

All possible moves for this board are as follows.


move (1, 8)    move (6, 1)
move (1, 6)    move (6, 7)
move (2, 9)    move (7, 2)
move (2, 7)    move (7, 6)
move (3, 4)    move (8, 3)
move (3, 8)    move (8, 1)
move (4, 1)    move (9, 2)
move (4, 3)    move (9, 4)

The above predicates of the chess problem form the knowledge base for this problem. A
unification algorithm is used to access the knowledge base. Suppose we need to find the
positions to which the knight can move from a particular location, square 2. The goal move (2, X)
unifies with two different predicates in the knowledge base, with the substitutions {7/X} and
{9/X}. Given the goal move (2, 3), the response is failure, because no fact move (2, 3) exists in
the knowledge base.
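This kind of fact base and query can be sketched in Python. The sketch below is not a unification algorithm, only a lookup over the same sixteen move facts; the function name `move` mirrors the predicate in the text.

```python
# The 3x3 knight-move facts from the text, stored as a set of (from, to)
# pairs, with a query function that mimics the two goals discussed above.

MOVES = {
    (1, 8), (1, 6), (2, 9), (2, 7), (3, 4), (3, 8), (4, 1), (4, 3),
    (6, 1), (6, 7), (7, 2), (7, 6), (8, 3), (8, 1), (9, 2), (9, 4),
}

def move(src, dst=None):
    """move(2) lists all squares reachable from 2; move(2, 3) tests a fact."""
    if dst is None:
        # Unbound second argument: return every matching destination,
        # like the substitutions {7/X} and {9/X} in the text.
        return sorted(d for s, d in MOVES if s == src)
    # Both arguments bound: the goal succeeds only if the fact exists.
    return (src, dst) in MOVES
```

With these facts, `move(2)` yields the destinations 7 and 9, and `move(2, 3)` fails, matching the behavior described for the knowledge base.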

Comments:
- In this game, many production rules are applied for each move of a piece on the chessboard.
- A lot of searching is required in this game.
- Implementation of the algorithm over the knowledge base is very important.
2.5 Puzzle (Tiles) Problem

Definition:
“The puzzle consists of a 3x3 board having 9 spaces, of which 8 hold tiles bearing the
numbers 1 to 8. One space is left blank. A tile adjacent to the blank space can move into it.
We have to arrange the tiles in sequence to reach the goal state.”

Procedure:
The 8-puzzle problem belongs to the category of “sliding block puzzle” problems. The 8-
puzzle is a square tray in which eight square tiles are placed; the remaining ninth square is
uncovered. Each tile in the tray has a number on it. A tile that is adjacent to the blank space can be
slid into that space. The game consists of a starting position and a specified goal position. The
goal is to transform the starting position into the goal position by sliding the tiles around. The
control mechanism for an 8-puzzle solver must keep track of the order in which operations are
performed, so that the operations can be undone one at a time if necessary. The objective of the
puzzle is to find a sequence of tile movements that leads from a starting configuration to a goal
configuration, such as the two situations given below.

Figure (Starting State) (Goal State)

A state of the 8-puzzle is a permutation of the tiles within the frame. The operations are the
permissible moves: up, down, left, right. At each step of the problem a function f(x) is
defined as the combination of g(x) and h(x), i.e.

f(x) = g(x) + h(x)

where g(x) is the number of steps already taken, i.e. the cost from the initial state to the current
state, and h(x) is the heuristic estimator, which compares the current state with the goal state and
counts how many tiles are displaced. After calculating the f(x) value of each candidate at a step,
take the smallest f(x) value and choose that state as the next current
state on the way to the goal.
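The evaluation f(x) = g(x) + h(x) can be sketched directly. The goal layout below is one common 8-puzzle goal and is only illustrative, since the notes' figures are not reproduced; boards are flat 9-tuples read row by row, with 0 marking the blank.

```python
# f(x) = g(x) + h(x) for the 8-puzzle, with h(x) the misplaced-tile count
# described in the text.  Boards are 9-tuples in row-major order; 0 = blank.

GOAL = (1, 2, 3,
        8, 0, 4,
        7, 6, 5)   # an illustrative goal layout

def h(board, goal=GOAL):
    """Heuristic: number of tiles (not counting the blank) out of position."""
    return sum(1 for tile, target in zip(board, goal)
               if tile != 0 and tile != target)

def g_plus_h(board, steps_so_far, goal=GOAL):
    """f(x) = g(x) + h(x): steps taken so far plus misplaced tiles."""
    return steps_so_far + h(board, goal)

# Example: this start state has four tiles out of place, so f = 0 + 4 = 4.
START = (2, 8, 3,
         1, 6, 4,
         7, 0, 5)
```

At each step the solver would compute f for every board reachable by one slide and keep the smallest, exactly the selection rule used in the worked example that follows.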

Let us take an example.

Figure (Initial State) (Goal State)


Step 1:
h(x) estimates the steps required to reach the goal state from the initial state. In the tray, either 6 or 8
can change position to fill the empty square, so there are two possible successor states,
namely B and C. The f(x) value of B is 6 and that of C is 4. As 4 is the minimum, take C as
the next current state.

Step 2:
In this step, three states can be drawn from tray C: the empty position can be filled by either 5,
3 or 6, so three different states can be obtained. Calculate the f(x) of each and
take the minimum.

Here the state F has the minimum value, i.e. 4, and hence is taken as the next current state.

Step 3:
Tray F can lead to 4 different states, as the empty position can be filled by 4 values, i.e. 2, 4,
5, 8.

Step 4:
In step 3, tray I has the smallest f(x) value. Tray I can lead to 3 different
states, because the empty position can be filled by the tiles 7, 8 or 6.

Hence, we reach the goal state after a few changes of tiles in the different trays.
Comments:
- This problem requires a lot of space for saving the different trays.
- The time complexity is higher than that of many other problems.
- The user has to be very careful about the shifting of tiles in the trays.
- Very complex puzzle games can be solved by this technique.
