AI Notes
Theory
Teaching-Learning-Evaluation Scheme
Semester –VII (Final Year) Proposed Scheme w.e.f. July – 2023
Knowledge representation issues, Representation & mapping, Approaches to knowledge representation, Issues in knowledge
representation. Using predicate logic: Representing simple facts in logic, Representing instance & ISA relationships, Computable
functions & predicates, Resolution, Natural deduction. Representing knowledge using rules: Procedural versus declarative
knowledge, Logic programming, Forward versus backward reasoning, Matching, Control knowledge.
Reference Books:
1. Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, Third Edition.
2. Ivan Bratko, Prolog Programming for Artificial Intelligence, Addison-Wesley.
COURSE CURRICULUM MAPPING WITH MOOC PLATFORM NPTEL
(Table columns: Sr. No. | Name of Subject as per Curriculum | Course Code | Semester | Name of SWAYAM/NPTEL Course and Web Link | Institute offering course | Duration of Course | Relevance %)
SWAYAM/NPTEL Course Web Link: https://nptel.ac.in/courses/106/106/106106126/
Institute offering course: IIT, Madras
Duration of Course: 48 Hrs
It is a branch of Computer Science that pursues creating the computers or machines as intelligent
as human beings. AI is the study of how to make computers do things which at the moment people do
better.
It is the science and engineering of making intelligent machines, especially intelligent computer
programs. Artificial Intelligence (AI) is a branch of science which deals with helping machines find
solutions to complex problems in a more human-like fashion.
It is related to the similar task of using computers to understand human intelligence, but AI does
not have to confine itself to methods that are biologically observable.
According to the father of Artificial Intelligence, John McCarthy, it is “The science and
engineering of making intelligent machines, especially intelligent computer programs”.
AI is the study of the mental faculties through the use of computational models.
AI program will demonstrate a high level of intelligence to a degree that equals or exceeds the intelligence
required of a human in performing some task.
AI is unique, sharing borders with Mathematics, Computer Science, Philosophy, Psychology, Biology,
Cognitive Science and many others.
Although there is no clear definition of AI or even Intelligence, it can be described as an attempt to build
machines that like humans can think and act, able to learn and use knowledge to solve problems on their own.
The Foundations of Artificial Intelligence
To create AI, we should first know how intelligence is composed. Intelligence is an intangible
property of our brain, a combination of reasoning, learning, problem solving, perception,
language understanding, etc.
To achieve the above capabilities in a machine or software, Artificial Intelligence requires the
following disciplines.
A brief history of each discipline that contributes ideas, viewpoints, and techniques to AI is
provided here. Like any history, this one is forced to concentrate on a small number of people,
events, and ideas and to ignore others that were also important.
1.2.1 Philosophy
1.2.2 Mathematics
1.2.3 Economics
1.2.4 Neuroscience
1.2.5 Psychology
1.2.6 Computer Engineering
1.2.7 Control Theory and cybernetics
1.2.8 Linguistics
1.2.1: Philosophy:
Can formal rules be used to draw valid conclusions?
How does the mind arise from a physical brain?
Where does knowledge come from?
How does knowledge lead to action?
Aristotle (384-322 B.C.) was the first to formulate a precise set of laws governing the rational
part of the mind.
Thomas Hobbes (1588-1679) proposed that reasoning was like numerical computation, that "we
add and subtract in our silent thoughts".
René Descartes (1596-1650) developed the dualistic theory of mind and matter. Descartes
attempted to demonstrate the existence of God and the distinction between the human soul and body.
The empiricism movement started with Francis Bacon (1561-1626).
The confirmation theory of Rudolf Carnap (1891-1970) and Carl Hempel (1905-1997) attempted to analyze the
acquisition of knowledge from experience.
Carnap's book The Logical Structure of the World (1928) defined an explicit computational
procedure for extracting knowledge from elementary experiences. It was probably the first
theory of mind as a computational process.
1.2.2 Mathematics :
What are the formal rules to draw valid conclusions?
What can be computed?
How do we reason with uncertain information?
George Boole (1815-64) worked out the details of propositional, or Boolean, logic.
In 1879, Gottlob Frege (1848-1925) extended Boole's logic to include objects and relations, creating
the first-order logic that is used today.
The first nontrivial algorithm is thought to be Euclid's algorithm for computing greatest common divisors.
Besides logic and computation, the third great contribution of mathematics to AI is
probability theory. The Italian Gerolamo Cardano (1501-76) first framed the idea of
probability, describing it in terms of the possible outcomes of gaming events.
Thomas Bayes (1702-61) proposed a rule for updating probabilities in the light of new
evidence. Bayes' rule underlies most modern approaches to uncertain reasoning in AI systems.
1.2.3 Economics :
How should we make decisions so as to maximize payoff?
How should we do this when others may not go along?
How should we do this when the payoff may be far in the future?
The science of economics got its start in 1776, when the Scottish philosopher Adam Smith (1723-90)
published An Inquiry into the Nature and Causes of the Wealth of Nations.
Decision theory, which combines probability theory with utility theory, provides a formal and complete
framework for decisions made under uncertainty.
Von Neumann and Morgenstern's development of game theory includes the surprising result
that, for some games, a rational agent should adopt policies that are randomized. Unlike
decision theory, game theory does not offer an unambiguous prescription for selecting
actions.
1.2.4 Neuroscience :
How do brains process Information?
1.2.6 Computer Engineering:
How can we build an efficient computer?
For AI to succeed, two things are needed: intelligence and an artifact. The computer has been
the artifact of choice.
The first operational computer was the electromechanical Heath Robinson, built in 1940 by
Alan Turing and his team.
The first operational programmable computer was the Z-3, built by Konrad Zuse in Germany in 1941.
The first electronic computer, the ABC, was assembled by John Atanasoff and his student Clifford
Berry between 1940 and 1942.
The first programmable machine was a loom, devised in 1805 by Joseph Marie Jacquard (1752-1834),
that used punched cards to store instructions for the pattern to be woven.
1.2.7 Control Theory and cybernetics :
How can artifacts operate under their own control?
Ktesibios of Alexandria built the first self-controlling machine: a water clock with a regulator
that maintained a constant flow rate. This invention changed the definition of what an
artifact could do.
Modern control theory, especially the branch known as stochastic optimal control, has as its
goal the design of systems that maximize an objective function over time. This roughly
matches our view of AI: designing systems that behave optimally.
1.2.8 Linguistics:
How does language relate to thoughts?
In 1957, B. F. Skinner published Verbal Behavior, a comprehensive, detailed account of
the behaviorist approach to language learning, written by the foremost expert in the field.
Noam Chomsky, who had just published a book on his own theory, Syntactic Structures, pointed
out that the behaviorist theory did not address the notion of creativity in language.
Modern linguistics and AI were born at about the same time and grew up together, intersecting
in a hybrid field called computational linguistics or natural language processing (NLP).
History of Artificial Intelligence
Artificial Intelligence is not a new word and not a new technology for researchers. This
technology is much older than you would imagine. Even there are the myths of Mechanical men
in Ancient Greek and Egyptian Myths. Following are some milestones in the history of AI which
defines the journey from the AI generation to till date development.
Maturation of Artificial Intelligence (1943-1952)
o Year 1943: The first work which is now recognized as AI was done by Warren McCulloch
and Walter Pitts in 1943. They proposed a model of artificial neurons.
o Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection
strength between neurons. His rule is now called Hebbian learning.
o Year 1950: Alan Turing, an English mathematician, pioneered machine learning in 1950.
Turing published "Computing Machinery and Intelligence", in which he proposed a test
that can check a machine's ability to exhibit intelligent behavior equivalent to human
intelligence, called the Turing test.
A boom of AI (1980-1987)
o Year 1980: After the AI winter, AI came back with "Expert Systems". Expert systems
were programs that emulate the decision-making ability of a human expert.
In the year 1980, the first national conference of the American Association for Artificial
Intelligence (AAAI) was held at Stanford University.
I. Robotic vehicles:
A driverless robotic car named STANLEY sped through the rough terrain of the Mojave
desert at 22 mph, finishing the 132-mile course first to win the 2005 DARPA Grand
Challenge.
STANLEY is a Volkswagen Touareg outfitted with cameras, radar, and laser rangefinders to
sense the environment, and onboard software to command the steering, braking, and
acceleration (Thrun, 2006).
The following year CMU’s BOSS won the Urban Challenge, safely driving in traffic through
the streets of a closed Air Force base, obeying traffic rules and avoiding pedestrians and
other vehicles.
V. Spam fighting:
Each day, learning algorithms classify over a billion messages as spam, saving the recipient
from having to waste time deleting what, for many users, could comprise 80% or 90% of all
messages, if not classified away by algorithms. Because the spammers are continually updating
their tactics, it is difficult for a static programmed approach to keep up, and learning
algorithms work best (Sahami et al., 1998; Goodman and Heckerman, 2004).
VII. Robotics:
The iRobot Corporation has sold over two million Roomba robotic vacuum cleaners for home
use. The company also deploys the more rugged PackBot to Iraq and Afghanistan, where it is
used to handle hazardous materials, clear explosives, and identify the location of snipers.
An AI system can be defined as the study of the rational agent and its environment. The agents
sense the environment through sensors and act on their environment through actuators. An AI
agent can have mental properties such as knowledge, belief, intention, etc.
What is an Agent?
An agent can be anything that perceives its environment through sensors and acts upon that
environment through actuators. An agent runs in a cycle of perceiving, thinking, and acting.
An agent can be:
o Human-Agent: A human agent has eyes, ears, and other organs which work for sensors and
hand, legs, vocal tract work for actuators.
o Robotic Agent: A robotic agent can have cameras, infrared range finder, NLP for sensors
and various motors for actuators.
o Software Agent: A software agent can have keystrokes and file contents as sensory input,
act on those inputs, and display output on the screen.
Hence the world around us is full of agents such as thermostat, cellphone, camera, and
even we are also agents.
Before moving forward, we should first know about sensors, effectors, and actuators.
Sensor: Sensor is a device which detects the change in the environment and sends the
information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the component of machines that converts energy into motion. The
actuators are only responsible for moving and controlling a system. An actuator can be an
electric motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs,
wheels, arms, fingers, wings, fins, and display screen.
Agent Environment in AI
An environment is everything in the world which surrounds the agent, but it is not a part of an
agent itself. An environment can be described as a situation in which an agent is present.
The environment is where an agent lives and operates, and it provides the agent with something
to sense and act upon. An environment is mostly said to be non-deterministic.
Intelligent Agents:
An intelligent agent is an autonomous entity which acts upon an environment using sensors and
actuators to achieve goals. An intelligent agent may learn from the environment to achieve
its goals. A thermostat is an example of an intelligent agent.
A rational agent is an agent which has clear preference, models uncertainty, and acts in a way
to maximize its performance measure with all possible actions.
A rational agent is said to perform the right things. AI is about creating rational agents to use
for game theory and decision theory for various real-world scenarios.
For an AI agent, the rational action is most important because in AI reinforcement learning
algorithm, for each best possible action, agent gets the positive reward and for each wrong
action, an agent gets a negative reward.
Rationality:
The rationality of an agent is measured by its performance measure. Rationality can be judged
on the basis of following points:
The task of AI is to design an agent program which implements the agent function. The
structure of an intelligent agent is a combination of architecture and agent program. It can be
viewed as:
Agent = Architecture + Agent Program
Following are the three main terms involved in the structure of an AI agent:
PEAS is a type of model on which an AI agent works upon. When we define an AI agent or
rational agent, then we can group its properties under PEAS representation model. It is made
up of four words:
o P: Performance measure
o E: Environment
o A: Actuators
o S: Sensors
Here performance measure is the objective for the success of an agent's behavior.
Agents can be grouped into five classes based on their degree of perceived intelligence and
capability. All these agents can improve their performance and generate better actions over
time. These are given below:
Hence, learning agents are able to learn, analyze performance, and look for new ways to improve the
performance.
UNIT 2 - Problem Solving
❖ Solving Problems by Searching :
➢ Search algorithms are one of the most important areas of Artificial Intelligence.
➢ In Artificial Intelligence, Search techniques are universal problem-solving
methods.
➢ Rational agents or Problem-solving agents in AI mostly used these search
strategies or algorithms to solve a specific problem and provide the best result.
➢ Problem-solving agents are the goal-based agents and use atomic representation.
Completeness: A search algorithm is said to be complete if it is guaranteed to return a solution whenever at least one
solution exists for any input.
Optimality: If the solution found by an algorithm is guaranteed to be the best solution (lowest path cost) among all
solutions, then it is said to be an optimal solution.
Time Complexity: Time complexity is a measure of the time an algorithm takes to complete its task.
Space Complexity: The maximum storage space required at any point during the search, expressed in terms of the complexity of the problem.
➢ The uninformed search does not contain any domain knowledge such as closeness, the
location of the goal.
➢ It operates in a brute-force way as it only includes information about how to traverse the
tree and how to identify leaf and goal nodes.
➢ Uninformed search applies a strategy in which the search tree is searched without any information
about the search space beyond the initial state, the operators, and a test for the goal, so it is
also called blind search.
➢ It examines each node of the tree until it achieves the goal node.
o Breadth-first search
o Uniform cost search
o Depth-first search
o Iterative deepening depth-first search
o Bidirectional Search
Breadth-first Search:
➢ Breadth-first search is the most common search strategy for traversing a tree or
graph.
➢ This algorithm searches breadthwise in a tree or graph, so it is called breadth-
first search.
➢ The BFS algorithm starts searching from the root node of the tree and expands all
successor nodes at the current level before moving to the nodes of the next level.
➢ The breadth-first search algorithm is an example of a general-graph search
algorithm.
➢ Breadth-first search is implemented using a FIFO queue data structure.
Advantages:
➢ BFS will provide a solution if any solution exists.
➢ If there is more than one solution for a given problem, then BFS will provide the
minimal solution, i.e., the one that requires the least number of steps.
➢ Completeness: BFS is complete, which means if the shallowest goal node is at
some finite depth, then BFS will find a solution.
➢ Optimality: BFS is optimal if path cost is a non-decreasing function of the depth
of the node.
Disadvantages:
➢ It requires lots of memory since each level of the tree must be saved into
memory to expand the next level.
➢ BFS needs lots of time if the solution is far away from the root node.
Breadth first search algorithm:
o Step 1: Place the starting node into the Queue.
o Step 2: If the Queue is empty, return failure and stop.
o Step 3: If the first node in the Queue is the goal state, return success and stop the search.
o Step 4: Otherwise, remove the first node from the Queue, expand it, and add its
children to the Queue.
o Step 5: Return to Step 2.
Example:
➢ In the below tree structure, we have shown the traversing of the tree using BFS
algorithm from the root node S to goal node K.
➢ BFS search algorithm traverse in layers, so it will follow the path which is shown
by the dotted arrow, and the traversed path will be:
➢ S---> A--->B---->C--->D---->G--->H--->E---->F---->I---->K
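The steps above can be sketched in Python. Since the figure is not reproduced here, the tree below is an assumed adjacency structure chosen to match the S-to-K traversal order quoted above.

```python
from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: expand nodes level by level using a FIFO queue.
    Returns the nodes in the order they were visited, or None on failure."""
    queue = deque([start])
    visited = [start]
    while queue:                            # Step 2: empty queue -> failure
        node = queue.popleft()              # Step 4: remove the first node
        if node == goal:                    # Step 3: goal test
            return visited
        for child in graph.get(node, []):   # expand successors
            if child not in visited:        # avoid re-adding nodes
                visited.append(child)
                queue.append(child)
    return None

# Hypothetical tree matching the example's traversal order.
tree = {
    'S': ['A', 'B'], 'A': ['C', 'D'], 'B': ['G', 'H'],
    'C': ['E', 'F'], 'D': [], 'G': ['I'], 'H': [],
    'E': [], 'F': [], 'I': ['K'], 'K': [],
}
print(bfs(tree, 'S', 'K'))
# ['S', 'A', 'B', 'C', 'D', 'G', 'H', 'E', 'F', 'I', 'K']
```

The FIFO queue is what forces the layer-by-layer order: every node at depth d is dequeued before any node at depth d+1.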
Depth-first Search
➢ Depth-first search is a recursive algorithm for traversing a tree or graph data
structure.
➢ It is called the depth-first search because it starts from the root node and follows
each path to its greatest depth node before moving to the next path.
➢ DFS uses a stack data structure for its implementation.
➢ The process of the DFS algorithm is similar to the BFS algorithm.
➢ Backtracking is an algorithm technique for finding all possible solutions using
recursion.
Advantage:
➢ DFS requires very little memory, as it only needs to store the stack of nodes on the
path from the root node to the current node.
➢ It takes less time to reach the goal node than the BFS algorithm (if it traverses the
right path).
Disadvantage:
➢ There is the possibility that many states keep re-occurring, and there is no guarantee
of finding the solution.
➢ The DFS algorithm goes deep down in its search and may sometimes enter an infinite
loop.
➢ Completeness: The DFS algorithm is complete within a finite state space, as it will
expand every node within a limited search tree.
➢ Optimal: The DFS algorithm is non-optimal, as it may take a large number of
steps or incur a high cost to reach the goal node.
Depth first search algorithm:
o Step 1: Place the starting node into the Stack.
o Step 2: If the Node is empty, return failure and Stop.
o Step 3: If the Node is Goal state, return success and Stop search.
o Step 4: Else remove Last node in the Stack & Expand the successors of node then add
children of the node into the Stack.
o Step 5: Return to Step 2.
Example:
➢ In the below search tree, we have shown the flow of depth-first search, and it
will follow the order as:
➢ Root node--->Left node ----> right node.
➢ It will start searching from root node S and traverse A, then B, then D and E;
after traversing E, it will backtrack the tree, as E has no other successor and the
goal node has still not been found. After backtracking, it will traverse node C and then G,
where it will terminate, as it has found the goal node.
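The same traversal can be sketched with an explicit stack. The tree below is an assumed structure matching the S, A, B, D, E, C, G order described above, since the figure is not shown.

```python
def dfs(graph, start, goal):
    """Depth-first search using an explicit LIFO stack.
    Returns the nodes in the order they were visited, or None on failure."""
    stack = [start]
    visited = []
    while stack:                              # Step 2: empty stack -> failure
        node = stack.pop()                    # Step 4: remove the last node
        if node in visited:
            continue
        visited.append(node)
        if node == goal:                      # Step 3: goal test
            return visited
        # Push children in reverse so the leftmost child is expanded first.
        for child in reversed(graph.get(node, [])):
            if child not in visited:
                stack.append(child)
    return None

# Hypothetical tree matching the example's traversal order.
tree = {'S': ['A', 'C'], 'A': ['B', 'E'], 'B': ['D'],
        'C': ['G'], 'D': [], 'E': [], 'G': []}
print(dfs(tree, 'S', 'G'))
# ['S', 'A', 'B', 'D', 'E', 'C', 'G']
```

Swapping the FIFO queue of BFS for a LIFO stack is the only structural change; it is what produces the deep-then-backtrack behavior.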
❖ Informed (Heuristic) Search Strategies
Advantages:
➢ Best-first search can switch between BFS and DFS, thereby gaining the advantages of both
algorithms.
➢ This algorithm is more efficient than the BFS and DFS algorithms.
Disadvantages:
➢ In the worst-case scenario, it can behave as an unguided depth-first search.
➢ It can get stuck in a loop, as DFS can.
➢ Complete: Greedy best-first search is also incomplete, even if the given state space is
finite.
➢ Optimal: Greedy best first search algorithm is not optimal.
Best first search algorithm:
o Step 1: Place the starting node into the OPEN list.
o Step 2: If the OPEN list is empty, Stop and return failure.
o Step 3: Remove the node n with the lowest value of h(n) from the OPEN list and place it in the
CLOSED list.
o Step 4: Expand the node n, and generate the successors of node n.
o Step 5: Check each successor of node n, and find whether any node is a goal node or not. If any
successor node is goal node, then return success and terminate the search, else proceed to Step 6.
o Step 6: For each successor node, the algorithm computes the evaluation function f(n) and then checks
whether the node is already in the OPEN or CLOSED list. If the node is in neither list, add it to the
OPEN list.
o Step 7: Return to Step 2.
Example:
Consider the below search problem, and we will traverse it using greedy best-first search.
At each iteration, each node is expanded using the evaluation function f(n) = h(n), which is given in the table below.
In this search example, we use two lists: the OPEN and CLOSED lists.
Following are the iterations for traversing the above example.
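The OPEN/CLOSED bookkeeping above maps naturally onto a priority queue ordered by h(n). The figure and heuristic table are not reproduced here, so the graph and h values below are assumptions chosen to illustrate a typical run.

```python
import heapq

def greedy_best_first(graph, h, start, goal):
    """Greedy best-first search: always expand the OPEN node with the
    lowest heuristic value h(n). Returns the path found, or None."""
    open_list = [(h[start], start)]          # OPEN: priority queue on h(n)
    closed = set()                           # CLOSED: already-expanded nodes
    parent = {start: None}                   # for path reconstruction
    while open_list:                         # Step 2: empty OPEN -> failure
        _, node = heapq.heappop(open_list)   # Step 3: lowest h(n)
        if node == goal:                     # Step 5: goal check
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed.add(node)
        for succ in graph.get(node, []):     # Step 4: expand node
            if succ not in closed and succ not in parent:
                parent[succ] = node          # Step 6: add new node to OPEN
                heapq.heappush(open_list, (h[succ], succ))
    return None

# Hypothetical graph and heuristic table (the original figure is not shown).
graph = {'S': ['A', 'B'], 'A': [], 'B': ['E', 'F'], 'E': ['H'],
         'F': ['I', 'G'], 'H': [], 'I': [], 'G': []}
h = {'S': 13, 'A': 12, 'B': 4, 'E': 8, 'F': 2, 'H': 4, 'I': 9, 'G': 0}
print(greedy_best_first(graph, h, 'S', 'G'))
# ['S', 'B', 'F', 'G']
```

Because only h(n) is used (never the path cost so far), the search is greedy: it can find a path quickly but has no optimality guarantee.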
A constraint satisfaction problem (CSP) consists of three components:
o X: a set of variables.
o D: a set of domains, one for each variable, listing the values that variable may take.
o C: a set of constraints that specify allowable combinations of values.
Each constraint is a pair <scope, rel>: the scope is a tuple of the variables that participate in
the constraint, and rel is a relation that defines the values those variables may take in order to
satisfy the constraint.
For a constraint satisfaction problem (CSP), two things must be defined:
o a state space, and
o the notion of a solution.
A state in the state space is defined by an assignment of values to some or all of the variables.
1. Consistent or Legal Assignment: An assignment that does not violate any constraints is
called consistent or legal.
2. Complete Assignment: An assignment in which every variable has a value. A complete,
consistent assignment is a solution of the CSP.
3. Partial Assignment: An assignment that gives values to only some of the variables. Such
assignments are called incomplete assignments.
The variables use one of the two types of domains listed below:
o Discrete Domain: An infinite domain which can have one state for multiple variables.
For instance, a start state may be allocated an infinite number of times for each variable.
o Continuous Domain: A finite domain with continuous states that describes one domain for one particular
variable. It is also called a continuous domain.
Basically, there are three different categories of constraints with respect to the variables:
o Unary Constraints: The simplest kind of constraint, as they restrict the value of a single variable.
o Binary Constraints: These constraints relate two variables. For example, a variable x2 may be
constrained to lie between x1 and x3.
o Global Constraints: This kind of constraint involves an arbitrary number of variables.
Certain kinds of constraints are handled with special resolution methods:
o Linear Constraints: Frequently used in linear programming, where every variable (holding an integer
value) appears only in linear form.
o Non-linear Constraints: Used in non-linear programming, where each variable (an integer value) appears
in non-linear form.
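A standard way to solve a CSP is backtracking search over partial assignments. The sketch below assumes binary constraints given as predicates; the three-region map-colouring instance is a made-up toy example, not from the notes.

```python
def backtracking_search(variables, domains, constraints, assignment=None):
    """Minimal backtracking solver for a binary CSP.
    constraints maps (X, Y) variable pairs to a predicate on their values."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(variables):   # complete, consistent assignment
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # Consistency check against every already-assigned constrained variable.
        ok = all(pred(value if X == var else assignment[X],
                      value if Y == var else assignment[Y])
                 for (X, Y), pred in constraints.items()
                 if (X == var and Y in assignment)
                 or (Y == var and X in assignment))
        if ok:
            assignment[var] = value
            result = backtracking_search(variables, domains,
                                         constraints, assignment)
            if result is not None:
                return result
            del assignment[var]             # undo and try the next value
    return None                             # every value failed: backtrack

# Toy map-colouring CSP: adjacent regions must get different colours.
variables = ['WA', 'NT', 'SA']
domains = {v: ['red', 'green', 'blue'] for v in variables}
neq = lambda a, b: a != b
constraints = {('WA', 'NT'): neq, ('WA', 'SA'): neq, ('NT', 'SA'): neq}
print(backtracking_search(variables, domains, constraints))
```

A consistent partial assignment is extended one variable at a time; whenever a value violates a constraint, the solver backtracks, which is exactly the "legal assignment" discipline described above.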
❖ Local Search Algorithm / Method / Technique
1 Systematic search (uninformed/informed) expands nodes systematically.
2 In many optimization problems, the path to the goal is irrelevant.
3 Local search uses knowledge of the local domain only.
4 Used for pure optimization problems, i.e., solving a problem with the minimum/maximum
number of steps or cost. A pure optimization problem is one where all the nodes can give a
solution.
5 Used for problems that require only a solution rather than a path cost.
6 Focuses only on the solution; the path cost does not matter.
7 Finds a reasonable solution in large or infinite state spaces.
8 Finds the best solution of all according to a heuristic/objective function; here the
heuristic cost function is not the path cost for reaching the goal state from the current
state, but a function whose value is to be either maximized or minimized.
9 Provides an approximate solution, not an exact solution.
10 Traverses in a single direction from the current state rather than along multiple paths,
moving only to neighbors of that node.
11 Requires less memory (saves only the current state).
12 No backtracking.
13 Not systematic.
14 Incomplete.
15 Not optimal.
16 Examples: Hill Climbing Algorithms, Genetic Algorithms.
Hill Climbing Algorithm
Step 1 : Define/evaluate the initial state and assign it as the current state.
Step 2 : Apply an operator to the current state to generate a new state (a neighbor).
Step 3 : If the new state is better than the current state (neighbor value >= current value), assign the
new state as the current state.
Step 4 : If the new state is not better than the current state (neighbor value <= current value), return
the current state; otherwise go to Step 2.
Step 5 : Exit.
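The loop above can be sketched as follows. The objective function and the neighbor generator are toy assumptions (integer states, maximizing f(x) = -(x - 3)^2), not from the notes.

```python
def hill_climbing(objective, neighbors, start):
    """Simple hill climbing: move to the best neighbor while it improves
    the objective; stop at a (possibly only local) maximum."""
    current = start                              # Step 1: initial state
    while True:
        best = max(neighbors(current), key=objective, default=None)
        if best is None or objective(best) <= objective(current):
            return current                       # Step 4: no improvement
        current = best                           # Step 3: climb uphill

# Toy objective with a single peak at x = 3.
f = lambda x: -(x - 3) ** 2
step = lambda x: [x - 1, x + 1]                  # neighbors of an integer state
print(hill_climbing(f, step, 0))
# 3
```

Note that the algorithm keeps no history and never backtracks (points 11-12 above), which is why it can get stuck on a local maximum for less well-behaved objectives.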
❖Adversarial Search
o In ordinary search, a single agent aims to find a solution, often expressed as a
sequence of actions.
o But there might be situations where more than one agent is searching for a
solution in the same search space; this situation usually occurs in game playing.
o The environment with more than one agent is termed as multi-agent environment, in which
each agent is an opponent of other agent and playing against each other. Each agent needs
to consider the action of other agent and effect of that action on their performance.
o So, Searches in which two or more players with conflicting goals are trying to explore
the same search space for the solution, are called adversarial searches, often known
as Games.
o Games are modeled as a Search problem and heuristic evaluation function, and these are
the two main factors which help to model and solve games in AI.
A game can be defined as a type of search in AI which can be formalized of the following elements:
The main condition required for alpha-beta pruning is: α >= β
Key points about alpha-beta pruning:
o The Max player will only update the value of alpha.
o The Min player will only update the value of beta.
o While backtracking the tree, the node values will be passed to upper nodes instead of values of alpha
and beta.
o We will only pass the alpha, beta values to the child nodes.
o Worst ordering: In some cases, the alpha-beta pruning algorithm does not prune any of the leaves of the tree and works
exactly like the minimax algorithm. In this case it also consumes more time because of the alpha-beta bookkeeping; such an
ordering is called worst ordering. Here the best move occurs on the right side of the tree. The time complexity
for such an order is O(b^m).
o Ideal ordering: The ideal ordering for alpha-beta pruning occurs when a lot of pruning happens in the tree and the best
moves occur on the left side of the tree. We apply DFS, so it searches the left of the tree first and can go twice as deep as
the minimax algorithm in the same amount of time. The complexity in ideal ordering is O(b^(m/2)).
Rules to find good ordering:
Let's take an example of two-player search tree to understand the working of Alpha-beta pruning
Step 1: In the first step, the Max player starts the first move from node A, where α = -∞ and β = +∞. These values of alpha
and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its child D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is compared first with 2 and then with 3,
and max(2, 3) = 3 becomes the value of α at node D; the node value is also 3.
Step 3: The algorithm now backtracks to node B, where the value of β changes, as it is Min's turn. β = +∞ is compared
with the available successor node value, i.e. min(∞, 3) = 3; hence at node B now α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are
passed down as well.
Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is compared with
5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3. Since α >= β, the right successor of E is pruned, the algorithm does
not traverse it, and the value at node E is 5.
Step 5: In the next step, the algorithm again backtracks the tree, from node B to node A. At node A, the value of alpha is changed
to the maximum available value, 3, as max(-∞, 3) = 3, with β = +∞. These two values are now passed to the right successor of A,
which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is compared with the left child, which is 0, giving max(3, 0) = 3, and then with the
right child, which is 1, giving max(3, 1) = 3. α remains 3, but the node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here the value of beta changes: it is compared
with 1, so min(∞, 1) = 1. Now at C, α = 3 and β = 1, which again satisfies the condition α >= β, so the next child of C, which is G,
is pruned, and the algorithm does not compute the entire subtree of G.
Step 8: C returns the value 1 to A, where the best value for A is max(3, 1) = 3. The final game tree shows the nodes that
were computed and the nodes that were never computed. Hence the optimal value for the maximizer is
3 for this example.
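The walkthrough above can be sketched as minimax with alpha-beta cutoffs. The figure is not reproduced here, so the leaf values below are hypothetical, chosen so the root evaluates to 3 with pruning on the right-hand branches, consistent with the worked example.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, values, branching=2):
    """Minimax with alpha-beta pruning over a complete binary tree whose
    leaf values are listed left-to-right in `values`."""
    if depth == 0:
        return values[node]                  # leaf: static evaluation
    if maximizing:                           # Max player updates alpha only
        best = -math.inf
        for i in range(branching):
            child = node * branching + i
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, values))
            alpha = max(alpha, best)
            if alpha >= beta:                # cutoff: skip remaining children
                break
        return best
    else:                                    # Min player updates beta only
        best = math.inf
        for i in range(branching):
            child = node * branching + i
            best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                       True, values))
            beta = min(beta, best)
            if alpha >= beta:                # cutoff
                break
        return best

# Hypothetical depth-3 tree; the root (Max) should evaluate to 3.
leaves = [2, 3, 5, 9, 0, 1, 7, 5]
print(alphabeta(0, 3, -math.inf, math.inf, True, leaves))
# 3
```

The pruned branches are never evaluated, which is where the O(b^(m/2)) best-case complexity comes from.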
UNIT 3 - Knowledge & Reasoning
Knowledge-Based Agent in Artificial intelligence
o An intelligent agent needs knowledge about the real world for taking decisions and reasoning to act efficiently.
o Knowledge-based agents are those agents who have the capability of maintaining an internal state of knowledge, reason
over that knowledge, update their knowledge after observations and take actions. These agents can represent the
world with some formal representation and act intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.
The above diagram represents a generalized architecture for a knowledge-based agent. The knowledge-based agent
(KBA) takes input from the environment by perceiving it. The input is taken by the inference engine of the
agent, which also communicates with the KB to decide actions as per the knowledge stored in the KB. The learning element
of the KBA regularly updates the KB by learning new knowledge.
Knowledge base: The knowledge base is the central component of a knowledge-based agent; it is also known as the KB. It is a
collection of sentences (here 'sentence' is a technical term and is not identical to a sentence in English). These sentences are
expressed in a language called a knowledge representation language. The knowledge base of a KBA stores facts about
the world.
Why use a knowledge base?
Knowledge-base is required for updating knowledge for an agent to learn with experiences and take action as per the
knowledge.
Following are three operations which are performed by KBA in order to show the intelligent behavior:
1. TELL: This operation tells the knowledge base what it perceives from the environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
Inference system
Inference means deriving new sentences from old ones. The inference system allows us to add a new sentence to the knowledge base. A sentence is a proposition about the world. The inference system applies logical rules to the KB to deduce new information.
The inference engine is the component of an intelligent system in artificial intelligence which applies logical rules to the knowledge base to infer new information from known facts. The first inference engine was part of an expert system. An inference engine commonly proceeds in one of two modes:
o Forward chaining
o Backward chaining
Forward Chaining
Forward chaining is also known as forward deduction or the forward reasoning method when using an inference engine. Forward chaining is a form of reasoning which starts with the atomic sentences in the knowledge base and applies inference rules (such as Modus Ponens) in the forward direction to extract more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds their conclusions to the known facts. This process repeats until the problem is solved.
Properties of Forward-Chaining:
o It is a bottom-up approach, as it moves from the bottom (facts) to the top (goal).
o It is a process of making conclusions based on known facts or data, starting from the initial state and reaching the goal state.
o The forward-chaining approach is also called data-driven, as we reach the goal using the available data.
o The forward-chaining approach is commonly used in expert systems (such as CLIPS), business rule systems, and production rule systems.
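As a concrete illustration, forward chaining over a small set of Horn rules can be sketched in a few lines of Python; the facts and rules below ("rain", "wet", and so on) are invented for the example.

```python
# Minimal forward-chaining sketch over propositional Horn rules.
# A rule is (premises, conclusion); facts is a set of atoms.
def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:                      # keep sweeping until nothing new fires
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)   # fire the rule: add its conclusion
                changed = True
    return facts

rules = [({"rain", "outside"}, "wet"),
         ({"wet"}, "cold")]
print(forward_chain({"rain", "outside"}, rules))
```

Starting from the facts rain and outside, the first rule derives wet, which then lets the second rule derive cold.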
Backward Chaining:
Backward chaining is also known as backward deduction or the backward reasoning method when using an inference engine. A backward-chaining algorithm is a form of reasoning which starts with the goal and works backward, chaining through rules to find known facts that support the goal.
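The same rule format can be queried goal-first; the sketch below is a minimal backward chainer (the rules are the same invented examples, and real systems add unification and more careful cycle handling).

```python
# Minimal backward-chaining sketch over propositional Horn rules.
def backward_chain(goal, facts, rules, seen=None):
    seen = seen or set()
    if goal in facts:
        return True
    if goal in seen:                    # avoid infinite regress on cyclic rules
        return False
    seen = seen | {goal}
    # the goal is provable if some rule concludes it and all its premises are provable
    return any(conclusion == goal and
               all(backward_chain(p, facts, rules, seen) for p in premises)
               for premises, conclusion in rules)

rules = [({"rain", "outside"}, "wet"),
         ({"wet"}, "cold")]
print(backward_chain("cold", {"rain", "outside"}, rules))
```

The chainer works backward from "cold" to "wet" and then to the known facts "rain" and "outside".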
There are mainly four ways of knowledge representation which are given as follows:
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
1 Logical Representation
Logical representation is a language with some concrete rules which deals with propositions and has no ambiguity in
representation. Logical representation means drawing a conclusion based on various conditions. This representation lays down
some important communication rules. It consists of precisely defined syntax and semantics which supports the sound inference.
Each sentence can be translated into logics using syntax and semantics.
Syntax:
o Syntax consists of the rules which decide how we can construct legal sentences in the logic.
o It determines which symbols we can use in knowledge representation.
o It also determines how to write those symbols.
Semantics:
o Semantics consists of the rules by which we can interpret sentences in the logic.
o Semantics also involves assigning a meaning to each sentence.
a. Propositional Logics
b. Predicate logics
Semantic networks are an alternative to predicate logic for knowledge representation. In semantic networks, we represent knowledge in the form of graphical networks. Such a network consists of nodes representing objects and arcs describing the relationships between those objects. Semantic networks can categorize objects in different forms and can also link those objects. Semantic networks are easy to understand and can be easily extended.
Example: Following are some statements which we need to represent in the form of nodes and arcs.
Statements:
a. Jerry is a cat.
b. Jerry is a mammal.
c. Jerry is owned by Priya.
d. Jerry is brown colored.
e. All mammals are animals.
In the above diagram, we have represented different types of knowledge in the form of nodes and arcs. Each object is connected with another object by some relation.
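The five statements above can be stored directly as (node, arc, node) triples; the sketch below is one minimal way to do this in Python (the relation labels are our own names for the arcs).

```python
# The five statements above as a tiny semantic network of triples.
triples = [
    ("Jerry", "is-a", "cat"),
    ("Jerry", "is-a", "mammal"),
    ("Jerry", "owned-by", "Priya"),
    ("Jerry", "has-color", "brown"),
    ("mammal", "is-a", "animal"),
]

def related(node, relation):
    """Follow outgoing arcs labelled `relation` from `node`."""
    return {o for s, r, o in triples if s == node and r == relation}

print(related("Jerry", "is-a"))
print(related("mammal", "is-a"))
```

Following the is-a arcs from Jerry yields {cat, mammal}; following them again from mammal yields {animal}, which is how inheritance is read off the network.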
A frame is a record-like structure which consists of a collection of attributes and their values to describe an entity in the world. Frames are an AI data structure which divides knowledge into substructures by representing stereotyped situations. A frame consists of a collection of slots and slot values. These slots may be of any type and size. Slots have names and values, and the various aspects of a slot are called facets.
Facets: The various aspects of a slot are known as facets. Facets are features of frames which enable us to put constraints on frames. Example: an IF-NEEDED facet is invoked when the data of a particular slot is needed. A frame may consist of any number of slots, a slot may include any number of facets, and a facet may have any number of values. A frame is also known as slot-filler knowledge representation in artificial intelligence.
Frames are derived from semantic networks and later evolved into our modern-day classes and objects. A single frame is not of much use on its own; a frame system consists of a collection of connected frames. In a frame, knowledge about an object or event can be stored together in the knowledge base. Frames are a technology widely used in various applications, including natural language processing and machine vision.
Example 1: A frame describing a book.
Slots     Fillers
Year      1996
Page      1152
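A frame with slots and facets maps naturally onto a nested dictionary. The sketch below extends the book frame above with a hypothetical IF-NEEDED facet (the "Age" slot and the reference year 2023 are our own assumptions, not part of the original example).

```python
# A frame sketched as a nested dict: slot name -> facets.
book_frame = {
    "Year": {"VALUE": 1996},
    "Page": {"VALUE": 1152},
    # an IF-NEEDED facet: a procedure run only when the slot value is requested
    "Age":  {"IF-NEEDED": lambda frame: 2023 - frame["Year"]["VALUE"]},
}

def get_slot(frame, slot):
    facets = frame[slot]
    if "VALUE" in facets:
        return facets["VALUE"]
    return facets["IF-NEEDED"](frame)   # compute the value on demand

print(get_slot(book_frame, "Year"))
print(get_slot(book_frame, "Age"))
```

Asking for "Age" triggers the IF-NEEDED procedure, which derives 27 from the stored Year slot.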
A production rule system consists of (condition, action) pairs, which mean: "If condition, then action". It has mainly three parts: the set of production rules, the working memory, and the recognize-act cycle.
In a production rule system, the agent checks for the condition, and if the condition holds, the production rule fires and the corresponding action is carried out. The condition part of a rule determines which rule may be applied to a problem, and the action part carries out the associated problem-solving steps. This complete process is called a recognize-act cycle.
The working memory contains the description of the current state of problem solving, and rules can write knowledge to the working memory. This knowledge may then match and fire other rules.
If a new situation (state) is generated, multiple production rules may be triggered together; this set of rules is called the conflict set. In this situation, the agent needs to select one rule from the set, and this selection is called conflict resolution.
Example:
o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down).
o IF (on bus AND unpaid) THEN action (pay charges).
o IF (bus arrives at destination) THEN action (get down from the bus).
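The bus rules above can be sketched as (condition, action) pairs driven by a simple recognize-act cycle; here conflict resolution is just rule order, which is one common but simplistic strategy.

```python
# The four bus rules above as (condition, action) pairs over a working memory
# represented as a set of facts.
rules = [
    (lambda m: "at bus stop" in m and "bus arrives" in m, "get into the bus"),
    (lambda m: "on the bus" in m and "paid" in m and "empty seat" in m, "sit down"),
    (lambda m: "on the bus" in m and "unpaid" in m, "pay charges"),
    (lambda m: "bus arrives at destination" in m, "get down from the bus"),
]

def recognize_act(working_memory):
    # conflict resolution: fire the first matching rule (rule order)
    for condition, action in rules:
        if condition(working_memory):
            return action
    return None

print(recognize_act({"on the bus", "unpaid"}))
```

With the working memory {on the bus, unpaid}, the third rule fires and the agent pays the charges.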
Propositional logic (PL) is the simplest form of logic, where all statements are made of propositions. A proposition is a declarative statement which is either true or false. It is a technique for representing knowledge in logical and mathematical form.
Example:
a) It is Sunday.
b) The Sun rises from the West. (false proposition)
c) 3 + 3 = 7 (false proposition)
d) 5 is a prime number.
The syntax of propositional logic defines the allowable sentences for the knowledge representation. There are two types of
Propositions:
a. Atomic Propositions
b. Compound propositions
o Atomic Proposition: Atomic propositions are simple propositions. An atomic proposition consists of a single proposition symbol. These are the sentences which must be either true or false.
Example: "2 + 2 is 4" is an atomic proposition, as it is a true fact.
o Compound proposition: Compound propositions are constructed by combining simpler or atomic propositions, using parentheses and logical connectives.
Example: "It is raining today, and the street is wet" is a compound proposition.
Negation: A sentence such as ¬P is called the negation of P. A literal can be either a positive literal or a negative literal.
Conjunction: A sentence which has the ∧ connective, such as P ∧ Q, is called a conjunction. Example: "Rohan is intelligent and hardworking" can be written as:
P = Rohan is intelligent,
Q = Rohan is hardworking. → P ∧ Q.
Disjunction: A sentence which has the ∨ connective, such as P ∨ Q, is called a disjunction, where P and Q are propositions.
Implication: A sentence such as P → Q is called an implication. Implications are also known as if-then rules. Example: if it is raining, then the street is wet. Let P = It is raining and Q = Street is wet; then it is represented as P → Q.
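The connectives can be checked mechanically with a small truth-table script (a sketch; it uses the fact that implication P → Q is logically equivalent to ¬P ∨ Q):

```python
# Truth table for the propositional connectives above.
import itertools

def implies(p, q):
    """Material implication: P -> Q is (not P) or Q."""
    return (not p) or q

print("P     Q     | not P | P and Q | P or Q | P -> Q")
for P, Q in itertools.product([True, False], repeat=2):
    print(P, Q, "|", not P, "|", P and Q, "|", P or Q, "|", implies(P, Q))
```

The table makes the one surprising row visible: P → Q is false only when P is true and Q is false.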
In propositional logic, we have seen how to represent statements using propositions. But unfortunately, in propositional logic we can only represent facts which are either true or false. PL is not sufficient to represent complex sentences or natural language statements; propositional logic has very limited expressive power.
To represent such statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:
o First-order logic is another way of knowledge representation in artificial intelligence. It is an extension to propositional
logic.
o FOL is sufficiently expressive to represent the natural language statements in a concise way.
o First-order logic is also known as predicate logic or first-order predicate logic. First-order logic is a powerful language that expresses information about the objects in the world in a more natural way and can also express the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains facts like propositional logic but
also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
o Relations: It can be a unary relation such as: red, round, is adjacent, or an n-ary relation such as: the sister of, brother of, has color, comes between
o Function: Father of, best friend, third inning of, end of, ......
o Like natural language, first-order logic also has two main parts:
a. Syntax
b. Semantics
The syntax of FOL determines which collections of symbols form logical expressions in first-order logic. The basic syntactic elements of first-order logic are symbols. We write statements in short-hand notation in FOL.
Variables      x, y, z, a, b, ...
Connectives    ∧, ∨, ¬, ⇒, ⇔
Equality       =
Quantifiers    ∀, ∃
Quantifiers in First-order logic:
o A quantifier is a language element which generates quantification, and quantification specifies the quantity of specimens in the universe of discourse.
o These are the symbols that permit us to determine or identify the range and scope of a variable in a logical expression.
The universal quantifier is a symbol of logical representation which specifies that the statement within its range is true for everything or every instance of a particular thing. It is represented by the symbol ∀. If x is a variable, then ∀x is read as:
o For all x
o For each x
o For every x.
Example: All men drink coffee.
Let x be a variable which refers to a man; then the statement can be represented in the UOD as ∀x man(x) → drink(x, coffee).
It will be read as: for all x, where x is a man, x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers which express that the statement within their scope is true for at least one instance of something.
It is denoted by the logical operator ∃, which resembles an inverted E. When it is used with a predicate variable, it is called an existential quantifier.
If x is a variable, then the existential quantifier will be ∃x or ∃(x).
Example: Some boys are intelligent: ∃x boy(x) ∧ intelligent(x).
It will be read as: there exists an x, where x is a boy and x is intelligent.
Properties of Quantifiers:
o The main connective used with the universal quantifier ∀ is implication →.
o The main connective used with the existential quantifier ∃ is conjunction ∧.
o In universal quantifier, ∀x∀y is similar to ∀y∀x.
o In Existential quantifier, ∃x∃y is similar to ∃y∃x.
o ∃x∀y is not similar to ∀y∃x.
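The quantifier properties above can be verified over a small finite domain, where ∀ becomes Python's all() and ∃ becomes any(); the domain {0, 1, 2} and the relation "x equals y" are illustrative choices of ours.

```python
# Checking the quantifier properties over a small finite domain.
xs = ys = [0, 1, 2]
forall, exists = all, any

eq = lambda x, y: x == y          # the relation R(x, y): "x equals y"

# forall-x forall-y is similar to forall-y forall-x
assert forall(forall(eq(x, y) for y in ys) for x in xs) == \
       forall(forall(eq(x, y) for x in xs) for y in ys)
# exists-x exists-y is similar to exists-y exists-x
assert exists(exists(eq(x, y) for y in ys) for x in xs) == \
       exists(exists(eq(x, y) for x in xs) for y in ys)

# but exists-x forall-y is NOT similar to forall-y exists-x:
# for every y there exists an equal x (True), yet no single x equals all y (False).
assert forall(exists(eq(x, y) for x in xs) for y in ys) is True
assert exists(forall(eq(x, y) for y in ys) for x in xs) is False
print("quantifier properties verified on the finite domain")
```

The last two assertions are the interesting ones: swapping ∀ and ∃ changes which statement holds.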
Natural deduction is a family of derivation systems with derivation rules designed to mimic
the way people reason deductively
Uncertainty:
So far we have learned knowledge representation using first-order logic and propositional logic with certainty, which means we were sure about the predicates.
With this kind of knowledge representation we might write A → B, which means if A is true then B is true. But consider a situation where we are not sure whether A is true or not; then we cannot express this statement. This situation is called uncertainty.
So, to represent uncertain knowledge, where we are not sure about the predicates, we need uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
Following are some leading causes of uncertainty in the real world.
We use probability in probabilistic reasoning because it provides a way to handle the uncertainty that results from someone's laziness or ignorance.
In the real world, there are many scenarios where the certainty of something is not confirmed, such as "It will rain today", "the behavior of someone in some situation", or "a match between two teams or two players". These are probable sentences for which we can assume that something will happen but are not sure about it, so here we use probabilistic reasoning.
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning, which determines the probability of an event
with uncertain knowledge.
In probability theory, it relates the conditional probability and marginal probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian inference is an application of Bayes'
theorem, which is fundamental to Bayesian statistics.
Bayes' theorem allows updating the probability prediction of an event by observing new information of the real world.
Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B:

P(A|B) = P(B|A) * P(A) / P(B)    ...(a)

The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI systems for probabilistic inference.
Following are some applications of Bayes' theorem:
o It is used to calculate the next step of the robot when the already executed step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
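A quick numeric sketch of Bayes' rule, with made-up numbers: suppose P(rain) = 0.3, P(cloudy | rain) = 0.9, and P(cloudy) = 0.5; then P(rain | cloudy) = 0.9 × 0.3 / 0.5 = 0.54.

```python
# Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B).
def bayes(p_b_given_a, p_a, p_b):
    return p_b_given_a * p_a / p_b

# Illustrative numbers: updating the belief in rain after observing clouds.
p_rain_given_cloudy = bayes(0.9, 0.3, 0.5)
print(p_rain_given_cloudy)
```

Observing the clouds raises the probability of rain from the prior 0.3 to the posterior 0.54, which is exactly the "updating a prediction with new information" described above.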
Bayesian networks are probabilistic because these networks are built from a probability distribution and also use probability theory for prediction and anomaly detection.
Real-world applications are probabilistic in nature, and to represent the relationships between multiple events we need a Bayesian network. It can be used in various tasks, including prediction, anomaly detection, diagnostics, automated insight, reasoning, time-series prediction, and decision making under uncertainty.
A Bayesian network can be used for building models from data and expert opinions, and it consists of two parts: a directed acyclic graph of nodes and arcs, and a table of conditional probabilities.
The generalized form of a Bayesian network that represents and solves decision problems under uncertain knowledge is known as an influence diagram.
A Bayesian network graph is made up of nodes and arcs (directed links), where:
o Each node corresponds to a random variable, and a variable can be continuous or discrete.
o Arcs (directed arrows) represent causal relationships or conditional probabilities between random variables. These directed links or arrows connect pairs of nodes in the graph.
These links represent that one node directly influences the other node; if there is no directed link, the nodes are independent of each other.
o In the above diagram, A, B, C, and D are random variables represented by the nodes of the network graph.
o If we are considering node B, which is connected with node A by a directed arrow, then node A is called
the parent of Node B.
o Node C is independent of node A.
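For a two-node network A → B like the one the diagram describes, the joint distribution factorizes as P(A, B) = P(A) · P(B | A); the conditional-probability-table numbers below are invented for illustration.

```python
# A two-node Bayesian network A -> B with assumed CPT values.
p_a = {True: 0.2, False: 0.8}                       # prior P(A)
p_b_given_a = {True:  {True: 0.9, False: 0.1},      # P(B | A=True)
               False: {True: 0.3, False: 0.7}}      # P(B | A=False)

def joint(a, b):
    """The chain rule for this graph: P(A=a, B=b) = P(a) * P(b | a)."""
    return p_a[a] * p_b_given_a[a][b]

# Marginalise out A: P(B=True) = sum over a of P(a) * P(B=True | a)
p_b_true = sum(joint(a, True) for a in (True, False))
print(p_b_true)   # 0.2 * 0.9 + 0.8 * 0.3
```

This tiny example shows the two ingredients of every Bayesian network query: factorizing the joint along the arcs, then summing out the variables you do not observe.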
Dempster-Shafer theory
Dempster-Shafer Theory (DST) is a theory of evidence that has its roots in the work of Dempster and Shafer.
While traditional probability theory is limited to assigning probabilities to mutually exclusive single events, DST extends this
to sets of events in a finite discrete space.
This generalization allows DST to handle evidence associated with multiple possible events, enabling it to represent uncertainty
in a more meaningful way.
DST also provides a more flexible and precise approach to handling uncertain information without relying on additional
assumptions about events within an evidential set.
Where sufficient evidence is present to assign probabilities to single events, the Dempster-Shafer model collapses to the traditional probabilistic formulation.
Therefore, Dempster-Shafer theory is a powerful tool for building AI systems that can handle complex uncertain scenarios.
The utilization of Dempster-Shafer theory in artificial intelligence empowers decision-making processes in the face of uncertainty and enhances the robustness of AI systems.
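Dempster's rule of combination, the core operation of DST, can be sketched with sets of hypotheses; the frame of discernment {rain, sun} and the mass values below are invented for illustration.

```python
# Dempster's rule of combination over a frame of discernment {rain, sun}.
from itertools import product

def combine(m1, m2):
    """Combine two mass functions (dicts: frozenset -> mass)."""
    out, conflict = {}, 0.0
    for (b, x), (c, y) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            out[inter] = out.get(inter, 0.0) + x * y
        else:
            conflict += x * y              # mass falling on the empty set
    # renormalise by 1 - K, where K is the total conflict
    return {a: v / (1 - conflict) for a, v in out.items()}

rain, sun = frozenset({"rain"}), frozenset({"sun"})
either = rain | sun                         # mass on the whole set = ignorance
m1 = {rain: 0.6, either: 0.4}               # evidence source 1
m2 = {sun: 0.5, either: 0.5}                # evidence source 2
m = combine(m1, m2)
print(m)
```

Note how mass can sit on the set {rain, sun} (ignorance) rather than being forced onto single events; this is exactly the generalization over classical probability described above.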
Fuzzy sets & fuzzy logics
Fuzzy logic techniques are efficient in solving complex, ill-defined problems that are characterized by uncertainty of
environment and fuzziness of information.
Fuzzy logic allows handling uncertain and imprecise knowledge and provides a powerful framework for reasoning.
Fuzzy reasoning models are relevant to a wide variety of subject areas such as engineering, economics, psychology, sociology,
finance, and education.
In the literature, various fuzzy reasoning methods have been proposed to process uncertain information and increase the efficiency of the designed systems.
These fuzzy reasoning methods are mainly based on compositional rule, analogy and similarity, interpolation, and the concept
of distance.
The output of a Fuzzy Logic system is a fuzzy set, which is a set of membership degrees for each possible output value.
Fuzzy Logic is a mathematical method for representing vagueness and uncertainty in decision-making.
This system can work with any type of inputs whether it is imprecise, distorted or noisy input information.
Fuzzy logic comes with mathematical concepts of set theory and the reasoning of that is quite simple.
It provides a very efficient solution to complex problems in all fields of life as it resembles human reasoning and decision-
making.
Fuzzy Logic is used in a wide range of applications, such as control systems, image processing, natural language processing, medical
diagnosis, and artificial intelligence.
The fundamental concept of Fuzzy Logic is the membership function, which defines the degree of membership of an input value
to a certain set or category.
The membership function is a mapping from an input value to a membership degree between 0 and 1, where 0 represents non -
membership and 1 represents full membership.
Fuzzy Logic is implemented using Fuzzy Rules, which are if-then statements that express the relationship between input variables
and output variables in a fuzzy way.
It is based on the concept of membership function and the implementation is done using Fuzzy rules.
In a Boolean system, the truth value 1.0 represents absolute truth and 0.0 represents absolute falsehood.
In a fuzzy system, however, truth is not restricted to absolute true and absolute false: fuzzy logic also allows intermediate values, which are partially true and partially false.
Membership function
A membership function is a graph that defines how each point in the input space is mapped to a membership value between 0 and 1. The input space is often referred to as the universe of discourse or universal set (U), which contains all the possible elements of concern in each particular application.
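A common concrete choice is the triangular membership function; the sketch below maps temperatures to a hypothetical "warm" set (the breakpoints 15, 25, and 35 °C are our own choices for illustration).

```python
# A triangular membership function with breakpoints a <= b <= c:
# 0 outside [a, c], rising linearly to 1 at the peak b, then falling back.
def triangular(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Membership of a few temperatures in an assumed "warm" set peaking at 25
for t in (15, 20, 25, 30, 35):
    print(t, triangular(t, 15, 25, 35))
```

A temperature of 25 is fully "warm" (membership 1), while 20 and 30 are only partially warm (membership 0.5), which is exactly the partial truth described above.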
ARCHITECTURE
Its Architecture contains four parts :
• RULE BASE: It contains the set of rules and the IF-THEN conditions provided by the experts to govern the
decision-making system, on the basis of linguistic information. Recent developments in fuzzy theory offer several
effective methods for the design and tuning of fuzzy controllers. Most of these developments reduce the number
of fuzzy rules.
• FUZZIFICATION: It is used to convert inputs i.e. crisp numbers into fuzzy sets. Crisp inputs are basically the
exact inputs measured by sensors and passed into the control system for processing, such as temperature,
pressure, rpm’s, etc.
• INFERENCE ENGINE: It determines the matching degree of the current fuzzy input with respect to each rule
and decides which rules are to be fired according to the input field. Next, the fired rules are combined to form
the control actions.
• DEFUZZIFICATION: It is used to convert the fuzzy sets obtained by the inference engine into a crisp value.
There are several defuzzification methods available and the best-suited one is used with a specific expert system
to reduce the error.
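The four stages above can be strung together in a toy one-input controller; every set boundary, rule, and the weighted-average defuzzifier below are simplifying assumptions of ours, not a standard design.

```python
# A toy fuzzy controller: temperature in, fan speed out.
def tri(x, a, b, c):
    """Triangular membership function used for fuzzification."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fan_speed(temp):
    # FUZZIFICATION: crisp temperature -> degrees of "cold" and "hot"
    cold = tri(temp, 0, 10, 25)
    hot  = tri(temp, 15, 30, 40)
    # RULE BASE + INFERENCE: IF cold THEN slow (20), IF hot THEN fast (80)
    # DEFUZZIFICATION: weighted average of the rule outputs (a simple
    # centroid-style method)
    num = cold * 20 + hot * 80
    den = cold + hot
    return num / den if den else 0.0

print(fan_speed(20))
print(fan_speed(35))
```

At 20 °C the input is equally "cold" and "hot", so the defuzzified speed lands midway at 50; at 35 °C only the "hot" rule fires and the output is 80.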
Planning: Overview, Components of a planning system, Goal stack planning, Hierarchical planning and other planning
techniques.
Artificial intelligence is an important technology of the future. Whether it is intelligent robots, self-driving cars, or smart cities, they will all use different aspects of artificial intelligence, and planning is essential to any such AI project.
Planning is an important part of artificial intelligence which deals with the tasks and domains of a particular problem. Planning is considered the logical side of acting.
Everything we humans do is with a definite goal in mind, and all our actions are oriented towards achieving our goals. Similarly, planning is done for artificial intelligence.
For example, planning is required to reach a particular destination. It is necessary to find the best route, but the tasks to be done at a particular time, and why they are done, are also very important.
That is why planning is considered the logical side of acting. In other words, planning is about deciding the tasks to be performed by the artificial intelligence system and how the system functions under domain-independent conditions.
What is a Plan?
We require a domain description, a task specification, and a goal description for any planning system. A plan is considered a sequence of actions, and each action has preconditions that must be satisfied before it can be performed and effects that can be positive or negative.
So, we have Forward State Space Planning (FSSP) and Backward State Space Planning (BSSP) at the basic level.
1. Forward State Space Planning (FSSP)
FSSP behaves in the same way as forward state-space search. It says that, given an initial state S in any domain, we perform some applicable actions and obtain a new state S' (which also contains some new terms), called a progression. This continues until we reach the target state. The actions should be applicable in this case.
2. Backward State Space Planning (BSSP)
BSSP behaves similarly to backward state-space search. In this, we move from the target state g to a sub-goal g', tracing back the action needed to achieve that goal. This process is called regression (going back to the previous goal or sub-goal). These sub-goals should also be checked for consistency. The actions should be relevant in this case.
So, for an efficient planning system, we need to combine the features of FSSP and BSSP, which gives rise to goal stack planning.
What is planning in AI?
Planning in artificial intelligence is about decision-making actions performed by robots or computer programs to achieve a
specific goal.
Execution of the plan is about choosing a sequence of actions with a high probability of accomplishing a specific goal.
The start position and target position are shown in the following diagram.
Components of the planning system
o Choose the best rule to apply next, based on the best available heuristic information.
o Apply the chosen rule to compute the new problem state.
o Detect when a solution has been found.
o Detect dead ends so that they can be discarded and the system's effort directed in more useful directions.
o Detect when an almost-correct solution has been found.
1. Start by pushing the original goal onto the stack. Repeat the following until the stack is empty:
2. If the stack top is a compound goal, push its unsatisfied sub-goals onto the stack.
3. If the stack top is a single unsatisfied goal, replace it with an action that achieves it, and push the action's preconditions onto the stack so they can be satisfied.
4. If the stack top is an action, pop it off the stack, execute it, and update the knowledge base with the action's effects.
This planning uses a goal stack, and its search space includes all possible sub-goal orderings. It handles goal interactions by the interleaving method.
Non-linear planning may produce an optimal solution with respect to plan length (depending on the search strategy used).
However, it has a larger search space, since all possible goal orderings have to be considered.
Algorithm
1. Choose a goal 'g' from the goal set.
2. If 'g' does not match the state, then
o Choose an operator 'o' whose add-list matches goal 'g'
o Push 'o' onto the OpStack
o Add the preconditions of 'o' to the goal set
3. While all preconditions of the operator on top of the OpStack are met in the state:
o Pop operator 'o' from the top of the OpStack
o state = apply(o, state)
o plan = [plan; o]
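The algorithm above can be sketched compactly in Python; the block-world operators below and the simplified precondition handling are illustrative assumptions, not a full STRIPS implementation.

```python
# A compact goal-stack planning sketch. Operators carry preconditions,
# an add-list, and a delete-list; states and goals are sets of literals.
def goal_stack_plan(state, goals, operators):
    state, plan, stack = set(state), [], list(goals)
    while stack:
        goal = stack.pop()
        if goal in state:
            continue
        # choose an operator whose add-list achieves the goal
        op = next(o for o in operators if goal in o["add"])
        missing = [p for p in op["pre"] if p not in state]
        if missing:
            stack.append(goal)          # re-check the goal after the precondition
            stack.append(missing[0])    # satisfy one missing precondition first
        else:
            state = (state - op["del"]) | op["add"]   # apply the effects
            plan.append(op["name"])
    return plan

ops = [{"name": "pickup(A)", "pre": {"ontable(A)", "handempty"},
        "add": {"holding(A)"}, "del": {"ontable(A)", "handempty"}},
       {"name": "stack(A,B)", "pre": {"holding(A)"},
        "add": {"on(A,B)", "handempty"}, "del": {"holding(A)"}}]
print(goal_stack_plan({"ontable(A)", "handempty"}, ["on(A,B)"], ops))
```

Starting from a block A on the table, the planner regresses from the goal on(A,B) to the precondition holding(A), achieves it with pickup(A), and then applies stack(A,B).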
UNIT 5 Natural Language processing
Natural Language Processing (NLP) refers to the AI method of communicating with an intelligent system using a natural language such as English.
Processing of natural language is required when you want an intelligent system like a robot to perform as per your instructions, when you want to hear a decision from a dialogue-based clinical expert system, and so on.
The field of NLP involves making computers perform useful tasks with the natural languages humans use. The input and output of an NLP system can be −
• Speech
• Written Text
Components of NLP: There are two components of NLP, as given below.
Natural Language Understanding (NLU)
It is the process of mapping the given input in natural language into useful internal representations and analyzing different aspects of the language.
Natural Language Generation (NLG)
It is the process of producing meaningful phrases and sentences in the form of natural language from some internal representation.
It involves −
• Text planning − It includes retrieving the relevant content from knowledge base.
• Sentence planning − It includes choosing required words, forming meaningful phrases, setting tone of the
sentence.
• Text Realization − It is mapping sentence plan into sentence structure.
• Lexical Analysis − It involves identifying and analyzing the structure of words. The lexicon of a language means the collection of words and phrases in that language. Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words.
• Syntactic Analysis (Parsing) − It involves analysis of words in the sentence for grammar and arranging words in a manner
that shows the relationship among the words. The sentence such as “The school goes to boy” is rejected by English
syntactic analyzer.
• Semantic Analysis − It draws the exact meaning or the dictionary meaning from the text. The text is checked for
meaningfulness. It is done by mapping syntactic structures and objects in the task domain. The semantic analyzer
disregards sentence such as “hot ice-cream”.
• Discourse Integration − The meaning of any sentence depends upon the meaning of the sentence just before it. In addition, it also influences the meaning of the immediately succeeding sentence.
• Pragmatic Analysis − During this, what was said is re-interpreted as what it actually meant. It involves deriving those aspects of language which require real-world knowledge.
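The first two stages can be sketched with a toy lexicon and a single grammar pattern; the word categories and the pattern below are our own drastic simplifications (a real parser uses a full grammar, and the later semantic stage handles meaningfulness).

```python
# Toy sketch of lexical and syntactic analysis.
import re

def lexical_analysis(text):
    """Divide the text chunk into lowercase words."""
    return re.findall(r"[A-Za-z]+", text.lower())

# assumed word categories for one example sentence
LEXICON = {"the": "Det", "boy": "Noun", "goes": "Verb",
           "to": "Prep", "school": "Noun"}

def syntactic_ok(tokens):
    """Accept only the pattern Det Noun Verb Prep Det Noun (a toy grammar)."""
    tags = [LEXICON.get(t) for t in tokens]
    return tags == ["Det", "Noun", "Verb", "Prep", "Det", "Noun"]

print(syntactic_ok(lexical_analysis("The boy goes to the school")))
print(syntactic_ok(lexical_analysis("The school goes to boy")))
```

The first sentence matches the pattern and is accepted; the second fails the toy grammar check (there is no determiner before "boy"), loosely mirroring the rejection by a syntactic analyzer described above.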
LEARNING
Learning element is the portion of a learning AI system that decides how to modify the
performance element and implements those modifications.
We all learn new knowledge through different methods, depending on the type of material to be
learned, the amount of relevant knowledge we already possess, and the environment in which the
learning takes place.
There are basically two methods for knowledge extraction: first from domain experts, and second with machine learning.
For very large amounts of data, domain experts are not very useful or reliable, so we move towards the machine learning approach for this work.
One way to use machine learning is to replicate the expert's logic in the form of algorithms, but this work is tedious, time-consuming, and expensive.
So we move towards inductive algorithms, which generate the strategy for performing a task and need not be instructed separately at each step.
• The need arose due to the pitfalls present in previous algorithms; one of the major pitfalls was the lack of generalization of rules.
• ID3 and AQ used the decision-tree production method, which was too specific, difficult to analyze, and very slow to perform on basic short classification problems.
• Decision-tree-based algorithms were unable to work on a new problem if some attributes were missing.
• ILA uses the method of producing a general set of rules instead of decision trees, which overcomes the above problems.
1. List the examples in the form of a table ‘T’ where each row corresponds to an example and each column contains an
attribute value.
2. Create a set of m training examples, each example composed of k attributes and a class attribute with n possible
decisions.
3. Create a rule set, R, having the initial value false.
4. Initially, all rows in the table are unmarked.
Decision Tree Learning
Decision tree learning is a method of machine learning that creates a model of decisions based on data.
Decision tree learning is a supervised machine learning technique used to create models that can predict outcomes from data.
It works by breaking down the data into small chunks and creating a decision tree out of them, which can be used to make
predictions.
Decision tree algorithms are now being used in many areas such as medical diagnosis, credit scoring and natural language processing.
The beauty of decision tree learning lies in its simplicity; it provides an easy way to quickly arrive at complex conclusions without
requiring too much computing power or expertise.
Its ability to easily handle large amounts of data also makes it an attractive choice for big data analytics projects.
Decision tree-based learning is a popular machine learning technique that operates on a tree-like structure to effectively classify
objects.
The decision tree algorithm works by splitting the data set into subsets, using information gain or Gini impurity as the primary
criterion for successful splits.
There are two types of nodes in this process: root node and leaf node.
A root node represents the entire dataset, while a leaf node indicates a classification of an individual object within the dataset.
The goal of decision tree learning is to build models from training datasets that can predict classifications or values accurately
when presented with unseen test sets.
To do this, statistical learning methods are used to identify relevant features within decision trees so they can be classified correctly depending on certain criteria such as their probability distribution.
This allows for predictions regarding what type of classification should be assigned to new instances given these underlying
characteristics
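Gini impurity, one of the split criteria mentioned above, is easy to compute by hand; the tiny dataset below (ages and yes/no labels) is invented for illustration.

```python
# Gini impurity and the weighted impurity of a candidate split.
def gini(labels):
    """1 - sum of squared class proportions; 0 means a pure node."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_gini(rows, labels, feature, threshold):
    """Weighted Gini impurity after splitting on feature < threshold."""
    left  = [l for r, l in zip(rows, labels) if r[feature] <  threshold]
    right = [l for r, l in zip(rows, labels) if r[feature] >= threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

rows   = [{"age": 22}, {"age": 25}, {"age": 47}, {"age": 52}]
labels = ["no", "no", "yes", "yes"]
print(gini(labels))                          # impurity of the root node
print(split_gini(rows, labels, "age", 40))   # impurity after splitting at 40
```

The root node is maximally mixed (impurity 0.5), and the split at age 40 separates the classes perfectly (impurity 0), so a tree learner would choose it.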
Decision tree learning is an AI-based algorithm for data mining and machine learning.
It uses a decision tree classifier to create a model that can be used to classify, group, or predict future outcomes based on existing
data.
A decision tree consists of decisions made at various points in the learning process, which are then combined with other decisions
until reaching the final result.
Decision trees are often used as part of supervised learning algorithms like classification and regression trees (CART). As well as being used for classification tasks, they can also be used for clustering, finding patterns in datasets, and forecasting trends.
The main benefit of decision-tree-based learning is that it is relatively easy to interpret what happens during training, because of its visual nature compared to many other types of learning algorithms; this makes it easier to debug any problems that come up while creating a new model.
Decision tree learning is a method of Artificial Intelligence (AI) used to make decisions and predictions.
It uses a decision graph or set of logical rules that can be applied to data when making decisions about how best to act in certain
situations. Decision trees are first trained on a training set, which consists of labelled data points from a larger dataset. The
regression tree created by the process then determines the optimal decisions based on the attributes present in the given data
set.
The induction of decision trees involves algorithms for generating decision graphs as well as attribute selection measures that
determine the most effective way to split nodes within each branch of the tree.
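As a minimal sketch of the attribute selection step just described, the snippet below computes information gain (an entropy-based attribute selection measure) for each attribute of a small, made-up weather dataset and picks the best attribute to split on. The dataset, attribute names, and labels are illustrative assumptions, not taken from the text.

```python
import math
from collections import Counter

# Toy training set: (outlook, windy, label); the label says whether to play.
data = [
    ("sunny",    False, "no"),
    ("sunny",    True,  "no"),
    ("overcast", False, "yes"),
    ("rainy",    False, "yes"),
    ("rainy",    True,  "no"),
    ("overcast", True,  "yes"),
]

def entropy(labels):
    # Shannon entropy of a list of class labels
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attr_index):
    labels = [row[-1] for row in rows]
    base = entropy(labels)
    # Partition rows by the attribute's value, then weight each subset's entropy
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attr_index], []).append(row[-1])
    remainder = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return base - remainder

gains = {name: information_gain(data, i)
         for i, name in enumerate(["outlook", "windy"])}
best = max(gains, key=gains.get)
print(best)  # the attribute chosen for the root split
```

Splitting on "outlook" separates the classes far better than splitting on "windy", so an ID3-style learner would place "outlook" at the root of the tree.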
Explanation-based learning in artificial intelligence is a branch of machine learning that focuses on creating algorithms that learn from
previously solved problems. It is a problem-solving method that is especially helpful when dealing with complicated, multi-faceted issues
that necessitate a thorough grasp of the underlying processes.
Since its beginning, machine learning has come a long way. While early machine-learning algorithms depended on statistical analysis to spot
patterns and forecast outcomes, contemporary machine-learning models are intended to learn from subject experts' explanations.
Explanation-based learning in artificial intelligence has proven to be a potent tool in its development that can handle complicated issues
more efficiently.
Explanation-based learning (EBL) in artificial intelligence is a problem-solving method in which an agent learns by analyzing specific situations
and connecting them to previously acquired knowledge; the agent then applies what it has learned to solve similar problems. Rather than
relying solely on statistical analysis, EBL algorithms incorporate logical reasoning and domain knowledge to make predictions and identify
patterns.
The genetic algorithm is a search heuristic inspired by Darwin's theory of natural evolution. It reflects the process of natural
selection, in which the fittest individuals survive.
A genetic algorithm starts with an initial population. From the initial population, this algorithm produces a new population using selection,
crossover, and mutation steps:
The algorithm takes the initial population as input and chooses a fitness function. The fitness function helps the algorithm to generate an
optimal or near-optimal solution. The algorithm continues and evolves the population through selection, crossover, and mutation operations.
It generates several populations until it satisfies the optimization constraints.
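The selection, crossover, and mutation steps above can be sketched in Python on the toy "OneMax" problem, where fitness is simply the number of 1-bits in a chromosome. The population size, mutation rate, and generation count are illustrative choices, not prescribed values.

```python
import random

random.seed(0)

CHROM_LEN = 12       # chromosome length
POP_SIZE = 20
GENERATIONS = 60
MUTATION_RATE = 0.05

def fitness(chrom):
    # Fitness function: count of 1-bits (the "OneMax" toy problem)
    return sum(chrom)

def select(pop):
    # Tournament selection: the fitter of two random individuals wins
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    # Single-point crossover combines two parents into one child
    point = random.randint(1, CHROM_LEN - 1)
    return p1[:point] + p2[point:]

def mutate(chrom):
    # Flip each bit independently with a small probability
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in chrom]

# Initial population of random bit strings
population = [[random.randint(0, 1) for _ in range(CHROM_LEN)]
              for _ in range(POP_SIZE)]

# Evolve: each new population is built by selection, crossover, mutation
for _ in range(GENERATIONS):
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
print(fitness(best))
```

After a few dozen generations the population converges towards the all-ones chromosome, the optimum of this fitness function.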
On the other hand, a neural network consists of a series of algorithms that endeavor to recognize and identify patterns. It works similarly
to how a human brain's network of neurons works. In algorithmic terms, a neural network is a network of neurons, where each neuron is a
mathematical function used to collect as well as to classify data from a given model.
The inputs from the users form the input neuron layer in a neural network. The activation function layer determines the output. Depending
upon the problem, it can have more than one activation function layer. The summation layer sums up the output generated by the activation
function layer and then displays it in the output layer section.
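The layers just described can be sketched as a minimal forward pass: inputs feed an activation layer of hidden neurons, whose outputs are summed to produce the result. The weights and the choice of sigmoid activation are hypothetical, picked purely for illustration.

```python
import math

def sigmoid(x):
    # A common activation function squashing any input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    # Activation layer: each hidden neuron takes a weighted sum of the
    # inputs and passes it through the activation function
    hidden = [sigmoid(sum(w * x for w, x in zip(weights, inputs)))
              for weights in hidden_weights]
    # Summation layer: weighted sum of the hidden activations -> output
    return sum(w * h for w, h in zip(output_weights, hidden))

# Hypothetical hand-picked weights for a 2-input, 2-hidden-neuron network
hidden_weights = [[0.5, -0.6], [0.9, 0.2]]
output_weights = [1.0, -1.0]
print(forward([1.0, 0.0], hidden_weights, output_weights))
```

A real network would additionally learn these weights from data (e.g. via backpropagation); this sketch only shows how a fixed network maps inputs to an output.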
Expert Systems: Representing and using domain knowledge, Expert system shells and knowledge acquisition.
Expert system = knowledge + problem-solving methods: a knowledge base that captures
the domain-specific knowledge, and an inference engine that consists of algorithms for
manipulating the knowledge represented in the knowledge base to solve a problem presented to
the system.
Expert systems (ES) are one of the prominent research domains of AI. They were introduced by
researchers at Stanford University's Computer Science Department.
Knowledge Base
It contains domain-specific and high-quality knowledge. Knowledge is required to exhibit
intelligence. The success of any ES majorly depends upon the collection of highly accurate and
precise knowledge.
What is Knowledge?
Data is a collection of facts. Information is data organized as facts about the task
domain. Data, information, and past experience combined together are termed knowledge.
Components of Knowledge Base
The knowledge base of an ES is a store of both, factual and heuristic knowledge.
Knowledge representation
It is the method used to organize and formalize the knowledge in the knowledge base. It is in the
form of IF-THEN-ELSE rules.
Knowledge Acquisition
The success of any expert system majorly depends on the quality, completeness, and accuracy of
the information stored in the knowledge base.
The knowledge base is formed by readings from various experts, scholars, and the Knowledge
Engineers. The knowledge engineer is a person with the qualities of empathy, quick learning,
and case analyzing skills.
The knowledge engineer acquires information from the subject expert by recording, interviewing, and
observing the expert at work.
The engineer then categorizes and organizes the information in a meaningful way, in the form of
IF-THEN-ELSE rules, to be used by the inference engine. The knowledge engineer also
monitors the development of the ES.
Inference Engine
Use of efficient procedures and rules by the Inference Engine is essential in deducing a correct,
flawless solution.
In the case of a knowledge-based ES, the Inference Engine acquires and manipulates the knowledge
from the knowledge base to arrive at a particular solution. In doing so, it:
Applies rules repeatedly to the facts, which are obtained from earlier rule applications.
Adds new knowledge to the knowledge base if required.
Resolves rule conflicts when multiple rules are applicable to a particular case.
Forward Chaining
It is a strategy of an expert system to answer the question, “What can happen next?”
Here, the Inference Engine follows the chain of conditions and derivations and finally deduces the outcome. It considers all
the facts and rules, and sorts them before reaching a conclusion.
This strategy is used for working towards a conclusion, result, or effect. For example, predicting share-market status as an
effect of changes in interest rates.
Backward Chaining
With this strategy, an expert system finds out the answer to the question, “Why this happened?”
On the basis of what has already happened, the Inference Engine tries to find out which conditions could have happened in the
past for this result. This strategy is followed for finding out cause or reason. For example, diagnosis of blood cancer in humans.
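The forward-chaining strategy described above can be sketched as a small loop over IF-THEN rules: starting from known facts, the engine keeps firing whichever rules match until no new facts can be derived. The rules below, linking interest-rate changes to share prices in the spirit of the earlier example, are hypothetical.

```python
# Hypothetical IF-THEN rules: if every fact in `conditions` is known,
# the rule fires and asserts `conclusion` as a new fact.
rules = [
    ({"interest_rates_rise"}, "borrowing_costs_rise"),
    ({"borrowing_costs_rise"}, "share_prices_fall"),
]

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:                        # repeat until no rule adds a new fact
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)     # derived fact joins working memory
                changed = True
    return facts

derived = forward_chain({"interest_rates_rise"}, rules)
print(sorted(derived))
```

Backward chaining would run the same rules in the opposite direction: starting from a goal such as "share_prices_fall" and checking which conditions could have produced it.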
User Interface
The user interface provides interaction between the user of the ES and the ES itself. It generally uses Natural Language Processing,
so that it can be used by a user who is well-versed in the task domain but not necessarily an expert in
Artificial Intelligence.
It also explains how the ES has arrived at a particular recommendation, which makes it easy to trace the credibility of the deductions.
Knowledge acquisition is the process of extracting knowledge from data.
This can be done manually, through a process of observation and experimentation, or automatically, using a variety of
techniques such as machine learning.
In artificial intelligence, knowledge acquisition is the process of gathering, selecting, and interpreting information and
experiences to create and maintain knowledge within a specific domain.
There are many different methods of knowledge acquisition, including rule-based systems, decision trees, artificial neural
networks, and fuzzy logic systems.
The most appropriate method for a given application depends on the nature of the problem and the type of data available.
There are a few methods of knowledge acquisition in AI:
1. Expert systems: In this method, experts in a particular field provide rules and knowledge to a computer system, which
can then be used to make decisions or solve problems in that domain.
2. Learning from examples: This is a common method used in machine learning, where a system is presented with a set of
training data, and it “learns” from these examples to generalize to new data.
3. Natural language processing: This is a method of extracting knowledge from text data, using techniques like text
mining and information extraction.
4. Semantic web: The semantic web is a way of representing knowledge on the internet using standards like RDF and
OWL, which can be processed by computers.
5. Knowledge representation and reasoning: This is a method of representing knowledge in a formal way, using logic or
other formalisms, which can then be used for automated reasoning.
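Method 2 above (learning from examples) can be illustrated with a tiny nearest-neighbour sketch: the "acquired knowledge" is nothing more than the stored labelled examples plus a distance-based decision rule for generalizing to new data. The feature values and labels are made up for illustration.

```python
# Each training example is (features, label); together with the distance
# rule below, these stored examples are the system's acquired knowledge.
train = [
    ((1.0, 1.0), "spam"),
    ((0.9, 1.2), "spam"),
    ((0.1, 0.0), "ham"),
    ((0.2, 0.3), "ham"),
]

def predict(x):
    # 1-nearest-neighbour: label a new point like its closest example
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    features, label = min(train, key=lambda ex: dist(ex[0], x))
    return label

print(predict((0.15, 0.1)))   # a point close to the "ham" examples
```

More realistic systems replace the stored examples with a compact model (rules, a decision tree, or network weights), but the principle of generalizing from labelled training data is the same.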
Role of knowledge acquisition in AI
In AI, knowledge acquisition is the process of acquiring knowledge from data sources and then using that knowledge to
improve the performance of AI systems. This process can be used to improve the accuracy of predictions made by AI
systems, or to help them learn new tasks faster.
One of the most important aspects of knowledge acquisition is choosing the right data sources. This is because the
quality of the data that AI systems use to learn is crucial to the performance of the system. For example, if an AI
system is trying to learn how to identify objects in images, it will need to be trained on a dataset of high-quality images.
Once the data has been collected, it needs to be processed and converted into a format that can be used by AI systems.
This process is known as feature engineering, and it is crucial to the success of AI systems. After the data has been
processed, it can be used to train AI models.
There are many different types of AI models, and each has its own strengths and weaknesses. The type of model that
is used will depend on the task that the AI system is trying to learn. For example, if the AI system is trying to learn how
to identify objects in images, a convolutional neural network (CNN) might be used.
Once the AI system has been trained, it can be deployed into a real-world environment. This is where knowledge
acquisition really comes into play. The AI system will need to be able to adapt to the new environment and learn from the
data that it encounters. This process is known as transfer learning, and it is essential for AI systems that need to
operate in the real world.