Introduction to Artificial Intelligence Concepts
UNIT I
INTRODUCTION
This approach focuses on building artificial intelligence systems that can act like humans.
The goal is to create systems that can perform tasks such as recognizing speech, recognizing
images, and controlling robots in a human-like manner.
This approach is mainly used in computer vision and robotics, where the goal is to create
systems that can perceive and interact with the physical world in a human-like manner.
A computer can be said to be intelligent if it can mimic human responses under specific conditions. The Turing Test was introduced by Alan Turing in his 1950 paper, "Computing Machinery and Intelligence," which considered the question, "Can machines think?"
The Turing test is based on a party game, the "Imitation Game," with some modifications. This game involves three players: one player is a computer, another player is a human responder, and the third player is a human interrogator, who is isolated from the other two players and whose job is to find out which of the two is the machine.
The test result does not depend on each answer being correct, but only on how closely the responses resemble human answers. The computer is permitted to do everything possible to force a wrong identification by the interrogator.
Interrogator: Are you a computer?
Player A (Computer): No
Player B (Human): No
In this game, if the interrogator is not able to identify which is the machine and which is the human, then the computer passes the test successfully, and the machine is said to be intelligent and able to think like a human.
Features required for a machine to pass the Turing test:
Natural language processing: NLP is required to communicate with the interrogator in a general human language such as English.
Knowledge representation: To store and retrieve information during the test.
Automated reasoning: To use previously stored information to answer the questions.
Machine learning: To adapt to new changes and detect generalized patterns.
Vision: To recognize the interrogator's actions and other objects during the test.
Examples:
a. Siri, Alexa, and Google Assistant: Virtual assistants that can understand and respond to
natural language input from users.
b. Chatbots: AI systems that can have conversations with humans using natural language
processing techniques.
c. Emotion recognition systems: AI systems that can detect emotions in human speech and
facial expressions.
Examples:
a. Autonomous agents: AI systems that can make decisions and take actions to achieve their
goals in an efficient and effective manner. Autonomous intelligence is artificial intelligence
(AI) that can act without human intervention, input, or direct supervision. It's considered the
most advanced type of artificial intelligence. Examples may include smart manufacturing
robots, self-driving cars, or care robots for the elderly.
b. Game AI: AI systems that can play games such as chess, Go, or poker and make decisions
based on the rules and objectives of the game.
1.3 IMPORTANCE AND PURPOSE OF ARTIFICIAL INTELLIGENCE
Before learning about Artificial Intelligence, we should know why AI is important and why we should learn it. Following are some main reasons to learn about AI:
With the help of AI, you can create software or devices which can solve real-world problems easily and accurately, in areas such as health, marketing, and traffic.
With the help of AI, you can create your personal virtual Assistant, such as Cortana,
Google Assistant, Siri, etc.
With the help of AI, you can build robots which can work in environments where human survival is at risk.
AI opens a path for other new technologies, new devices, and new Opportunities.
Building machines which can perform tasks that require human intelligence, such as:
Proving a theorem
Playing chess
Planning a surgical operation
Driving a car in traffic
Creating systems which can exhibit intelligent behavior, learn new things by themselves, demonstrate, explain, and advise their users.
To create AI, we should first know how intelligence is composed. Intelligence is an intangible part of our brain which is a combination of reasoning, learning, problem-solving, perception, language understanding, etc. To achieve these factors in a machine or software, Artificial Intelligence draws on disciplines such as mathematics, psychology, biology, computer science, and statistics.
1.3.5 Applications of AI
Artificial Intelligence has various applications in today's society. It is becoming essential in today's world because it can solve complex problems efficiently across multiple industries, such as healthcare, entertainment, finance, and education. AI is making our daily life more comfortable and fast. The following are some sectors which have applications of Artificial Intelligence:
AI in Astronomy
Artificial Intelligence can be very useful for solving complex problems about the universe. AI technology can be helpful for understanding the universe, such as how it works and its origin.
AI in Healthcare
In the last five to ten years, AI has become more advantageous for the healthcare industry and is going to have a significant impact on it.
Healthcare industries are applying AI to make better and faster diagnoses than humans. AI can help doctors with diagnoses and can warn staff when patients are worsening so that medical help can reach the patient before hospitalization.
AI in Gaming
AI can be used for gaming purposes. AI machines can play strategic games like chess, where the machine needs to think about a large number of possible positions.
AI in Finance
AI and the finance industry are the best match for each other. The finance industry is implementing automation, chatbots, adaptive intelligence, algorithmic trading, and machine learning into financial processes.
AI in Data Security
The security of data is crucial for every company, and cyber-attacks are growing very rapidly in the digital world. AI can be used to make your data safer and more secure. Some examples, such as the AEG bot and the AI2 platform, are used to detect software bugs and cyber-attacks more effectively.
AI in Social Media
Social media sites such as Facebook, Twitter, and Snapchat contain billions of user profiles, which need to be stored and managed in a very efficient way. AI can organize and manage massive amounts of data. AI can analyze lots of data to identify the latest trends, hashtags, and the requirements of different users.
AI in Travel &Transport
AI is becoming highly in demand in the travel industry. AI is capable of doing various travel-related tasks, from making travel arrangements to suggesting hotels, flights, and the best routes to customers. Travel industries are using AI-powered chatbots which can interact with customers in a human-like way for better and faster responses.
AI in Automotive Industry
Some automotive companies are using AI to provide virtual assistants to their users for better performance. For example, Tesla has introduced TeslaBot, an intelligent virtual assistant.
Various companies are currently working on developing self-driving cars which can make your journey safer and more secure.
AI in Robotics
Artificial Intelligence has a remarkable role in robotics. Usually, general robots are programmed to perform repetitive tasks, but with the help of AI, we can create intelligent robots which can perform tasks from their own experience without being pre-programmed.
Humanoid robots are the best examples of AI in robotics; recently, the intelligent humanoid robots named Erica and Sophia have been developed, which can talk and behave like humans.
AI in Entertainment
We are currently using some AI-based applications in our daily life through entertainment services such as Netflix or Amazon. With the help of ML/AI algorithms, these services show recommendations for programs or shows.
AI in Agriculture
Agriculture is an area which requires various resources, labor, money, and time for the best results. Nowadays agriculture is becoming digital, and AI is emerging in this field. Agriculture is applying AI through agricultural robotics, soil and crop monitoring, and predictive analysis. AI in agriculture can be very helpful for farmers.
AI in E-commerce
AI is providing a competitive edge to the e-commerce industry, and it is becoming increasingly in demand in the e-commerce business. AI is helping shoppers to discover associated products with a recommended size, color, or even brand.
AI in education
AI can automate grading so that tutors have more time to teach. An AI chatbot can communicate with students as a teaching assistant.
In the future, AI could work as a personal virtual tutor for students, easily accessible at any time and any place.
1.4 HISTORY AND FUTURE OF ARTIFICIAL INTELLIGENCE
Artificial Intelligence is not a new word and not a new technology for researchers. The idea is much older than you might imagine: there are even myths of mechanical men in ancient Greek and Egyptian mythology. Following are some milestones in the history of AI which define the journey from the earliest days of AI to its development today.
Year 1943: The first work which is now recognized as AI was done by Warren McCulloch and Walter Pitts in 1943. They proposed a model of artificial neurons.
Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection strength between neurons. His rule is now called Hebbian learning.
Year 1950: Alan Turing, an English mathematician, pioneered machine learning in 1950. Turing published "Computing Machinery and Intelligence," in which he proposed a test that can check a machine's ability to exhibit intelligent behavior equivalent to human intelligence, called the Turing test.
Year 1955: Allen Newell and Herbert A. Simon created the "first artificial intelligence program," which was named the "Logic Theorist." This program proved 38 of 52 mathematics theorems and found new and more elegant proofs for some of them.
Year 1956: The term "Artificial Intelligence" was first adopted by American computer scientist John McCarthy at the Dartmouth Conference.
Year 1966: Researchers emphasized developing algorithms which could solve mathematical problems. Joseph Weizenbaum created the first chatbot in 1966, which was named ELIZA.
Year 1972: The first intelligent humanoid robot was built in Japan; it was named WABOT-1.
The period between 1974 and 1980 was the first AI winter. An AI winter refers to a period in which computer scientists dealt with a severe shortage of government funding for AI research.
During AI winters, public interest in artificial intelligence decreased.
1.4.5 A boom of AI (1980-1987)
Year 1980: After the AI winter, AI came back with "expert systems." Expert systems were programs designed to emulate the decision-making ability of a human expert.
In the year 1980, the first national conference of the American Association for Artificial Intelligence was held at Stanford University.
The period between 1987 and 1993 was the second AI winter. Again, investors and governments stopped funding AI research due to high costs and inefficient results. Even expert systems such as XCON, initially cost-effective, proved very expensive to maintain.
Year 1997: In the year 1997, IBM's Deep Blue beat the world chess champion, Garry Kasparov, and became the first computer to beat a world chess champion.
Year 2002: For the first time, AI entered the home, in the form of Roomba, a vacuum cleaner.
Year 2006: AI came into the business world by the year 2006. Companies like Facebook, Twitter, and Netflix started using AI.
1.4.8 Deep learning, big data and artificial general intelligence (2011-present)
Year 2011: In the year 2011, IBM's Watson won Jeopardy!, a quiz show in which it had to solve complex questions as well as riddles. Watson proved that it could understand natural language and solve tricky questions quickly.
Year 2012: Google launched an Android app feature, "Google Now," which was able to provide information to the user as a prediction.
Year 2014: In the year 2014, the chatbot "Eugene Goostman" won a competition based on the famous "Turing test."
Year 2018: IBM's "Project Debater" debated complex topics with two master debaters and performed extremely well.
Google demonstrated an AI program, "Duplex," a virtual assistant which booked a hairdresser appointment over the phone, and the lady on the other side didn't notice that she was talking with a machine.
Now AI has developed to a remarkable level. The concepts of deep learning, big data, and data science are now booming. Nowadays companies like Google, Facebook, IBM, and Amazon are working with AI and creating amazing devices. The future of Artificial Intelligence is inspiring and will come with high intelligence.
From cleaning sewage to fighting fires and defusing bombs, it is we humans who get down, get our hands dirty, and risk our lives. The number of human lives we lose in these processes is also very high. In the near future, we can expect machines or robots to take care of such work. As artificial intelligence evolves and smarter robots roll out, we can see them replacing humans at some of the riskiest jobs in the world. That is the only time we expect automation to take away jobs.
Personal Assistants:
Virtual assistants are already here, and some of us have used them. However, as the technology grows, we can expect them to act as personal assistants and emote like humans. With artificial intelligence, deep learning, and neural networks, it is highly possible that we can make robots emote and make them assistants. They could be used for tons of different purposes, such as in the hospitality industry, day-care centers, elder care, clerical jobs, and more.
1.5 AGENT
An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors. An agent runs in a cycle of perceiving, thinking, and acting on those inputs and displaying output on the screen.
Hence the world around us is full of agents, such as thermostats, cellphones, and cameras, and even we ourselves are agents.
Sensor: A sensor is a device which detects changes in the environment and sends the information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the components of machines that convert energy into motion. The actuators are responsible for moving and controlling a system. An actuator can be an electric motor, gears, rails, etc.
Effectors: Effectors are the devices which affect the environment. Effectors can be legs, wheels, arms, fingers, wings, fins, and display screens.
An AI system can be defined as the study of the rational agent and its environment.
The agents sense the environment through sensors and act on their environment through
actuators. An AI agent can have mental properties such as knowledge, belief, intention, etc.
An agent can be:
Human-Agent: A human agent has eyes, ears, and other organs which work as sensors, and hands, legs, and the vocal tract, which work as actuators.
Robotic Agent: A robotic agent can have cameras, infrared range finder, NLP for sensors
and various motors for actuators.
Types of AI Agents
Agents can be grouped into five classes based on their degree of perceived intelligence and capability. All these agents can improve their performance and generate better actions over time. These are given below:
o Simple reflex agent
o Model-based reflex agent
o Goal-based agent
o Utility-based agent
o Learning agent
o The simple reflex agent does not consider any part of the percept history during its decision and action process.
o The simple reflex agent works on the condition-action rule, which means it maps the current state to an action. An example is a room-cleaner agent: it works only if there is dirt in the room.
o Problems with the simple reflex agent design approach include very limited intelligence and no knowledge of the non-perceptual parts of the current state.
The task of AI is to design an agent program which implements the agent function.
The structure of an intelligent agent is a combination of architecture and agent program. It can be viewed as:
Agent = Architecture + Agent program
The following are the main three terms involved in the structure of an AI agent:
Architecture: Architecture is the machinery that the AI agent executes on.
Agent function: The agent function is a map from the percept sequence to an action:
F : P* → A
Agent program: The agent program is an implementation of the agent function.
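To make the condition-action idea concrete, here is a minimal Python sketch of a simple reflex agent program for the room-cleaner example; the percept encoding and the rule set are illustrative assumptions, not a standard API.

# A minimal sketch of a simple reflex agent (room-cleaner world).
# The percept encoding and rules below are illustrative assumptions.
def simple_reflex_agent(percept):
    """Agent program: maps the current percept to an action using
    condition-action rules, ignoring all percept history."""
    location, status = percept          # e.g. ("A", "Dirty")
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    else:
        return "Left"

# The agent function in action: each percept maps to one action.
for p in [("A", "Dirty"), ("A", "Clean"), ("B", "Dirty"), ("B", "Clean")]:
    print(p, "->", simple_reflex_agent(p))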
PEAS stands for:
P: Performance measure
E: Environment
A: Actuators
S: Sensors
Here the performance measure is the objective for the success of an agent's behavior.
PEAS for self-driving cars:
Let's suppose a self-driving car; then the PEAS representation will be:
Performance: Safety, time, legal drive, comfort
Environment: Roads, other vehicles, road signs, pedestrian
Actuators: Steering, accelerator, brake, signal, horn
Sensors: Camera, GPS, speedometer, odometer, accelerometer, sonar.
Some more example agents with their PEAS representations:

1. Medical Diagnosis
Performance measure: Patient health, disease diagnoses
Environment: Patient, hospital, staff
Actuators: Tests, treatments
Sensors: Keyboard (entry of symptoms)

2. Vacuum Cleaner
Performance measure: Cleanness, efficiency, battery life
Environment: Room, table, wood floor, carpet, various obstacles
Actuators: Wheels, brushes, vacuum extractor
Sensors: Camera, dirt detection sensor, cliff sensor, bump sensor, infrared wall sensor

3. Part-picking Robot
Performance measure: Percentage of parts in correct bins
Environment: Conveyor belt with parts, bins
Actuators: Jointed arms, hand
Sensors: Camera, joint angle sensors
Other types of problems include identifying patterns, predicting outcomes, and determining
solutions to systems of equations. Each type of problem has its own set of techniques and tools
that can be used to solve it.
1) Understanding the problem: This step involves understanding the specifics of the problem and figuring out what needs to be done to solve it.
2) Generating possible solutions: This step involves coming up with as many possible solutions as you can, based on information about the problem and what you know about how computers work.
3) Choosing a solution: This step involves deciding which solution is best based on what you know about the problem and your options for solving it.
Problem-solving agents are a type of artificial intelligence that helps automate problem-solving.
They can be used to solve problems in natural language, algebra, calculus, statistics, and machine
learning.
There are three types of problem-solving agents: propositional, predicate, and automata.
Propositional problem-solving agents can understand simple statements like “draw a line between
A and B” or “find the maximum value of x.” Predicate problem-solving agents can understand
more complex statements like “find the shortest path between two points” or “find all pairs of
snakes in a jar." Automata are the simplest form of problem-solving agent and can only understand sequences of symbols like "draw a square."
The problem-solving agent follows this four-phase problem-solving process:
1. Goal Formulation: This is the first and most basic phase in problem solving. It arranges specific steps to establish a target/goal that demands some activity to reach it. AI agents are now used to formulate goals.
2. Problem Formulation: In this phase, the agent decides what actions and states to consider, given the goal formulated in the first phase.
3. Search: After the Goal and Problem Formulation, the agent simulates sequences of actions
and has to look for a sequence of actions that reaches the goal. This process is
called search, and the sequence is called a solution. The agent might have to simulate
multiple sequences that do not reach the goal, but eventually, it will find a solution, or it
will find that no solution is possible. A search algorithm takes a problem as input and
outputs a sequence of actions.
4. Execution: After the search phase, the agent can now execute the actions that are
recommended by the search algorithm, one at a time. This final stage is known as the
execution phase.
A problem can be defined formally by five components:
1. Initial State
2. Actions
3. Transition Model
4. Goal Test
5. Path Cost
Initial State
It is the agent’s starting state or initial step towards its goal. For example, if a taxi agent needs to
travel to a location(B), but the taxi is already at location(A), the problem’s initial state would be
the location (A).
Actions
It is a description of the possible actions that the agent can take. Given a state s, Actions(s) returns the actions that can be executed in s. Each of these actions is said to be applicable in s.
Transition Model
It describes what each action does. It is specified by a function Result(s, a) that returns the state that results from doing action a in state s.
The initial state, actions, and transition model together define the state space (A state space is a
set of all possible states that it can reach from the current state. The nodes of a state space
represent states, and the arcs connecting them represent actions. A path is a set of states and the
actions that link them in the state space.) of a problem, a set of all states reachable from the initial
state by any sequence of actions. The state space forms a graph in which the nodes are states, and
the links between the nodes are actions.
Goal Test
It determines if the given state is a goal state. Sometimes there is an explicit list of potential goal
states, and the test merely verifies whether the provided state is one of them. The goal is
sometimes expressed via an abstract attribute rather than an explicitly enumerated set of
conditions.
Path Cost
It assigns a numerical cost to each path that leads to the goal. The problem-solving agent chooses a cost function that matches its performance measure. Remember that the optimal solution has the lowest path cost of all the solutions.
Example Problems
The problem-solving approach has been used in a wide range of work contexts. There are two kinds of problem approaches:
1. Toy Problems: A toy problem is intended to illustrate or exercise various problem-solving methods; it has a concise, exact description.
2. Real-world Problems: These are real-world problems that need solutions. They do not rely on such concise descriptions, unlike a toy problem, yet we can have a basic description of the issue.
8 Puzzle Problem
In a sliding-tile puzzle, a number of tiles (sometimes called blocks or pieces) are arranged in a grid with one or more blank spaces so that some of the tiles can slide into the blank space. One variant is the Rush Hour puzzle, in which cars and trucks slide around a 6 x 6 grid in an attempt to free a car from the traffic jam. Perhaps the best-known variant is the 8-puzzle (see the figure below), which consists of a 3 x 3 grid with eight numbered tiles and one blank space. The object is to reach a specified goal state, such as the one shown on the right of the figure. The standard formulation of the 8-puzzle is as follows: a state specifies the location of each of the eight tiles and the blank; the actions are movements of the blank space (Left, Right, Up, or Down); the transition model returns the arrangement that results from a move; the goal test checks whether the state matches the goal configuration; and each step costs 1, so the path cost is the number of steps in the path.
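A hedged Python sketch of the 8-puzzle's states and transition model; representing a state as a tuple of nine entries (0 for the blank) is an assumption made for this illustration.

# 8-puzzle sketch: a state is a tuple of nine entries, 0 marks the blank.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def neighbors(state):
    """Return the states reachable by sliding one tile into the blank."""
    i = state.index(0)                  # position of the blank
    row, col = divmod(i, 3)
    swaps = []
    if row > 0: swaps.append(i - 3)     # blank moves Up
    if row < 2: swaps.append(i + 3)     # blank moves Down
    if col > 0: swaps.append(i - 1)     # blank moves Left
    if col < 2: swaps.append(i + 1)     # blank moves Right
    result = []
    for j in swaps:
        s = list(state)
        s[i], s[j] = s[j], s[i]         # swap the blank with the sliding tile
        result.append(tuple(s))
    return result

print(len(neighbors(GOAL)))            # 2 moves are possible from this corner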
Let us take a vacuum cleaner agent: it can move left or right, and its job is to suck up the dirt from the floor.
A grid world problem is a two-dimensional rectangular array of square cells through which agents
can move. Typically, the agent can go to any nearby cell that is clear of obstacles, either horizontally
or vertically, and in rare cases diagonally. A wall or other impassible obstruction in a cell prohibits an
agent from moving inside that cell.
The vacuum world's problem can be stated as follows: the states are defined by the agent's location and the dirt locations; any state can be the initial state; the actions are Left, Right, and Suck; the goal test checks whether every square is clean; and each step costs 1.
Searching is a process to find the solution for a given set of problems. In artificial intelligence this can be done by using either uninformed or informed searching strategies.
Uninformed searches, also known as blind searches, are search algorithms that explore a problem space without using any specific knowledge or heuristics about the problem domain. They operate by brute force, meaning they try out every part of the search space (the space of all feasible solutions, that is, the set of solutions among which the desired solution resides, is called the search space)
blindly. A brute-force algorithm is a simple, comprehensive search strategy that systematically explores every option until the problem's answer is discovered.
Uninformed searches rely solely on the given problem definition and operate systematically to find
a solution. Examples of uninformed search algorithms include breadth-first search (BFS), depth-first search (DFS), uniform-cost search (UCS), depth-limited search, and iterative deepening depth-first search. Although all these examples work in a brute-force way, they differ in the way they traverse the nodes.
So far we have talked about uninformed search algorithms, which looked through the search space for all possible solutions of the problem without having any additional knowledge about the search space. In contrast, an informed search algorithm uses knowledge such as how far we are from the goal, the path cost, how to reach the goal node, etc. This knowledge helps agents explore less of the search space and find the goal node more efficiently.
What is Heuristics?
A heuristic is a technique that is used to solve a problem faster than classic methods. These techniques are used to find an approximate solution to a problem when classical methods cannot. Heuristics are said to be problem-solving techniques that result in practical and quick solutions.
The psychologists Daniel Kahneman and Amos Tversky developed the study of heuristics in human decision-making in the 1970s and 1980s. However, the concept was first introduced by the Nobel laureate Herbert A. Simon, whose primary object of research was problem-solving.
Heuristics are used in situations in which there is a requirement for a short-term solution. When facing complex situations with limited resources and time, heuristics can help companies make quick decisions through shortcuts and approximate calculations. Most heuristic methods involve mental shortcuts based on past experiences.
The heuristic method might not always provide the finest solution, but it reliably helps us find a good solution in a reasonable time.
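As a concrete example, the Manhattan distance is a commonly used heuristic for sliding-tile puzzles such as the 8-puzzle. A quick hedged sketch, using the same tuple state encoding assumed in the 8-puzzle sketch above (0 for the blank):

def manhattan(state, goal=(1, 2, 3, 4, 5, 6, 7, 8, 0)):
    """Sum, over all tiles, of the horizontal plus vertical distance
    between a tile's current position and its goal position."""
    dist = 0
    for tile in range(1, 9):                    # the blank (0) is ignored
        i, j = state.index(tile), goal.index(tile)
        dist += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return dist

print(manhattan((1, 2, 3, 4, 5, 6, 0, 7, 8)))   # 2: tiles 7 and 8 are each one step away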
The best-first search uses the concept of a priority queue and heuristic search. It is a search algorithm that works on a specific rule. The aim is to reach the goal from the initial state via the shortest path. The best-first search algorithm in artificial intelligence is used for finding the shortest path from a given starting node to a goal node in a graph. The algorithm works by
expanding the nodes of the graph in order of increasing estimated distance from the starting node until the goal node is reached.
Best-first search uses the concept of a priority queue and heuristic search. To search the graph space, the method uses two lists to track the traversal: an OPEN list that keeps track of the current 'immediate' nodes available for traversal, and a CLOSED list that keeps track of the nodes already traversed.
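A hedged Python sketch of this scheme: a priority queue ordered by the heuristic value serves as the OPEN list, and a set serves as the CLOSED list. The toy graph and heuristic values are assumptions for illustration.

import heapq

def best_first_search(graph, h, start, goal):
    """Greedy best-first search: always expand the OPEN node
    with the smallest heuristic value h(n)."""
    open_list = [(h[start], start, [start])]    # priority queue: the OPEN list
    closed = set()                              # already-expanded nodes: the CLOSED list
    while open_list:
        _, node, path = heapq.heappop(open_list)
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for nbr in graph.get(node, []):
            if nbr not in closed:
                heapq.heappush(open_list, (h[nbr], nbr, path + [nbr]))
    return None

# Toy graph and heuristic values (illustrative assumptions):
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
h = {"A": 3, "B": 2, "C": 1, "D": 4, "E": 0}
print(best_first_search(graph, h, "A", "E"))    # ['A', 'C', 'E']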
The A* algorithm is a searching algorithm that searches for the shortest path between the initial and the final state. It is used in various applications, such as maps.
In maps, the A* algorithm is used to calculate the shortest distance between the source (initial state) and the destination (final state).
How it works
Imagine a square grid which possesses many obstacles, scattered randomly. The initial and the final cell is
provided. The aim is to reach the final cell in the shortest amount of time.
Explanation
The core of the A* algorithm is based on cost functions and heuristics. It uses two main parameters:
g: the cost of moving from the initial cell to the current cell. Basically, it is the sum of the step costs of all the cells that have been visited since leaving the first cell.
h: also known as the heuristic value, it is the estimated cost of moving from the current cell to the final cell. The actual cost cannot be calculated until the final cell is reached; hence, h is an estimated cost. We must make sure that there is never an overestimation of the cost.
The way the algorithm makes its decisions is by taking the f-value into account, where f = g + h. The algorithm selects the cell with the smallest f-value and moves to that cell. This process continues until the algorithm reaches its goal cell.
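The following hedged Python sketch shows the same idea on a small weighted graph rather than a grid: g accumulates the path cost, h is a given heuristic, and the priority queue is ordered by f = g + h. The graph and heuristic values are illustrative assumptions.

import heapq

def a_star(graph, h, start, goal):
    """A* search: always expand the frontier node with the smallest f = g + h."""
    frontier = [(h[start], 0, start, [start])]      # entries are (f, g, node, path)
    best_g = {start: 0}                             # cheapest g found so far per node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nbr, cost in graph.get(node, []):
            new_g = g + cost                        # cost from start to nbr via node
            if new_g < best_g.get(nbr, float("inf")):
                best_g[nbr] = new_g
                heapq.heappush(frontier,
                               (new_g + h[nbr], new_g, nbr, path + [nbr]))
    return None, float("inf")

# Toy weighted graph; h never overestimates the true remaining cost.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 5)], "C": [("D", 1)]}
h = {"A": 4, "B": 3, "C": 1, "D": 0}
print(a_star(graph, h, "A", "D"))                   # (['A', 'B', 'C', 'D'], 4)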
Breadth-first Search
Breadth-first search (BFS) is one of the most important uninformed search strategies in artificial intelligence for exploring a search space systematically. BFS explores all the neighboring nodes of the initial state before moving on to explore their neighbors. This strategy ensures that the shortest path (in number of steps) to the goal is found.
The algorithm works by starting at the initial state and adding all its neighbors to a queue. It then dequeues the first node in the queue, adds its neighbors to the end of the queue, and repeats the process until the goal state is found or the queue is empty.
BFS explores all the nodes at a given distance (or level) from the starting node before moving on to explore
the nodes at the next distance (or level) from the starting node. This means that BFS visits all the nodes that
are closest to the starting node before moving on to nodes that are farther away.
Advantages
Breadth-first search (BFS) is an algorithm used in artificial intelligence to explore a search space
systematically. Some advantages of BFS include the following:
Completeness:
BFS is guaranteed to find the goal state if it exists in the search space, provided the branching factor
is finite.
Optimal solution:
BFS is guaranteed to find the shortest path to the goal state when all step costs are equal, as it explores all nodes at the same depth before moving on to nodes at a deeper level.
Simplicity:
BFS is easy to understand and implement, making it a good baseline algorithm for more complex
search algorithms.
No redundant paths:
BFS does not explore redundant paths because it explores all nodes at the same depth before
moving on to deeper levels.
Disadvantages
Memory-intensive:
BFS can be memory-intensive for large search spaces because it stores all the nodes at each level in
the queue.
Time-intensive:
BFS can be time-intensive for search spaces with a high branching factor because it needs to
explore many nodes before finding the goal state.
Inefficient for deep search spaces:
BFS can be inefficient for search spaces with a deep depth because it needs to explore all nodes at
each depth before moving on to the next level.
The time and space complexity of breadth-first search (BFS) in artificial intelligence can vary depending
on the size and structure of the search space.
Time complexity:
The time complexity of BFS is proportional to the number of nodes in the search space, as BFS explores all nodes at each level before moving on to deeper levels. For example, if the goal state is at the deepest level, BFS must explore all nodes in the search space, resulting in a time complexity of O(b^d), where b is the branching factor and d is the depth of the search space.
Space complexity:
The space complexity of BFS is proportional to the maximum number of nodes stored in the queue during the search. For example, if the search space is a tree, the maximum number of nodes stored in the queue at any given time is the number of nodes at the deepest level, which is proportional to b^d. Therefore, the space complexity of BFS is O(b^d).
Example
Suppose we have a search space with an initial state "A" and a goal state "E" connected by nodes as
follows:
    A
   / \
  B   C
  |   |
  D   E
To perform BFS on this search space, we start by adding the initial state "A" to a queue:
Queue: A
Explored: {}
We dequeue the first node in the queue, which is "A", and add its children "B" and "C" to the end of the
queue:
Queue: B, C
Explored: {A}
We then dequeue "B" and add its child "D" to the end of the queue:
Queue: C, D
Explored: {A, B}
We dequeue "C" and add its child "E" to the end of the queue:
Queue: D, E
Explored: {A, B, C}
Finally, we dequeue "D" and "E" and find that "E" is the goal state, so we have successfully found a path
from "A" to "E" using BFS.
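The trace above can be reproduced with a short hedged Python sketch (storing a whole path in each queue entry is an illustrative choice):

from collections import deque

def bfs(graph, start, goal):
    """Breadth-first search: explore the space level by level with a FIFO queue."""
    queue = deque([[start]])            # a queue of paths, shallowest first
    explored = set()
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if node in explored:
            continue
        explored.add(node)
        for child in graph.get(node, []):
            queue.append(path + [child])
    return None

# The example search space from the trace above:
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
print(bfs(graph, "A", "E"))             # ['A', 'C', 'E']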
Depth-first Search
Depth-first search (DFS) is popular among the uninformed search strategies in artificial intelligence to
explore and traverse a graph or tree data structure. The algorithm starts at a given node in the graph and
explores as far as possible along each branch before backtracking.
DFS has several applications in AI, including pathfinding, searching for solutions to a problem, and
exploring the state space of a problem. It is particularly useful when the solution is far from the starting
node because it can explore the graph deeply before exploring widely.
Advantages
Memory efficiency:
DFS uses less memory than breadth-first search because it only needs to keep track of a single path
at a time.
Finds a solution quickly:
If the solution to a problem is located deep in a tree, DFS can quickly reach it by exploring one path
until it reaches the solution.
Easy to implement:
DFS is a simple algorithm to understand and implement, especially when using recursion.
Can be used for certain types of problems:
DFS is particularly useful for problems that involve searching for a path, such as maze-solving or checking whether a path exists between two nodes in a graph.
Disadvantages
Not complete:
DFS can keep descending along an infinite (or very deep) path and never return, so it may fail to find a solution even when one exists.
Not optimal:
The first solution DFS finds may not be the shortest one, since deep paths are committed to before shallow ones.
Example
Traversing a binary tree
To traverse this tree using DFS, we start at the root node (1) and explore as far as possible along each
branch before backtracking. The order in which the nodes are visited is described below.
We first visit the root node (1), then the left child (2), and so on. Once we reach a leaf node (4), we
backtrack to the last node with an unexplored child (2) and continue exploring its other child (5). We
continue this process until all nodes have been visited.
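A hedged recursive Python sketch of this traversal, assuming the tree described above is node 1 with children 2 and 3, where node 2 has children 4 and 5 (an assumption about the missing figure):

# The binary tree described above (structure assumed from the text).
tree = {1: [2, 3], 2: [4, 5], 3: [], 4: [], 5: []}

def dfs(node, visited=None):
    """Visit a node, then fully explore each child before backtracking."""
    if visited is None:
        visited = []
    visited.append(node)
    for child in tree[node]:
        dfs(child, visited)
    return visited

print(dfs(1))                           # [1, 2, 4, 5, 3]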
Uniform-cost search is a searching algorithm used for traversing a weighted tree or graph. This
algorithm comes into play when a different cost is available for each edge. The primary goal of the
uniform-cost search is to find a path to the goal node which has the lowest cumulative cost.
Uniform-cost search expands nodes according to their path costs from the root node. It can be used to solve any graph/tree where the optimal cost is in demand. A uniform-cost search algorithm is implemented with a priority queue. It gives maximum priority to the lowest cumulative cost. Uniform-cost search is equivalent to the BFS algorithm if the path cost of all edges is the same.
(Worked example: a weighted graph, not reproduced here, with source node A and destination E; uniform-cost search outputs the minimum-cost path from A to E.)
Advantages:
o Uniform cost search is optimal because at every state the path with the least cost is chosen.
Disadvantages:
o It does not care about the number of steps involved in the search and is only concerned with path cost, due to which this algorithm may get stuck in an infinite loop.
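A hedged Python sketch of uniform-cost search, with the priority queue keyed on cumulative path cost; the weighted graph below stands in for the example figure and is an assumption.

import heapq

def uniform_cost_search(graph, start, goal):
    """Always expand the frontier node with the lowest cumulative path cost."""
    frontier = [(0, start, [start])]        # entries are (cumulative cost, node, path)
    best_cost = {start: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        for nbr, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best_cost.get(nbr, float("inf")):
                best_cost[nbr] = new_cost
                heapq.heappush(frontier, (new_cost, nbr, path + [nbr]))
    return None, float("inf")

# Illustrative weighted graph with source A and destination E:
graph = {"A": [("B", 2), ("C", 5)], "B": [("C", 1), ("D", 4)],
         "C": [("E", 3)], "D": [("E", 1)]}
print(uniform_cost_search(graph, "A", "E"))   # (['A', 'B', 'C', 'E'], 6)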
Depth-limited search is another uninformed search algorithm. The unbounded-tree problem that appears in the depth-first search algorithm can be fixed by imposing a boundary or limit on the depth of the search domain. We call this limit the depth limit; it makes the DFS search strategy more refined and organized into a finite loop. We denote this limit by l, and it provides the solution to the infinite-path problem that arose in the DFS algorithm. Thus, depth-limited search can be called an extended and refined version of the DFS algorithm. In a nutshell, to avoid infinite loops while executing, the depth-limited search algorithm explores only a finite set of depths bounded by the depth limit.
Algorithm
This algorithm essentially follows a similar set of steps as the DFS algorithm (a code sketch follows the steps below).
1. The start node (node 1) is pushed onto the stack.
2. It is then marked as visited, and if node 1 is not the goal node in the search, we push its child node 2 on top of the stack.
3. Next, we mark node 2 as visited and check whether it is the goal node.
4. If node 2 is not found to be the goal node, then we push node 4 on top of the stack.
5. Now we search within the same depth limit and move along depth-wise to check for the goal node.
6. If node 4 is also not found to be the goal node and the depth limit has not been reached, we look at its unexplored child nodes.
7. Then we push them onto the stack and mark them as visited.
8. We continue to perform these steps iteratively until the goal node is reached or until all nodes within the depth limit have been explored.
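A hedged recursive sketch of depth-limited search in Python; the tree is an illustrative assumption, and the limit l is passed as an argument:

def depth_limited_search(graph, node, goal, limit):
    """Depth-first search that refuses to descend below the depth limit."""
    if node == goal:
        return [node]
    if limit == 0:
        return None                         # depth limit reached: cut off here
    for child in graph.get(node, []):
        result = depth_limited_search(graph, child, goal, limit - 1)
        if result is not None:
            return [node] + result
    return None

# Illustrative tree; the goal "E" sits at depth 2.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
print(depth_limited_search(graph, "A", "E", 1))   # None: the goal lies beyond the limit
print(depth_limited_search(graph, "A", "E", 2))   # ['A', 'C', 'E']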
Machine learning models are increasingly used in various applications to classify data into different
categories. However, evaluating the performance of these models is crucial to ensure their accuracy and
reliability. One essential tool in this evaluation process is the confusion matrix.
The confusion matrix is a matrix used to determine the performance of classification models for a given set of test data. It can only be determined if the true values for the test data are known. Since it shows the errors in the model's performance in the form of a matrix, it is also known as an error matrix. Some features of the confusion matrix are given below:
o For 2 prediction classes of a classifier, the matrix is a 2x2 table; for 3 classes, it is a 3x3 table, and so on.
o The matrix is divided into two dimensions, that are predicted values and actual values along with
the total number of predictions.
o Predicted values are those values, which are predicted by the model, and actual values are the true
values for the given observations.
o True Negative: The model has predicted No, and the real or actual value was also No.
o True Positive: The model has predicted Yes, and the actual value was also Yes.
o False Negative: The model has predicted No, but the actual value was Yes; it is also called a Type-II error.
o False Positive: The model has predicted Yes, but the actual value was No; it is also called a Type-I error.
o With the help of the confusion matrix, we can calculate the different parameters for the model, such
as accuracy, precision, etc.
Suppose we are trying to create a model that can predict whether or not a person has a particular disease. The counts below determine the confusion matrix for this example:

                 Actual: Yes   Actual: No
Predicted: Yes   TP = 24       FP = 8
Predicted: No    FN = 3        TN = 65

o The table is given for a two-class classifier, which has two predictions, "Yes" and "No." Here, Yes means that the patient has the disease, and No means that the patient does not have the disease.
o The classifier has made a total of 100 predictions. Out of 100 predictions, 89 are correct predictions, and 11 are incorrect predictions.
o The model has given the prediction "Yes" 32 times and "No" 68 times, whereas the actual "Yes" occurred 27 times and the actual "No" 73 times. (The four cell values above follow from these totals: TP + FP = 32, TP + FN = 27, and TP + TN = 89.)
We can perform various calculations for the model, such as the model's accuracy, using this matrix.
These calculations are given below:
o Classification Accuracy: It is one of the important parameters for determining the accuracy of classification problems. It defines how often the model predicts the correct output. It can be calculated as the ratio of the number of correct predictions made by the classifier to the total number of predictions:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
o Misclassification rate: It is also termed the error rate, and it defines how often the model gives wrong predictions. The error rate is the number of incorrect predictions divided by the total number of predictions made by the classifier:
Error rate = (FP + FN) / (TP + TN + FP + FN)
o Precision: Precision is a metric used to measure how well a machine learning model predicts positive classes. It is the fraction of the model's positive predictions that were actually correct:
Precision = TP / (TP + FP)
o Recall: Recall measures the effectiveness of a classification model in identifying all relevant instances in a dataset; that is, how often the model correctly identifies positive instances (true positives) out of all the actual positive samples. It is the ratio of the number of true positive (TP) instances to the sum of the true positive and false negative (FN) instances:
Recall = TP / (TP + FN)
o F-measure: If one model has low precision and high recall, or vice versa, it is difficult to compare the models. For this purpose we can use the F-score, which evaluates recall and precision at the same time. The F-score is maximal when the recall equals the precision (a worked numeric check follows below):
F-measure = (2 x Precision x Recall) / (Precision + Recall)
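Plugging the counts deduced from the disease example above (TP = 24, FP = 8, FN = 3, TN = 65) into these formulas gives a quick numeric check; a minimal Python sketch:

# Counts deduced from the disease example above.
TP, FP, FN, TN = 24, 8, 3, 65
total = TP + FP + FN + TN                       # 100 predictions in all

accuracy  = (TP + TN) / total                   # 0.89
error     = (FP + FN) / total                   # 0.11
precision = TP / (TP + FP)                      # 0.75
recall    = TP / (TP + FN)                      # about 0.89
f_measure = 2 * precision * recall / (precision + recall)

print(accuracy, error, precision, round(recall, 2), round(f_measure, 2))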
Machine Learning
Machine learning, a rapidly developing field of technology, allows computers to learn automatically from previous data. To build mathematical models and make predictions based on historical data or information, machine learning employs a variety of algorithms. It is currently being used for a variety of tasks, including speech recognition, email filtering, auto-tagging on Facebook, recommender systems, and image recognition.
How machine learning compares with AI and traditional programming:
o Machine Learning: A model is learned from data and then used to make predictions on new data, rather than following fixed rules as in traditional rule-based programming. ML can find patterns and insights in large datasets that might be difficult for humans to discover.
o Artificial Intelligence: Sometimes AI uses a combination of both data and pre-defined rules, which gives it a great edge in solving complex tasks with good accuracy that seem impossible to humans.
o Traditional programming: It works from pre-defined rules and is totally dependent on the intelligence of developers, so it has very limited capability.
Machine Learning lifecycle:
The lifecycle of a machine learning project involves a series of steps that include:
1. Study the Problems: The first step is to study the problem. This step involves
understanding the business problem and defining the objectives of the model.
2. Data Collection: When the problem is well-defined, we can collect the relevant data
required for the model. The data could come from various sources such as databases,
APIs, or web scraping.
3. Data Preparation: When our problem-related data is collected, it is a good idea to check the data properly and put it in the desired format so that it can be used by the model to find the hidden patterns. This can be done in the following steps:
Data cleaning
Data transformation
Exploratory data analysis and feature engineering
Splitting the dataset for training and testing
4. Model Selection: The next step is to select the appropriate machine learning
algorithm that is suitable for our problem. This step requires knowledge of the strengths
and weaknesses of different algorithms. Sometimes we use multiple models and
compare their results and select the best model as per our requirements.
5. Model building and Training: After selecting the algorithm, we have to build the
model.
6. Model Evaluation: Once the model is trained, it can be evaluated on the test dataset
to determine its accuracy and performance using different techniques like classification
report, F1 score, precision, recall, ROC Curve, Mean Square error, absolute error, etc.
7. Model Tuning: Based on the evaluation results, the model may need to be tuned or optimized to improve its performance. This involves tweaking the hyperparameters of the model.
8. Deployment: Once the model is trained and tuned, it can be deployed in a production
environment to make predictions on new data. This step requires integrating the model
into an existing software system or creating a new system for the model.
9. Monitoring and Maintenance: Finally, it is essential to monitor the model’s
performance in the production environment and perform maintenance tasks as required.
This involves monitoring for data drift, retraining the model as needed, and updating
the model as new data becomes available.
Now, in this machine learning tutorial, let's learn the applications of machine learning:
• Automation: Machine learning can work entirely autonomously in any field without the need for any human intervention. For example, robots perform the essential process steps in manufacturing plants.
• Healthcare industry: Healthcare was one of the first industries to use machine learning, with image detection.
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
In supervised learning, the training data provided to the machines works as a supervisor that teaches the machines to predict the output correctly. It applies the same concept as a student learning under the supervision of a teacher.
Supervised learning is a process of providing input data as well as correct output data to the machine
learning model. The aim of a supervised learning algorithm is to find a mapping function to map the
input variable(x) with the output variable(y).
In the real-world, supervised learning can be used for Risk Assessment, Image classification, Fraud
Detection, spam filtering, etc.
The working of Supervised learning can be easily understood by the below example and diagram:
Suppose we have a dataset of different types of shapes which includes square, rectangle, triangle, and
Polygon. Now the first step is that we need to train the model for each shape.
o If the given shape has four sides, and all the sides are equal, then it will be labelled as a Square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides then it will be labelled as hexagon.
Now, after training, we test our model using the test set, and the task of the model is to identify the shape. The machine is already trained on all types of shapes, and when it finds a new shape, it classifies the shape on the basis of its number of sides and predicts the output.
o Split the dataset into a training dataset, a test dataset, and a validation dataset.
o Determine the input features of the training dataset, which should have enough knowledge so that the model can accurately predict the output.
o Determine the suitable algorithm for the model, such as a support vector machine, decision tree, etc.
o Execute the algorithm on the training dataset. Sometimes we need validation sets as control parameters, which are subsets of the training dataset.
o Evaluate the accuracy of the model by providing the test set. If the model predicts the correct output, our model is accurate. (These steps are sketched in code below.)
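These steps can be seen end to end in a small hedged scikit-learn sketch; the dataset and the choice of a decision tree are illustrative assumptions, not the only valid options.

# A minimal end-to-end supervised learning example (illustrative).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                # labeled input and output data

# Split the dataset into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = DecisionTreeClassifier()                 # choose a suitable algorithm
model.fit(X_train, y_train)                      # execute it on the training set

# Evaluate accuracy by providing the test set.
print(accuracy_score(y_test, model.predict(X_test)))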
1. Regression
Regression algorithms are used if there is a relationship between the input variable and the output variable.
It is used for the prediction of continuous variables, such as Weather forecasting, Market Trends, etc.
Below are some popular Regression algorithms which come under supervised learning:
o Linear Regression
o Regression Trees
o Non-Linear Regression
o Polynomial Regression
2. Classification
Classification algorithms are used when the output variable is categorical, which means there are two or more classes, such as Yes-No, Male-Female, or True-False. Spam filtering is an example of a classification problem.
Below are some popular classification algorithms which come under supervised learning:
o Random Forest
o Decision Trees
o Logistic Regression
o With the help of supervised learning, the model can predict the output on the basis of prior
experiences.
o In supervised learning, we can have an exact idea about the classes of objects.
o Supervised learning model helps us to solve various real-world problems such as fraud detection,
spam filtering, etc.
o Supervised learning models are not suitable for handling complex tasks.
o Supervised learning cannot predict the correct output if the test data is different from the training dataset.
o Training requires a lot of computation time.
In the previous topic, we learned supervised machine learning in which models are trained using labeled
data under the supervision of training data. But there may be many cases in which we do not have labeled
data and need to find the hidden patterns from the given dataset. So, to solve such types of cases in
machine learning, we need unsupervised learning techniques.
As the name suggests, unsupervised learning is a machine learning technique in which models are not supervised using a training dataset. Instead, the models themselves find the hidden patterns and insights from the given data. It can be compared to the learning which takes place in the human brain while learning new things. It can be defined as:
Unsupervised learning is a type of machine learning in which models are trained using unlabeled dataset
and are allowed to act on that data without any supervision.
Unsupervised learning cannot be directly applied to a regression or classification problem because unlike
supervised learning, we have the input data but no corresponding output data. The goal of unsupervised
learning is to find the underlying structure of dataset, group that data according to similarities, and
represent that dataset in a compressed format.
Example: Suppose the unsupervised learning algorithm is given an input dataset containing images of
different types of cats and dogs. The algorithm is never trained upon the given dataset, which means it does
not have any idea about the features of the dataset. The task of the unsupervised learning algorithm is to
identify the image features on their own. Unsupervised learning algorithm will perform this task by
clustering the image dataset into the groups according to similarities between images.
o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is quite similar to how a human learns to think by their own experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes unsupervised learning all the more important.
o In real-world, we do not always have input data with the corresponding output so to solve such
cases, we need unsupervised learning.
Here, we have taken an unlabeled input data, which means it is not categorized and corresponding outputs
are also not given. Now, this unlabeled input data is fed to the machine learning model in order to train it.
Firstly, it will interpret the raw data to find the hidden patterns from the data and then will apply suitable
algorithms such as k-means clustering, Decision tree, etc.
Once it applies the suitable algorithm, the algorithm divides the data objects into groups according to the
similarities and difference between the objects.
The unsupervised learning algorithm can be further categorized into two types of problems:
o Clustering: Clustering is a method of grouping the objects into clusters such that objects with most
similarities remains into a group and has less or no similarities with the objects of another group.
Cluster analysis finds the commonalities between the data objects and categorizes them as per the
presence and absence of those commonalities.
o Association: An association rule is an unsupervised learning method which is used for finding relationships between variables in a large database. It determines the set of items that occur together in the dataset. Association rules make marketing strategy more effective; for example, people who buy item X (say, bread) also tend to purchase item Y (butter/jam). A typical example of an association rule is Market Basket Analysis.
o K-means clustering
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Apriori algorithm
Clustering or cluster analysis is a machine learning technique, which groups the unlabelled dataset. It can
be defined as "A way of grouping the data points into different clusters, consisting of similar data
points. The objects with the possible similarities remain in a group that has less or no similarities with
another group."
It does it by finding some similar patterns in the unlabelled dataset such as shape, size, color, behavior, etc.,
and divides them as per the presence and absence of those similar patterns.
It is an unsupervised learning method, hence no supervision is provided to the algorithm, and it deals with
the unlabeled dataset.
After applying this clustering technique, each cluster or group is given a cluster ID. The ML system can use this ID to simplify the processing of large and complex datasets.
Note: Clustering is somewhere similar to the classification algorithm, but the difference is the type of
dataset that we are using. In classification, we work with the labeled data set, whereas in clustering, we
work with the unlabelled dataset.
Example: Let's understand the clustering technique with a real-world example of a mall. When we visit any shopping mall, we can observe that things with similar usage are grouped together: t-shirts are grouped in one section and trousers in another; similarly, in the vegetable section, apples, bananas, mangoes, etc., are grouped in separate sections so that we can easily find things. The clustering technique works in the same way. Other examples of clustering include grouping documents according to topic.
The clustering technique can be widely used in various tasks. Some most common uses of this technique
are:
o Market Segmentation
o Image segmentation
Apart from these general usages, it is used by Amazon in its recommendation system to provide recommendations based on past product searches. Netflix also uses this technique to recommend movies and web series to its users based on their watch history.
The below diagram explains the working of the clustering algorithm. We can see the different fruits are
divided into several groups with similar properties.
Applications of Clustering
Below are some commonly known applications of clustering technique in Machine Learning:
o In Identification of Cancer Cells: The clustering algorithms are widely used for the identification
of cancerous cells. It divides the cancerous and non-cancerous data sets into different groups.
o In Search Engines: Search engines also work on the clustering technique. The search result
appears based on the closest object to the search query. It does it by grouping similar data objects in
one group that is far from the other dissimilar objects. The accurate result of a query depends on the
quality of the clustering algorithm used.
o Customer Segmentation: It is used in market research to segment the customers based on their
choice and preferences.
o In Biology: It is used in the biology stream to classify different species of plants and animals using
the image recognition technique.
Clustering methods are broadly divided into hard clustering (each data point belongs to only one group) and soft clustering (data points can belong to more than one group). There are also various other approaches to clustering. Below are the main clustering methods used in machine learning:
1. Partitioning Clustering
2. Density-Based Clustering
3. Distribution Model-Based Clustering
4. Hierarchical Clustering
5. Fuzzy Clustering
Partitioning Clustering
It is a type of clustering that divides the data into non-hierarchical groups. It is also known as the centroid-based method. The most common example of partitioning clustering is the K-Means Clustering algorithm.
In this type, the dataset is divided into a set of k groups, where k defines the number of pre-defined groups. The cluster centers are created in such a way that the distance between the data points within one cluster is minimal compared with the distance to another cluster's centroid.
K-Means Clustering is an unsupervised learning algorithm that is used to solve clustering problems in
machine learning or data science. In this topic, we will learn what is K-means clustering algorithm, how the
algorithm works, along with the Python implementation of k-means clustering.
K-Means Clustering is an Unsupervised Learning algorithm, which groups the unlabeled dataset into
different clusters. Here K defines the number of pre-defined clusters that need to be created in the process,
as if K=2, there will be two clusters, and for K=3, there will be three clusters, and so on.
It is an iterative algorithm that divides the unlabeled dataset into k different clusters in such a way that each data point belongs to only one group, a group of points with similar properties.
It allows us to cluster the data into different groups and is a convenient way to discover the categories of
groups in the unlabeled dataset on its own without the need for any training.
It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this
algorithm is to minimize the sum of distances between the data point and their corresponding clusters.
Page
55
lOMoAR cPSD| 31245499
The algorithm takes the unlabeled dataset as input, divides the dataset into k-number of clusters, and
repeats the process until it does not find the best clusters. The value of k should be predetermined in this
algorithm.
o Determines the best value for K center points or centroids by an iterative process.
o Assigns each data point to its closest k-center. Those data points which are near to the particular k-
center, create a cluster.
Hence each cluster has datapoints with some commonalities, and it is away from other clusters.
The below diagram explains the working of the K-means Clustering Algorithm:
Step-2: Select random K points or centroids. (It can be other from the input dataset).
Step-3: Assign each data point to their closest centroid, which will form the predefined K clusters.
Step-4: Calculate the variance and place a new centroid of each cluster.
Page
56
lOMoAR cPSD| 31245499
Step-5: Repeat the third steps, which means reassign each datapoint to the new closest centroid of each
cluster.
Before implementation, let's understand what type of problem we will solve here. So, we have a dataset
of Mall_Customers, which is the data of customers who visit the mall and spend there.
In the given dataset, we have Customer_Id, Gender, Age, Annual Income ($), and Spending
Score (which is the calculated value of how much a customer has spent in the mall, the more the value, the
more he has spent). From this dataset, we need to calculate some patterns, as it is an unsupervised method,
so we don't know what to calculate exactly.
Data pre-processing
# importing libraries
import numpy as nm
import pandas as pd
Page
57
lOMoAR cPSD| 31245499
dataset = pd.read_csv('Mall_Customers_data.csv')
x = dataset.iloc[:, [3, 4]].values // As we can see, we are extracting only 3rd and 4th feature. It is
because we need a 2d plot to visualize the model, and some features are not required, such as customer_id.
2. Find the optimal number of clusters using the elbow method. Here’s the
code you use:
Page
58
lOMoAR cPSD| 31245499
kmeans.fit(x)
wcss_list.append(kmeans.inertia_)
mtp.xlabel('Number of clusters(k)')
mtp.ylabel('wcss_list')
mtp.show()
As we can see in the above code, we have used the KMeans class of sklearn. cluster library to form the
clusters.
Next, we have created the wcss_list variable to initialize an empty list, which is used to contain the value
of wcss computed for different values of k ranging from 1 to 10.
After that, we have initialized the for loop for the iteration on a different value of k ranging from 1 to 10;
since for loop in Python, exclude the outbound limit, so it is taken as 11 to include 10th value.
The rest part of the code is similar as we did in earlier topics, as we have fitted the model on a matrix of
features and then plotted the graph between the number of clusters and WCSS.
Page
59
lOMoAR cPSD| 31245499
Output: After executing the above code, we will get the below output:
From the above plot, we can see the elbow point is at 5. So the number of clusters here will be 5.
3. Train the K-means algorithm on the training dataset. However, instead of using i, use 5, because
there are 5 clusters that need to be formed. Here’s the code:
y_predict= kmeans.fit_predict(x)
The first line is the same as above for creating the object of KMeans class. In the second line of code, we
have created the dependent variable y_predict to train the model. By executing the above lines of code, we
will get the y_predict variable.
4. Visualize the Clusters. Since this model has five clusters, we need to
visualize each one.
Page
60
lOMoAR cPSD| 31245499
mtp.title('Clusters of customers')
mtp.legend()
mtp.show()
Output:
Page
61
lOMoAR cPSD| 31245499
The output image clearly shows the five different clusters with different colors. The clusters are formed
between two parameters of the dataset: Annual income of customer and Spending. We can change the
colors and labels as per the requirement or choice. We can also observe some points from the above
patterns, which are given below:
o Cluster1 shows the customers with average salary and average spending so we can categorize these
customers as
o Cluster2 shows the customer has a high income but low spending, so we can categorize them
as careful.
o Cluster3 shows the low income and also low spending so they can be categorized as sensible.
o Cluster4 shows the customers with low income with very high spending so they can be categorized
as careless.
o Cluster5 shows the customers with high income and high spending so they can be categorized as
target, and these customers can be the most profitable customers for the mall owner.
o Unsupervised learning is used for more complex tasks as compared to supervised learning because,
in unsupervised learning, we don't have labeled input data.
o Unsupervised learning is preferable as it is easy to get unlabeled data in comparison to labeled data.
Page
62
lOMoAR cPSD| 31245499
o Unsupervised learning is intrinsically more difficult than supervised learning as it does not have
corresponding output.
o The result of the unsupervised learning algorithm might be less accurate as input data is not labeled,
and algorithms do not know the exact output in advance.
3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method, in which a learning agent gets a reward for
each right action and gets a penalty for each wrong action. The agent learns automatically with these
feedbacks and improves its performance. In reinforcement learning, the agent interacts with the
environment and explores it. The goal of an agent is to get the most reward points, and hence, it improves
its
performance.
• Model-based reinforcement learning: as it sounds, has an agent trying to understand its environment
and creating a model for it based on its interactions with this environment. In such a system, preferences
take priority over the consequences of the actions i.e. the greedy agent will always try to perform an action
that will get the maximum reward irrespective of what that action may cause.
• Model-free reinforcement learning: On the other hand, model-free algorithms seek to learn the
consequences of their actions through experience via algorithms such as Policy Gradient, Q-Learning, etc.
In other words, such an algorithm will carry out an action multiple times and will adjust the policy (the
strategy behind its actions) for optimal rewards, based on the outcomes.
Some popular model-free reinforcement learning algorithms include Q-Learning, SARSA, and Deep
Reinforcement Learning.
Scikit-learn is an open-source library in Python that helps us implement machine learning models. This
library provides a collection of handy tools like regression and classification to simplify complex machine
learning problems
Page
63
lOMoAR cPSD| 31245499
Code Explanation:
This code imports necessary libraries for building a decision tree classifier model.
pandas is a library used for data manipulation and analysis.
DecisionTreeClassifier is a class from the sklearn.tree module that is used to build a
decision tree classifier model.
train_test_split is a function from the sklearn.model_selection module that is used to
split the dataset into training and testing sets.
metrics is a module from the sklearn library that provides various metrics for evaluating
the performance of a machine learning model.
By importing these libraries, the user can use their functions and classes to build and evaluate a
decision tree classifier model.
Loading Data
Let's first load the required Pima Indian Diabetes dataset using pandas' read CSV function. You can
download the Kaggle data set to follow along.
col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
# load dataset
Code Explanation:
This code creates a list of column names called col_names which will be used to label the columns of a
dataset. Then, it loads a dataset called "diabetes.csv" into a Pandas DataFrame called pima. The
header=None argument specifies that the dataset does not have a header row, and the names=col_names
argument assigns the column names from the col_names list to the DataFrame.
pima.head()
The pima.head() function is used to display the first few rows of the dataset pima. This is useful for getting
a quick overview of the data and checking if it has been loaded correctly. By default, the head() function
displays the first 5 rows of the dataset, but you can specify a different number of rows to display by passing
an integer argument to the function.
Page
64
lOMoAR cPSD| 31245499
1 1 85 66 29 0 26.6 0.351 31 0
3 1 89 66 23 94 28.1 0.167 21 0
Feature Selection
Here, you need to divide given columns into two types of variables dependent (or target variable) and
independent variable (or feature variables).
X = pima[feature_cols] # Features
Code Explanation:
This code is written in Python, and it is used to split a dataset into features and target variable.
The first line defines a list of feature columns that will be used to create the feature matrix. The list contains the names of
the columns that will be used as features, which are 'pregnant', 'insulin', 'bmi', 'age', 'glucose', 'bp', and 'pedigree'.
The second line creates a feature matrix X by selecting the columns specified in the feature_cols list from the pima dataset.
The feature matrix X will contain the values of the selected columns for each observation in the dataset.
Page
65
lOMoAR cPSD| 31245499
Splitting Data
To understand model performance, dividing the dataset into a training set and a test
set is a good strategy.
Let's split the dataset by using the function train_test_split(). You need to pass three
parameters features; target, and test_set size.
Code Explanation:
This code uses the train_test_split function from the sklearn.model_selection module to split a dataset into
a training set and a test set.
The X and y variables represent the features and target variable of the dataset, respectively. The test_size
parameter is set to 0.3, which means that 30% of the data will be used for testing and 70% will be used for
training. The random_state parameter is set to 1, which ensures that the same random split is generated
each time the code is run.
The function returns four arrays: X_train, X_test, y_train, and y_test. X_train and y_train represent the
training set, while X_test and y_test represent the test set. These arrays can be used to train and evaluate a
machine learning model.
clf = DecisionTreeClassifier()
clf = clf.fit(X_train,y_train)
Page
66
lOMoAR cPSD| 31245499
y_pred = clf.predict(X_test)
Code Explanation:
This code is written in Python and it creates a decision tree classifier object using the
DecisionTreeClassifier() function. Then, it trains the classifier using the fit() method with the training data
X_train and y_train. Finally, it uses the trained classifier to predict the response for the test dataset X_test
and stores the predictions in y_pred.
Accuracy can be computed by comparing actual test set values and predicted values.
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))
Code Explanation:
The metrics.accuracy_score() function is used to calculate the accuracy of a classification model. It takes
two arguments: y_test and y_pred. y_test is the true labels of the test set, and y_pred is the predicted labels
of the test set.
The print() function is used to display the accuracy score on the console. The output will be a string that
says "Accuracy:" followed by the actual accuracy score.
Accuracy: 0.6753246753246753
This code snippet is simply displaying the accuracy score of a model. The value 0.6753246753246753
represents the accuracy score, which is a metric used to evaluate the performance of a machine learning
model. The higher the accuracy score, the better the model is at making correct predictions. There is no
specific programming language mentioned in this code snippet, as it is simply displaying a numerical
value.
We got a classification rate of 67.53%, which is considered as good accuracy. You can improve this
accuracy by tuning the parameters in the decision tree algorithm.
Page
67
lOMoAR cPSD| 31245499
You can use Scikit-learn's export_graphviz function for display the tree within a Jupyter notebook. For
plotting the tree, you also need to install graphviz and pydotplus.
The export_graphviz function converts the decision tree classifier into a dot file, and pydotplus converts
this dot file to png or displayable form on Jupyter.
import pydotplus
dot_data = StringIO()
export_graphviz(clf, out_file=dot_data,
filled=True, rounded=True,
special_characters=True,feature_names = feature_cols,class_names=['0','1'])
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
graph.write_png('diabetes.png')
Image(graph.create_png())
Code Explanation:
This code snippet is used to visualize a decision tree model created using scikit-learn's clf object.
First, the necessary libraries are imported: export_graphviz from sklearn.tree, StringIO from
sklearn.externals.six, Image from IPython.display, and pydotplus.
Page
68
lOMoAR cPSD| 31245499
Then, a StringIO object is created to store the dot data generated by export_graphviz. The export_graphviz
function is called with the clf object as the first argument and various parameters to customize the
appearance of the tree. The dot data is then written to the StringIO object.
Next, pydotplus is used to create a graph from the dot data stored in the StringIO object. The resulting
graph is saved as a PNG file named "diabetes.png". Finally, the graph is displayed using Image from
IPython.display.
Overall, this code generates a visual representation of a decision tree model, which can be useful for
understanding how the model makes predictions and identifying areas for improvement.
Now Optimizing Decision Tree Performance….?
An expert system is a computer program that is designed to solve complex problems and to provide
decision-making ability like a human expert. It performs this by extracting knowledge from its knowledge
base using the reasoning and inference rules according to the user queries.
The expert system is a part of AI, and the first ES was developed in the year 1970, which was the first
successful approach of artificial intelligence. It solves the most complex issue as an expert by extracting
the knowledge stored in its knowledge base. The system helps in decision making for complex problems
using both facts and heuristics like a human expert. It is called so because it contains the expert
knowledge of a specific domain and can solve any complex problem of that particular domain. These
systems are designed for a specific domain, such as medicine, science, etc.
The performance of an expert system is based on the expert's knowledge stored in its knowledge base. The
more knowledge stored in the KB, the more that system improves its performance. One of the common
examples of an ES is a suggestion of spelling errors while typing in the Google search box.
Below is the block diagram that represents the working of an expert system:
Page
69
lOMoAR cPSD| 31245499
Note: It is important to remember that an expert system is not used to replace the human experts; instead,
it is used to assist the human in making a complex decision. These systems do not have human capabilities
of thinking and work on the basis of the knowledge base of the particular domain.
o DENDRAL: It was an artificial intelligence project that was made as a chemical analysis expert
system. It was used in organic chemistry to detect unknown organic molecules with the help of their
mass spectra and knowledge base of chemistry.
o MYCIN: It was one of the earliest backward chaining expert systems that was designed to find the
bacteria causing infections like bacteraemia and meningitis. It was also used for the
recommendation of antibiotics and the diagnosis of blood clotting diseases.
o PXDES: It is an expert system that is used to determine the type and level of lung cancer. To
determine the disease, it takes a picture from the upper body, which looks like the shadow. This
shadow identifies the type and degree of harm.
o CaDeT: The CaDet expert system is a diagnostic support system that can detect cancer at early
stages.
o High Performance: The expert system provides high performance for solving any type of complex
problem of a specific domain with high efficiency and accuracy.
o Understandable: It responds in a way that can be easily understandable by the user. It can take
input in human language and provides the output in the same way.
Page
70
lOMoAR cPSD| 31245499
o Highly responsive: ES provides the result for any complex query within a very short period of
time.
o User Interface
o Inference Engine
o Knowledge Base
1. User Interface
With the help of a user interface, the expert system interacts with the user, takes queries as an input in a
readable format, and passes it to the inference engine. After getting the response from the inference engine,
it displays the output to the user. In other words, it is an interface that helps a non-expert user to
communicate with the expert system to find a solution.
Page
71
lOMoAR cPSD| 31245499
o The inference engine is known as the brain of the expert system as it is the main processing unit of
the system. It applies inference rules to the knowledge base to derive a conclusion or deduce new
information. It helps in deriving an error-free solution of queries asked by the user.
o With the help of an inference engine, the system extracts the knowledge from the knowledge base.
o Forward Chaining: Forward chaining is a data-driven strategy where the system starts with the
available data (initial facts) and applies rules to derive new conclusions or facts. It proceeds in a
step-by-step manner, applying rules iteratively until no further conclusions can be drawn.
o Process:
o Begin with the known facts.
o Apply applicable rules to these facts.
o Generate new facts or conclusions.
o Continue this process until no more new facts can be inferred.
o Use Cases: Commonly used in systems where data or facts are readily available and the goal is to
derive a conclusion or reach a goal state.
Page
72
lOMoAR cPSD| 31245499
o Process:
o Start with the goal or conclusion to be proven.
o Identify rules that could lead to this goal.
o Check if the conditions of these rules are satisfied using available facts or by further backward
chaining.
o If necessary, continue this process recursively until either a solution is found or it is determined that
the goal cannot be achieved.
o Use Cases: Often used in diagnostic systems, where the goal is to identify the causes of observed
symptoms or outcomes.
3. Knowledge Base
o The knowledgebase is a type of storage that stores knowledge acquired from the different experts of
the particular domain. It is considered as big storage of knowledge. The more the knowledge base,
the more precise will be the Expert System.
o It is similar to a database that contains information and rules of a particular domain or subject.
o One can also view the knowledge base as collections of objects and their attributes. Such as a Lion
is an object and its attributes are it is a mammal, it is not a domestic animal, etc.
o Factual Knowledge: The knowledge which is based on facts and accepted by knowledge engineers
comes under factual knowledge.
Page
73
lOMoAR cPSD| 31245499
o Heuristic Knowledge: This knowledge is based on practice, the ability to guess, evaluation, and
experiences.
Knowledge Representation: It is used to formalize the knowledge stored in the knowledge base using the
If-else rules.
Knowledge Acquisitions: It is the process of extracting, organizing, and structuring the domain
knowledge, specifying the rules to acquire the knowledge from various experts, and store that knowledge
into the knowledge base.
Here, we will explain the working of an expert system by taking an example of MYCIN ES. Below are
some steps to build an MYCIN:
o Firstly, ES should be fed with expert knowledge. In the case of MYCIN, human experts specialized
in the medical field of bacterial infection, provide information about the causes, symptoms, and
other knowledge in that domain.
o The KB of the MYCIN is updated successfully. In order to test it, the doctor provides a new
problem to it. The problem is to identify the presence of the bacteria by inputting the details of a
patient, including the symptoms, current condition, and medical history.
o The ES will need a questionnaire to be filled by the patient to know the general information about
the patient, such as gender, age, etc.
o Now the system has collected all the information, so it will find the solution for the problem by
applying if-then rules using the inference engine and using the facts stored within the KB.
o In the end, it will provide a response to the patient by using the user interface.
1. Expert: The success of an ES much depends on the knowledge provided by human experts. These
experts are those persons who are specialized in that specific domain.
2. Knowledge Engineer: Knowledge engineer is the person who gathers the knowledge from the
domain experts and then codifies that knowledge to the system according to the formalism.
Page
74
lOMoAR cPSD| 31245499
3. End-User: This is a particular person or a group of people who may not be experts, and working on
the expert system needs the solution or advice for his queries, which are complex.
Page
75