Introduction to Artificial Intelligence Concepts

ARTIFICIAL INTELLIGENCE UNIT I

UNIT I

INTRODUCTION

Introduction – Definition – Future of Artificial Intelligence – Characteristics of Intelligent Agents – Typical Intelligent Agents – Problem Solving Approach to Typical AI Problems.

1.1 ARTIFICIAL INTELLIGENCE – AN INTRODUCTION

AI definitions can be grouped into four categories:


 Systems that think like humans
 Systems that think rationally
 Systems that act like humans
 Systems that act rationally


1.2.1 Acting Humanly: The Turing Test Approach

This approach focuses on building artificial intelligence systems that can act like humans. The goal is to create systems that can perform tasks such as recognizing speech, recognizing images, and controlling robots in a human-like manner.
This approach is mainly used in computer vision and robotics, where the goal is to create systems that can perceive and interact with the physical world as humans do.
Examples:

self-driving cars, facial recognition systems, and gesture-based human-computer interfaces.


a. Self-driving cars: AI systems that can control a vehicle and navigate roads, traffic, and
obstacles in a human-like manner.
b. Facial recognition systems: AI systems that can identify individuals based on their facial
features.
c. Gesture-based human-computer interfaces: AI systems that can interpret and respond to
gestures made by a user.
TURING TEST IN AI
In 1950, Alan Turing introduced a test to check whether a machine can think like a human; this test is known as the Turing Test. In it, Turing proposed that a computer can be said to be intelligent if it can mimic human responses under specific conditions. The Turing Test was introduced in Turing's 1950 paper, "Computing Machinery and Intelligence," which considered the question, "Can machines think?"

The Turing Test is based on a party game, the "Imitation Game," with some modifications. The game involves three players: one player is a computer, another is a human responder, and the third is a human interrogator, who is isolated from the other two players and whose job is to determine which of the two is the machine.

Consider Player A a computer, Player B a human, and Player C the interrogator. The interrogator is aware that one of them is a machine but must identify which, on the basis of questions and their responses. The conversation between the players takes place via keyboard and screen, so the result does not depend on the machine's ability to render words as speech.

The test result depends not on each answer being correct, but only on how closely the responses resemble a human's answers. The computer is permitted to do everything possible to force a wrong identification by the interrogator.

The questions and answers can be like:


Interrogator: Are you a computer?

Player A (Computer): No

Player B (Human): No

Interrogator: Convert the decimal 45952 into binary.

Player A (Computer): (long pause) gives a wrong answer.

Player B (Human): (long pause) gives a wrong answer.

In this game, if the interrogator cannot identify which player is a machine and which is human, the computer passes the test successfully, and the machine is said to be intelligent and able to think like a human.
Features required for a machine to pass the Turing test:
 Natural language processing: NLP is required to communicate with the interrogator in an ordinary human language such as English.
 Knowledge representation: To store and retrieve information during the test.
 Automated reasoning: To use previously stored information to answer the questions.
 Machine learning: To adapt to new changes and detect generalized patterns.
 Vision: To recognize the interrogator's actions and other objects during the test.
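The setup of the Imitation Game can be sketched in a few lines of code. Everything here is invented for illustration: the two "players" are stand-in functions, not real AI systems, and a real machine player would need the NLP, reasoning, and learning capabilities listed above.

```python
import random

def human_player(question):
    # The human answers truthfully in their own words.
    return "No" if "computer" in question.lower() else "Let me think..."

def machine_player(question):
    # The machine tries to imitate the human, including human
    # weaknesses such as pausing or denying that it is a computer.
    return "No" if "computer" in question.lower() else "Let me think..."

def run_round(question):
    # The interrogator receives the two answers without knowing
    # which player produced which one.
    players = [machine_player, human_player]
    random.shuffle(players)
    return [p(question) for p in players]

answers = run_round("Are you a computer?")
print(answers)   # both answer "No", so this question alone reveals nothing
```

If the machine's answers are indistinguishable from the human's over many such rounds, the interrogator is reduced to guessing, which is exactly the condition for passing the test.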

1.2.2 Thinking Humanly: The Cognitive Modeling Approach


This approach focuses on building artificial intelligence systems that can think like a human.
The goal is to create systems that can understand human language, emotions, and culture and
can interact with humans in a natural way.
This approach is mainly used in the development of conversational AI systems, such as
chatbots and virtual assistants, that need to understand and respond to natural language input
from humans.

Examples:
a. Siri, Alexa, and Google Assistant: Virtual assistants that can understand and respond to
natural language input from users.
b. Chatbots: AI systems that can have conversations with humans using natural language
processing techniques.
c. Emotion recognition systems: AI systems that can detect emotions in human speech and

facial expressions.

1.2.3 Thinking Rationally


This approach focuses on building artificial intelligence systems that can reason logically
and make decisions based on information and rules. The goal is to create systems that can
solve problems and make decisions in a way that is consistent with the principles of rational
thinking.
This approach is used in a wide range of applications, including decision-making, planning,
and problem-solving. For example, expert systems, recommendation systems, and optimization algorithms.
Examples:
a. Expert systems: AI systems that can make decisions and provide advice based on a set of
rules and knowledge.
b. Recommendation systems: AI systems that can provide personalized recommendations to
users based on their preferences and behavior.
c. Optimization algorithms: Optimization algorithms are computational methods that aim to
find the best solution among a set of possible solutions to a specific problem.

1.2.4 Acting Rationally


This approach is mainly used in artificial intelligence systems that need to make decisions and
take actions to achieve their goals in a rational and efficient manner. For example, autonomous
agents, recommendation systems, and reinforcement learning algorithms.

Examples:
a. Autonomous agents: AI systems that can make decisions and take actions to achieve their
goals in an efficient and effective manner. Autonomous intelligence is artificial intelligence
(AI) that can act without human intervention, input, or direct supervision. It's considered the
most advanced type of artificial intelligence. Examples may include smart manufacturing
robots, self-driving cars, or care robots for the elderly.

b. Reinforcement learning algorithms: AI systems that can learn to take actions in an


environment by receiving rewards and punishments based on their decisions. In reinforcement learning, the agent learns automatically from feedback, without any labeled data.

c. Game AI: AI systems that can play games such as chess, Go, or poker and make decisions
based on the rules and objectives of the game.
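The claim that an agent can learn from reward feedback alone, with no labeled data, can be illustrated with tabular Q-learning on a toy problem. The environment (a five-state corridor with a reward only at the rightmost state) and all constants are invented for this sketch; it is not a production RL implementation.

```python
import random

N_STATES, ACTIONS = 5, ("left", "right")
GOAL = N_STATES - 1
alpha, gamma = 0.5, 0.9                       # learning rate, discount factor
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, state - 1) if action == "left" else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0      # feedback: reward only at the goal
    return nxt, reward

random.seed(0)
for _ in range(200):                          # episodes
    s = 0
    while s != GOAL:
        a = random.choice(ACTIONS)            # explore randomly (off-policy)
        nxt, r = step(s, a)
        # Q-learning update: nudge Q toward reward + discounted best future value
        Q[(s, a)] += alpha * (r + gamma * max(Q[(nxt, b)] for b in ACTIONS) - Q[(s, a)])
        s = nxt

# After training, the learned values prefer "right" in every non-goal state.
print(all(Q[(s, "right")] > Q[(s, "left")] for s in range(GOAL)))
```

No one ever tells the agent the correct action for any state; the preference for moving right emerges purely from the reward signal, which is the defining feature of reinforcement learning.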
1.3 IMPORTANCE AND PURPOSE OF ARTIFICIAL INTELLIGENCE
Before learning about Artificial Intelligence, we should know what the importance of AI is and why we should learn it. Following are some main reasons to learn about AI:

 With the help of AI, you can create software or devices that solve real-world problems easily and accurately, in areas such as health, marketing, and traffic.
 With the help of AI, you can create your own personal virtual assistant, such as Cortana, Google Assistant, or Siri.


 With the help of AI, you can build robots that can work in environments where human survival would be at risk.
 AI opens a path to other new technologies, new devices, and new opportunities.

1.3.1 Goals of Artificial Intelligence

Following are the main goals of Artificial Intelligence:

1. Replicate human intelligence


2. Solve knowledge-intensive tasks ("solving knowledge-intensive tasks" means an AI system can handle complex problems that need a lot of expert knowledge and reasoning).
3. An intelligent connection of perception and action (a smart link between sensing and doing).
4. Building a machine that can perform tasks requiring human intelligence, such as:

 Proving a theorem
 Playing chess
 Planning a surgical operation
 Driving a car in traffic

5. Creating systems that can exhibit intelligent behavior, learn new things by themselves, demonstrate, explain, and advise their users.

1.3.2 Discipline of Artificial Intelligence

To create AI, we should first know how intelligence is composed. Intelligence is an intangible faculty of our brain, a combination of reasoning, learning, problem-solving, perception, language understanding, etc. To achieve these capabilities in a machine or software, Artificial Intelligence requires the following disciplines:


1.3.3 Advantages of Artificial Intelligence

Following are some main advantages of Artificial Intelligence:


 High accuracy with fewer errors: AI machines or systems are less prone to errors and more accurate, as they take decisions based on prior experience or information.
 High speed: AI systems can make decisions very quickly, which is why an AI system can beat a chess champion at chess.
 High reliability: AI machines are highly reliable and can perform the same action many times with high accuracy.
 Useful for risky areas: AI machines can be helpful in situations such as defusing a bomb or exploring the ocean floor, where employing a human would be risky.
 Digital assistance: AI can provide digital assistants to users; for example, AI technology is currently used by various e-commerce websites to show products matching customer requirements.
 Useful as a public utility: AI can be very useful for public utilities, such as self-driving cars that make journeys safer and hassle-free, facial recognition for security, and natural language processing for communicating with humans in human language.


1.3.4 Disadvantages of Artificial Intelligence


Every technology has some disadvantages, and the same goes for Artificial Intelligence. Advantageous as it is, AI still has some disadvantages that we need to keep in mind while creating an AI system. Following are the disadvantages of AI:
 High cost: The hardware and software requirements of AI are very costly, and AI systems require a lot of maintenance to meet current requirements.
 Can't think outside the box: Even though we are making smarter machines with AI, they still cannot work outside the box; a robot will only do the work for which it is trained or programmed.
 No feelings and emotions: An AI machine can be an outstanding performer, but it does not have feelings, so it cannot form any kind of emotional attachment with humans, and it may sometimes be harmful to users if proper care is not taken.
 Increased dependency on machines: With advancing technology, people are becoming more dependent on devices and hence are exercising their mental capabilities less.
 No original creativity: Humans are creative and can imagine new ideas, but AI machines cannot match this power of human intelligence and cannot be creative and imaginative.


1.3.5 Applications of AI
Artificial Intelligence has various applications in today's society. It is becoming essential for our time because it can solve complex problems in an efficient way across multiple industries, such as healthcare, entertainment, finance, and education. AI is making our daily life more comfortable and fast. Following are some sectors that apply Artificial Intelligence:
AI in Astronomy
 Artificial Intelligence can be very useful for solving complex problems about the universe, such as how it works and how it originated.
AI in Healthcare
 In the last five to ten years, AI has become more advantageous for the healthcare industry and is going to have a significant impact on it.
 Healthcare industries are applying AI to make better and faster diagnoses than humans. AI can help doctors with diagnoses and can warn when a patient is worsening so that medical help can reach the patient before hospitalization.
AI in Gaming
 AI can be used for gaming. AI machines can play strategic games such as chess, where the machine needs to think about a large number of possible positions.
AI in Finance
 The AI and finance industries are the best match for each other. The finance industry is implementing automation, chatbots, adaptive intelligence, algorithmic trading, and machine learning in financial processes.
AI in Data Security
 The security of data is crucial for every company, and cyber-attacks are growing very rapidly in the digital world. AI can be used to make your data safer and more secure. Examples such as the AEG bot and the AI2 platform are used to detect software bugs and cyber-attacks more effectively.
AI in Social Media
 Social media sites such as Facebook, Twitter, and Snapchat contain billions of user profiles, which need to be stored and managed very efficiently. AI can organize and manage massive amounts of data, and it can analyze that data to identify the latest trends, hashtags, and the requirements of different users.


AI in Travel & Transport
 AI is becoming highly in demand in the travel industry. AI is capable of doing various travel-related tasks, from making travel arrangements to suggesting hotels, flights, and the best routes to customers. Travel industries are using AI-powered chatbots that can interact with customers in a human-like way for better and faster responses.
AI in the Automotive Industry
 Some automotive companies are using AI to provide virtual assistants to their users for better performance. For example, Tesla has introduced TeslaBot, an intelligent virtual assistant.
 Various companies are currently working on developing self-driving cars that can make your journey safer and more secure.
AI in Robotics
 Artificial Intelligence has a remarkable role in robotics. Usually, general robots are programmed to perform some repetitive task, but with the help of AI we can create intelligent robots that can perform tasks from their own experience without being pre-programmed.
 Humanoid robots are the best examples of AI in robotics; recently, the intelligent humanoid robots Erica and Sophia were developed, which can talk and behave like humans.
AI in Entertainment
 We already use some AI-based applications in our daily life through entertainment services such as Netflix and Amazon. With the help of ML/AI algorithms, these services show recommendations for programs or shows.
AI in Agriculture
 Agriculture is an area that requires various resources, labor, money, and time for the best results. Nowadays agriculture is becoming digital, and AI is emerging in this field. Agriculture is applying AI for agricultural robotics, soil and crop monitoring, and predictive analysis. AI in agriculture can be very helpful for farmers.
AI in E-commerce
 AI is providing a competitive edge to the e-commerce industry, and it is becoming more in demand in e-commerce business. AI helps shoppers discover associated products in their recommended size, color, or even brand.


AI in Education
 AI can automate grading, giving tutors more time to teach. An AI chatbot can communicate with students as a teaching assistant.
 In the future, AI could work as a personal virtual tutor for students, accessible at any time and any place.
1.4 HISTORY AND FUTURE OF ARTIFICIAL INTELLIGENCE
Artificial Intelligence is not a new word and not a new technology for researchers. This technology is much older than you might imagine; there are even myths of mechanical men in ancient Greek and Egyptian mythology. Following are some milestones in the history of AI that trace the journey from AI's beginnings to its development today.


1.4.1 Maturation of Artificial Intelligence (1943-1952)

 Year 1943: The first work now recognized as AI was done by Warren McCulloch and Walter Pitts in 1943. They proposed a model of artificial neurons.
 Year 1949: Donald Hebb demonstrated an updating rule for modifying the connection strength between neurons. His rule is now called Hebbian learning.
 Year 1950: Alan Turing, an English mathematician, pioneered machine learning in 1950. He published "Computing Machinery and Intelligence," in which he proposed a test that can check a machine's ability to exhibit intelligent behavior equivalent to human intelligence, called the Turing test.

1.4.2 The birth of Artificial Intelligence (1952-1956)

 Year 1955: Allen Newell and Herbert A. Simon created the "first artificial intelligence program," named the "Logic Theorist". This program proved 38 of 52 mathematics theorems and found new, more elegant proofs for some of them.
 Year 1956: The term "Artificial Intelligence" was first adopted by the American computer scientist John McCarthy at the Dartmouth Conference. For the first time, AI was coined as an academic field. At that time, high-level computer languages such as FORTRAN, LISP, and COBOL were being invented, and enthusiasm for AI was very high.

1.4.3 The golden years - early enthusiasm (1956-1974)

 Year 1966: Researchers emphasized developing algorithms that can solve mathematical problems. Joseph Weizenbaum created the first chatbot in 1966, named ELIZA.
 Year 1972: The first intelligent humanoid robot, named WABOT-1, was built in Japan.

1.4.4 The first AI winter (1974-1980)

 The period between 1974 and 1980 was the first AI winter. An AI winter refers to a period in which computer scientists dealt with a severe shortage of government funding for AI research.
 During AI winters, public interest in artificial intelligence declined.
1.4.5 A boom of AI (1980-1987)

 Year 1980: After the AI winter, AI came back with "expert systems". Expert systems were programmed to emulate the decision-making ability of a human expert.
 In the year 1980, the first national conference of the American Association of Artificial Intelligence was held at Stanford University.

1.4.6 The second AI winter (1987-1993)

 The period between 1987 and 1993 was the second AI winter.
 Investors and governments again stopped funding AI research due to its high cost and inefficient results, even though expert systems such as XCON had been very cost-effective.

1.4.7 The emergence of intelligent agents (1993-2011)

 Year 1997: In 1997, IBM's Deep Blue beat world chess champion Garry Kasparov, becoming the first computer to beat a world chess champion.
 Year 2002: For the first time, AI entered the home, in the form of Roomba, a vacuum cleaner.
 Year 2006: AI entered the business world. Companies like Facebook, Twitter, and Netflix started using AI.

1.4.8 Deep learning, big data and artificial general intelligence (2011-present)

 Year 2011: In 2011, IBM's Watson won Jeopardy!, a quiz show in which it had to solve complex questions as well as riddles. Watson proved that it could understand natural language and solve tricky questions quickly.
 Year 2012: Google launched an Android app feature, "Google Now", which was able to provide information to the user as predictions.
 Year 2014: In 2014, the chatbot "Eugene Goostman" won a competition based on the famous "Turing test."
 Year 2018: IBM's "Project Debater" debated complex topics with two master debaters and performed extremely well.
 Google demonstrated an AI program, "Duplex", a virtual assistant that booked a hairdresser appointment over the phone, and the lady on the other side did not notice that she was talking to a machine.

Now AI has developed to a remarkable level. Concepts such as deep learning, big data, and data science are booming. Nowadays companies like Google, Facebook, IBM, and Amazon are working with AI and creating amazing devices. The future of Artificial Intelligence is inspiring and will bring high intelligence.

1.4.9 Future of artificial intelligence


Autonomous Transportation:
In the future, automated transportation technology will evolve, and we will see on our roads scenes reminiscent of Back to the Future, where public buses, cabs, and even private vehicles go driverless and on autopilot. With greater precision, smart vehicles will take over the roads and pave the way for safer, faster, and more economical transport systems.

Robots in Risky Jobs:

Today, some of the most dangerous jobs are done by humans. From cleaning sewage to fighting fires and defusing bombs, it's we who get down, get our hands dirty, and risk our lives, and the number of human lives we lose in these processes is very high. In the near future, we can expect machines or robots to take care of such work. As artificial intelligence evolves and smarter robots roll out, we can see them replacing humans in some of the riskiest jobs in the world. That is the only time we expect automation to take away jobs.

Personal Assistants:
Virtual assistants already exist, and some of us have used them. However, as the technology grows, we can expect them to act as personal assistants and emote like humans. With artificial intelligence, deep learning, and neural networks, it is highly possible that we can make robots emote and make them assistants. They could be used for many different purposes, such as in the hospitality industry, day-care centers, elder care, clerical jobs, and more.

1.5 AGENT

An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors. An agent runs in a cycle of perceiving, thinking, and acting on those inputs.
Hence the world around us is full of agents: a thermostat, a cellphone, a camera, and even we ourselves are agents.
Sensor: A sensor is a device that detects a change in the environment and sends the information to other electronic devices. An agent observes its environment through sensors.
Actuators: Actuators are the components of machines that convert energy into motion. They are responsible for moving and controlling a system. An actuator can be an electric motor, gears, rails, etc.
Effectors: Effectors are the devices that affect the environment. Effectors can be legs, wheels, arms, fingers, wings, fins, or a display screen.


An AI system can be defined as the study of the rational agent and its environment. Agents sense the environment through sensors and act on the environment through actuators. An AI agent can have mental properties such as knowledge, belief, and intention.
An agent can be:
Human-Agent: A human agent has eyes, ears, and other organs that work as sensors, and hands, legs, and the vocal tract that work as actuators.

Robotic Agent: A robotic agent can have cameras and infrared range finders as sensors and various motors as actuators.


1.6 INTELLIGENT AGENTS

An intelligent agent is an autonomous entity that acts upon an environment using sensors and actuators to achieve goals. An intelligent agent may learn from the environment to achieve its goals. A thermostat is an example of an intelligent agent.
Following are the main four rules for an AI agent:
 Rule 1: An AI agent must have the ability to perceive the environment.
 Rule 2: The observation must be used to make decisions.
 Rule 3: The decision should result in an action.
 Rule 4: The action taken by an AI agent must be a rational action.
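The four rules above can be sketched as a perceive-think-act cycle. The example below uses a thermostat agent, mentioned earlier as a simple intelligent agent; the environment representation, class name, and temperature dynamics are all invented for illustration.

```python
class ThermostatAgent:
    def __init__(self, target):
        self.target = target

    def perceive(self, environment):
        # Rule 1: perceive the environment through a sensor reading.
        return environment["temperature"]

    def decide(self, percept):
        # Rules 2 and 3: the observation is used to choose an action.
        return "heater_on" if percept < self.target else "heater_off"

    def act(self, action, environment):
        # Rule 4: the action changes the environment toward the goal.
        delta = 1.0 if action == "heater_on" else -0.5
        environment["temperature"] += delta
        return action

env = {"temperature": 17.0}
agent = ThermostatAgent(target=20.0)
for _ in range(5):
    action = agent.act(agent.decide(agent.perceive(env)), env)
print(env["temperature"], action)
```

The action is rational in the sense of Rule 4: given only the current temperature reading, switching the heater on exactly when the room is below the target keeps the temperature near the goal.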

Types of AI Agents
Agents can be grouped into five classes based on their degree of perceived intelligence and capability. All these agents can improve their performance and generate better actions over time. They are given below:
o Simple Reflex Agent

o Model-based reflex agent

o Goal-based agents

o Utility-based agent

o Learning agent

1. Simple Reflex Agent:

o Simple reflex agents are the simplest agents. They take decisions on the basis of the current percepts and ignore the rest of the percept history.

o These agents only succeed in a fully observable environment.

o A simple reflex agent does not consider any part of the percept history during its decision and action process.

o The simple reflex agent works on the condition-action rule, which maps the current state to an action. An example is a room-cleaner agent, which works only if there is dirt in the room.

o Problems with the simple reflex agent design approach:

o They have very limited intelligence.

o They have no knowledge of non-perceptual parts of the current state.

o The rule tables are mostly too big to generate and store.

o They are not adaptive to changes in the environment.
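The room-cleaner example above can be written as a simple reflex agent in a few lines. The percept encoding (location, dirt flag) and the action names are invented for this sketch; the point is that the agent consults only the current percept, never the percept history.

```python
def reflex_vacuum_agent(percept):
    # The percept is only the current state: (location, is_dirty).
    location, dirty = percept
    if dirty:                 # condition: dirt here      -> action: suck
        return "Suck"
    if location == "A":       # condition: at A and clean -> move right
        return "Right"
    return "Left"             # condition: at B and clean -> move left

print(reflex_vacuum_agent(("A", True)))    # -> Suck
print(reflex_vacuum_agent(("A", False)))   # -> Right
print(reflex_vacuum_agent(("B", False)))   # -> Left
```

Because the function has no memory, it cannot notice, for example, that it has already cleaned a square twice; that limitation is exactly the "no percept history" problem listed above.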



1.6.1 Structure of an AI Agent

The task of AI is to design an agent program that implements the agent function. The structure of an intelligent agent is a combination of architecture and agent program. It can be viewed as:

Agent = Architecture + Agent program

The following are the main three terms involved in the structure of an AI agent:

Architecture: Architecture is the machinery that the AI agent executes on.

Agent function: The agent function maps a percept sequence to an action:

F : P* → A

Agent program: The agent program is an implementation of the agent function. An agent program executes on the physical architecture to produce the function F.
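The mapping F : P* → A can be made concrete with a table-driven agent program: it appends each new percept to the full percept history and looks the whole sequence up in a table. The table entries below reuse the vacuum-world style of example and are invented for illustration.

```python
# Table mapping percept *sequences* (elements of P*) to actions (A).
table = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}

percept_history = []   # P*: the sequence of everything perceived so far

def table_driven_agent(percept):
    percept_history.append(percept)
    # The agent function maps the entire percept sequence to an action;
    # sequences missing from the table fall back to "NoOp".
    return table.get(tuple(percept_history), "NoOp")

print(table_driven_agent(("A", "Clean")))   # -> Right
print(table_driven_agent(("B", "Dirty")))   # -> Suck
```

The table-driven approach makes the function F explicit, but it also shows why real agent programs compute actions instead of storing them: the table grows with every possible percept sequence, which quickly becomes infeasible.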

1.6.2 PEAS Representation

PEAS is a type of model on which an AI agent works. When we define an AI agent or rational agent, we can group its properties under the PEAS representation model. It is made up of four terms:
 P: Performance measure
 E: Environment


 A: Actuators
 S: Sensors
Here the performance measure is the objective for the success of an agent's behavior.
PEAS for a self-driving car:
For a self-driving car, the PEAS representation would be:
 Performance: Safety, time, legal drive, comfort
 Environment: Roads, other vehicles, road signs, pedestrians
 Actuators: Steering, accelerator, brake, signal, horn
 Sensors: Camera, GPS, speedometer, odometer, accelerometer, sonar
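A PEAS description is just structured data, so it can be recorded as a small data structure, which makes it easy to compare agents side by side. The class and variable names here are invented; this is an organizational sketch, not an executable agent.

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    performance: list   # P: how success is measured
    environment: list   # E: what the agent operates in
    actuators: list     # A: how the agent acts
    sensors: list       # S: how the agent perceives

self_driving_car = PEAS(
    performance=["Safety", "Time", "Legal drive", "Comfort"],
    environment=["Roads", "Other vehicles", "Road signs", "Pedestrians"],
    actuators=["Steering", "Accelerator", "Brake", "Signal", "Horn"],
    sensors=["Camera", "GPS", "Speedometer", "Odometer", "Accelerometer", "Sonar"],
)
print(self_driving_car.performance[0])   # -> Safety
```

Filling in the same four fields for any new agent (a medical diagnosis system, a vacuum cleaner, a part-picking robot) forces the designer to state explicitly what the agent senses, what it can do, and what counts as success.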

Example of Agents with their PEAS representation

1. Medical Diagnosis
 Performance measure: Patient health, disease diagnoses
 Environment: Patient, hospital, staff
 Actuators: Tests, treatments
 Sensors: Keyboard (entry of symptoms)

2. Vacuum Cleaner
 Performance measure: Cleanness, efficiency, battery life
 Environment: Room, table, wood floor, carpet, various obstacles
 Actuators: Wheels, brushes, vacuum extractor
 Sensors: Camera, dirt detection sensor, cliff sensor, bump sensor, infrared wall sensor

3. Part-picking Robot
 Performance measure: Percentage of parts in correct bins
 Environment: Conveyor belt with parts, bins
 Actuators: Jointed arms, hand
 Sensors: Camera, joint angle sensors

Problem Solving Agents in Artificial Intelligence

Are you curious to know how machines can solve complex problems, just like humans? Enter the world of artificial intelligence and meet one of its most critical players: the problem-solving agent. Problem-solving in artificial intelligence can be quite complex, requiring the use of multiple algorithms and data structures. One critical player is the problem-solving agent, which helps machines find solutions to problems. In this lecture, we'll explore what a problem-solving agent is, how it works in AI systems, and some exciting real-world applications that showcase its potential.

What is a Problem-Solving Agent?

Problem-solving in artificial intelligence is the process of finding a solution to a problem. There


are many different types of problems that can be solved, and the methods used will depend on the
specific problem. The most common type of problem is finding a solution to a maze or navigation
puzzle.

Other types of problems include identifying patterns, predicting outcomes, and determining
solutions to systems of equations. Each type of problem has its own set of techniques and tools
that can be used to solve it.

There are three main steps in problem-solving in artificial intelligence:



1) Understanding the problem: this step involves understanding the specifics of the problem and figuring out what needs to be done to solve it.

2) Generating possible solutions: this step involves coming up with as many solutions as possible, based on information about the problem and what you know about how computers work.

3) Choosing a solution: this step involves deciding which solution is best, based on what you know about the problem and your options for solving it.

Types of Problem-Solving Agents

Problem-solving agents are a type of artificial intelligence that helps automate problem-solving.
They can be used to solve problems in natural language, algebra, calculus, statistics, and machine
learning.

There are three types of problem-solving agents: propositional, predicate, and automata-based. Propositional problem-solving agents can understand simple statements like "draw a line between A and B" or "find the maximum value of x." Predicate problem-solving agents can understand more complex statements like "find the shortest path between two points" or "find all pairs of snakes in a jar." Automata-based agents are the simplest form of problem-solving agent and can only understand sequences of symbols like "draw a square."

The problem-solving agent follows this four-phase problem-solving process:

1. Goal Formulation: This is the first and most basic phase of problem solving. It establishes a target or goal that demands some activity to reach it. AI agents are now used to formulate goals.

2. Problem Formulation: This is one of the fundamental steps of problem solving; it determines what actions should be taken to reach the goal.

3. Search: After goal and problem formulation, the agent simulates sequences of actions and has to look for a sequence of actions that reaches the goal. This process is called search, and the sequence is called a solution. The agent might have to simulate multiple sequences that do not reach the goal, but eventually it will find a solution, or it will find that no solution is possible. A search algorithm takes a problem as input and outputs a sequence of actions.

4. Execution: After the search phase, the agent can execute the actions recommended by the search algorithm, one at a time. This final stage is known as the execution phase.

Components to formulate the associated problem:


1. Initial State

2. Actions

3. Transition Model

4. Goal Test

5. Path Cost

Initial State

It is the agent’s starting state or initial step towards its goal. For example, if a taxi agent needs to
travel to a location(B), but the taxi is already at location(A), the problem’s initial state would be
the location (A).

Actions

It is a description of the possible actions that the agent can take. Given a state s, Actions(s)
returns the actions that can be executed in s. Each of these actions is said to be applicable in s.

Transition Model (Successor function)

It describes what each action does. It is specified by a function Result(s, a) that returns the state
that results from doing action a in state s.

The initial state, actions, and transition model together define the state space (A state space is a
set of all possible states that it can reach from the current state. The nodes of a state space
represent states, and the arcs connecting them represent actions. A path is a set of states and the
actions that link them in the state space.) of a problem, a set of all states reachable from the initial
state by any sequence of actions. The state space forms a graph in which the nodes are states, and
the links between the nodes are actions.

Goal Test

It determines if the given state is a goal state. Sometimes there is an explicit list of potential goal
states, and the test merely verifies whether the provided state is one of them. The goal is
sometimes expressed via an abstract attribute rather than an explicitly enumerated set of
conditions.

Path Cost


It assigns a numerical cost to each path that leads to the goal. The problem-solving agent chooses
a cost function that matches its performance measure. Remember that the optimal solution has
the lowest path cost of all the solutions.
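The five components above can be sketched as a small Python class. This is a minimal illustration, not code from the text: the road map, the step costs, and the class name are invented for the example.

```python
# A hypothetical route-finding problem showing the five components:
# initial state, actions, transition model, goal test, and path cost.

ROADS = {  # invented map: state -> {neighbour: step cost}
    "A": {"B": 5, "C": 3},
    "B": {"D": 4},
    "C": {"D": 7},
    "D": {},
}

class RouteProblem:
    def __init__(self, initial, goal):
        self.initial = initial            # 1. initial state
        self.goal = goal

    def actions(self, s):                 # 2. actions available in state s
        return list(ROADS[s])

    def result(self, s, a):               # 3. transition model Result(s, a)
        return a                          # moving towards a neighbour lands in it

    def goal_test(self, s):               # 4. goal test
        return s == self.goal

    def step_cost(self, s, a):            # 5. path cost, accumulated step by step
        return ROADS[s][a]

problem = RouteProblem("A", "D")
print(problem.actions("A"))    # ['B', 'C']
print(problem.goal_test("D"))  # True
```

A search algorithm would use exactly these five hooks: start at `problem.initial`, expand states with `actions`/`result`, stop when `goal_test` succeeds, and compare solutions by summed `step_cost`.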

Example Problems

The problem-solving approach has been used in a wide range of work contexts. There are two
kinds of problems:

1. Standardized/ Toy Problem: Its purpose is to demonstrate or practice various problem


solving techniques. It can be described concisely and precisely, making it appropriate as a
benchmark for academics to compare the performance of algorithms.

2. Real-world Problems: These are problems from the real world that need solutions. Unlike a
toy problem, a real-world problem does not have a single agreed-upon description, but we can
give a basic description of the issue.

Some Standardized/Toy Problems

8 Puzzle Problem

In a sliding-tile puzzle, a number of tiles (sometimes called blocks or pieces) are arranged in a
grid with one or more blank spaces so that some of the tiles can slide into the blank space. One variant
is the Rush Hour puzzle, in which cars and trucks slide around a 6 x 6 grid in an attempt to free a car
from the traffic jam. Perhaps the best-known variant is the 8-puzzle (see Figure below), which
consists of a 3 x 3 grid with eight numbered tiles and one blank space. The object is to reach a
specified goal state, such as the one shown on the right of the figure. The standard formulation of the
8-puzzle is as follows:

 States: a state description specifies the location of each of the eight tiles and the blank in one of the nine squares.
 Initial state: any state can be designated as the initial state.
 Actions: movements of the blank space: Left, Right, Up, or Down.
 Transition model: given a state and an action, this returns the resulting state.
 Goal test: this checks whether the state matches the goal configuration.
 Path cost: each step costs 1, so the path cost is the number of steps in the path.

Vacuum World Problem



Let us take a vacuum cleaner agent: it can move left or right, and its job is to suck up the dirt
from the floor.
A grid world problem is a two-dimensional rectangular array of square cells through which agents
can move. Typically, the agent can go to any nearby cell that is clear of obstacles, either horizontally
or vertically, and in rare cases diagonally. A wall or other impassable obstruction in a cell prohibits an
agent from moving into that cell.
The vacuum world’s problem can be stated as follows:

 States: the agent is in one of two locations, each of which may or may not contain dirt, giving 2 x 2^2 = 8 states.
 Initial state: any state can be designated as the initial state.
 Actions: Left, Right, and Suck.
 Transition model: the actions have their expected effects, except that moving Left in the leftmost square, moving Right in the rightmost square, and Sucking in a clean square have no effect.
 Goal test: this checks whether all the squares are clean.
 Path cost: each step costs 1.

Searching Strategies: Introduction

Searching is a process to find the solution for a given set of problems. In artificial intelligence,
this can be done by using either uninformed or informed searching strategies.

Uninformed searches, also known as blind searches, are search algorithms that explore a problem
space without using any specific knowledge or heuristics about the problem domain. They operate
by brute force, meaning they blindly try out every part of the search space (the space of all feasible
solutions, i.e., the set of solutions among which the desired solution resides, is called the search
space). A brute-force algorithm is a simple, comprehensive search strategy that systematically
explores every option until a problem’s answer is discovered.

Examples of uninformed search algorithms

Uninformed searches rely solely on the given problem definition and operate systematically to find
a solution. Examples of uninformed search algorithms include breadth-first search (BFS),
depth-first search (DFS), uniform-cost search (UCS), depth-limited search, and iterative
deepening depth-first search. Although all these examples work in a brute-force way, they differ in
the way they traverse the nodes.

Informed Search Algorithms

So far we have talked about the uninformed search algorithms, which look through the search space
for all possible solutions to the problem without any additional knowledge about the search space.
An informed search algorithm, in contrast, uses additional knowledge such as how far we are from
the goal, the path cost, how to reach the goal node, etc. This knowledge helps agents explore less of
the search space and find the goal node more efficiently.


What is Heuristics?

A heuristic is a technique used to solve a problem faster than classic methods, or to find an
approximate solution when classic methods cannot find an exact one. Heuristics are
problem-solving techniques that result in practical and quick solutions.

Psychologists Daniel Kahneman and Amos Tversky developed the study of heuristics in
human decision-making in the 1970s and 1980s. However, the concept was first introduced by
the Nobel laureate Herbert A. Simon, whose primary object of research was problem-solving.

Why do we need heuristics?

Heuristics are used in situations where a short-term solution is required. When facing complex
situations with limited resources and time, heuristics help companies make quick decisions through
shortcuts and approximate calculations. Most heuristic methods involve mental shortcuts based on
past experiences.

The heuristic method might not always provide the best solution, but it helps us find a good
solution in a reasonable time.

Best First Search Algorithm in AI:

The best-first search uses the concept of a priority queue and heuristic search. It is a search
algorithm that works on a specific rule: the aim is to reach the goal from the initial state via the
shortest path. The best-first search algorithm in artificial intelligence is used for finding the
shortest path from a given starting node to a goal node in a graph. The algorithm works by


expanding, at each step, the node that the evaluation function judges most promising, until the
goal node is reached.

If we consider searching as a form of traversal in a graph, an uninformed search algorithm would


blindly traverse to the next node in a given manner without considering the cost associated with that
step. An informed search, like best-first search, on the other hand, would use an evaluation function
to decide which among the various available nodes is the most promising (or ‘BEST’) before
traversing to that node.

Best-first search uses the concept of a priority queue and heuristic search. To search the graph
space, the method uses two lists for tracking the traversal: an ‘OPEN’ list that keeps track of the
current ‘immediate’ nodes available for traversal, and a ‘CLOSED’ list that keeps track of the nodes
already traversed.
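The OPEN/CLOSED scheme described above can be sketched as follows. This is a minimal greedy best-first search; the graph and the heuristic values h are invented for illustration, and Python's heapq serves as the OPEN priority queue.

```python
import heapq

# Greedy best-first search: always expand the node with the smallest
# heuristic value h. OPEN is a priority queue keyed on h; CLOSED is a set.

def best_first_search(graph, h, start, goal):
    open_list = [(h[start], start, [start])]   # (h-value, node, path so far)
    closed = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)  # most promising node
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        for nbr in graph[node]:                   # move neighbours to OPEN
            if nbr not in closed:
                heapq.heappush(open_list, (h[nbr], nbr, path + [nbr]))
    return None

# Invented example: two routes from S to G; h prefers going via A.
graph = {"S": ["A", "B"], "A": ["G"], "B": ["G"], "G": []}
h = {"S": 5, "A": 2, "B": 4, "G": 0}
print(best_first_search(graph, h, "S", "G"))  # ['S', 'A', 'G']
```

Note that greedy best-first search is guided only by h, so it is fast but not guaranteed optimal; A* (next section) fixes this by also counting the cost already paid.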


A Star (A*) Search Algorithm

The A* algorithm is a searching algorithm that searches for the shortest path between the initial and the final
state. It is used in various applications, such as maps.

In maps the A* algorithm is used to calculate the shortest distance between the source (initial state) and the
destination (final state).

How it works
Imagine a square grid with many obstacles scattered randomly. The initial and the final cells are
provided. The aim is to reach the final cell in the shortest amount of time.

Explanation
The core of the A* algorithm is based on cost functions and heuristics. It uses two main parameters:

 g : the cost of moving from the initial cell to the current cell. Basically, it is the sum of the costs of
all the moves made since leaving the first cell.

 h : also known as the heuristic value, it is the estimated cost of moving from the current cell to the
final cell. The actual cost cannot be calculated until the final cell is reached; hence, h is an
estimated cost. We must make sure that the cost is never overestimated (such a heuristic is
called admissible).

 f : it is the sum of g and h. So, f = g + h

The way that the algorithm makes its decisions is by taking the f-value into account. The algorithm selects
the smallest f-valued cell and moves to that cell. This process continues until the algorithm reaches its goal cell.
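The f = g + h decision rule can be sketched on a small grid. This is a minimal illustration, assuming unit move costs and the Manhattan distance as the admissible heuristic; the grid layout (0 = free, 1 = obstacle) is invented for the example.

```python
import heapq

# A* on a grid: repeatedly expand the cell with the smallest f = g + h.
def a_star(grid, start, goal):
    rows, cols = len(grid), len(grid[0])

    def h(cell):  # Manhattan distance: admissible for 4-directional moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_list = [(h(start), 0, start, [start])]   # (f, g, cell, path)
    best_g = {start: 0}                            # cheapest g found per cell
    while open_list:
        f, g, cell, path = heapq.heappop(open_list)
        if cell == goal:
            return path
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1                         # each move costs 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(open_list,
                                   (ng + h((nr, nc)), ng, (nr, nc),
                                    path + [(nr, nc)]))
    return None                                    # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],   # a wall forces a detour around the right side
        [0, 0, 0]]
path = a_star(grid, (0, 0), (2, 0))
print(path)  # [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
```

Because h never overestimates the remaining cost, the first time the goal is popped from the queue its path is guaranteed to be the cheapest one.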


The Various Types Of Uninformed Search Algorithms

Breadth-first Search


Breadth-first search (BFS) is one of the most important uninformed search strategies in artificial
intelligence for exploring a search space systematically. BFS explores all the neighbouring nodes of the
initial state before moving on to explore their neighbours. This strategy ensures that the shortest path
(in terms of the number of steps) to the goal is found.

The algorithm works by starting at the initial state and adding all its neighbors to a queue. It then dequeues
the first node in the queue, adds neighbors to the end of the queue, and repeats the process until the goal
state is found or the queue is empty.

Here are the steps for performing BFS in a search space:

BFS explores all the nodes at a given distance (or level) from the starting node before moving on to explore
the nodes at the next distance (or level) from the starting node. This means that BFS visits all the nodes that
are closest to the starting node before moving on to nodes that are farther away.

We use the queue data structure to implement BFS.

 Add the initial state to a queue.


 While the queue is not empty, dequeue the first node.
 If the node is the goal state, return it.
 If the node is not the goal state, add all its neighbors to the end of the queue.
 Repeat steps 2-4 until the goal state is found or the queue is empty.
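The queue-based steps above can be sketched in Python. The small graph here is invented for illustration (it mirrors the A-to-E example worked through later in this section).

```python
from collections import deque

# BFS: the queue holds paths; FIFO order guarantees we finish one level
# before starting the next, so the first path to reach the goal is shortest.

def bfs(graph, start, goal):
    queue = deque([[start]])          # step 1: add the initial state
    explored = set()
    while queue:                      # steps 2-4, repeated
        path = queue.popleft()        # dequeue the first node's path
        node = path[-1]
        if node == goal:              # goal test
            return path
        if node in explored:
            continue
        explored.add(node)
        for nbr in graph[node]:       # enqueue all neighbours at the back
            queue.append(path + [nbr])
    return None                       # queue empty: goal unreachable

graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}
print(bfs(graph, "A", "E"))  # ['A', 'C', 'E']
```

Storing whole paths in the queue keeps the sketch short; a production version would store parent pointers instead to save memory.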

Advantages

Breadth-first search (BFS) is an algorithm used in artificial intelligence to explore a search space
systematically. Some advantages of BFS include the following:


 Completeness:
BFS is guaranteed to find the goal state if it exists in the search space, provided the branching factor
is finite.
 Optimal solution:
BFS is guaranteed to find the shortest path to the goal state, as it explores all nodes at the same
depth before moving on to nodes at a deeper level.
 Simplicity:
BFS is easy to understand and implement, making it a good baseline algorithm for more complex
search algorithms.
 No redundant paths:
BFS does not explore redundant paths because it explores all nodes at the same depth before
moving on to deeper levels.

Disadvantages

 Memory-intensive:
BFS can be memory-intensive for large search spaces because it stores all the nodes at each level in
the queue.
 Time-intensive:
BFS can be time-intensive for search spaces with a high branching factor because it needs to
explore many nodes before finding the goal state.
 Inefficient for deep search spaces:
BFS can be inefficient for search spaces with a deep depth because it needs to explore all nodes at
each depth before moving on to the next level.

Time and Space Complexity

The time and space complexity of breadth-first search (BFS) in artificial intelligence can vary depending
on the size and structure of the search space.

Time complexity:
The time complexity of BFS is proportional to the number of nodes in the search space, as BFS explores all
nodes at each level before moving on to deeper levels. For example, if the goal state is at the deepest level,
BFS must explore all nodes in the search space, resulting in a time complexity of O(b^d), where b is the
branching factor and d is the depth of the search space.

Space complexity:
The space complexity of BFS is proportional to the maximum number of nodes stored in the queue during
the search. For example, if the search space is a tree, the maximum number of nodes stored in the queue at
any given time is the number of nodes at the deepest level, which is proportional to b^d. Therefore, the
space complexity of BFS is O(b^d).

Example
Suppose we have a search space with an initial state "A" and a goal state "E" connected by nodes as
follows:

    A
   / \
  B   C
  |   |
  D   E


To perform BFS on this search space, we start by adding the initial state "A" to a queue:

Queue: A
Explored: {}

We dequeue the first node in the queue, which is "A", and add its children "B" and "C" to the end of the
queue:

Queue: B, C
Explored: {A}

We then dequeue "B" and add its child "D" to the end of the queue:

Queue: C, D
Explored: {A, B}

We dequeue "C" and add its child "E" to the end of the queue:

Queue: D, E
Explored: {A, B, C}

Finally, we dequeue "D" and "E" and find that "E" is the goal state, so we have successfully found a path
from "A" to "E" using BFS.

Depth-first Search

Depth-first search (DFS) is popular among the uninformed search strategies in artificial intelligence to
explore and traverse a graph or tree data structure. The algorithm starts at a given node in the graph and
explores as far as possible along each branch before backtracking.

DFS is a recursive algorithm that follows these steps:

 Mark the starting node as visited.
 Explore all adjacent nodes that have not been visited.
 For each unvisited adjacent node, repeat steps 1 and 2 recursively.
 Backtrack if all adjacent nodes have been visited or there are no unvisited nodes.

DFS can be implemented using a stack data structure or recursion. The recursive implementation is
simpler to understand but can cause a stack overflow if the graph or tree is too large.

DFS has several applications in AI, including pathfinding, searching for solutions to a problem, and
exploring the state space of a problem. It is particularly useful when the solution is far from the starting
node because it can explore the graph deeply before exploring widely.

Advantages

 Memory efficiency:
DFS uses less memory than breadth-first search because it only needs to keep track of a single path
at a time.
 Finds a solution quickly:
If the solution to a problem is located deep in a tree, DFS can quickly reach it by exploring one path
until it reaches the solution.
 Easy to implement:
DFS is a simple algorithm to understand and implement, especially when using recursion.
 Can be used for certain types of problems:
DFS is particularly useful for problems that involve searching for a path, such as maze-solving or
finding the shortest path between two nodes in a graph.

Disadvantages

 Can get stuck in infinite loops:


DFS can get stuck in an infinite loop if there are cycles in the graph or tree. This can be avoided by
keeping track of visited nodes.
 May not find the optimal solution:
DFS does not always find the shortest path to a solution; it may find a suboptimal path before
finding the shortest one.
 Can take a long time:
In some cases, DFS may take a long time to find a solution, especially if the solution is located far
from the starting node.

Example
Traversing a binary tree

Consider the following binary tree:



        1
      /   \
     2     3
    / \   / \
   4   5 6   7

To traverse this tree using DFS, we start at the root node (1) and explore as far as possible along each
branch before backtracking. Here is the order in which the nodes would be visited.

1 -> 2 -> 4 -> 5 -> 3 -> 6 -> 7

We first visit the root node (1), then the left child (2), and so on. Once we reach a leaf node (4), we
backtrack to the last node with an unexplored child (2) and continue exploring its other child (5). We
continue this process until all nodes have been visited.
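The traversal above can be reproduced with a short recursive sketch; representing the tree as an adjacency list is an illustrative choice.

```python
# Recursive DFS over the binary tree above. Visiting a node before its
# children gives the preorder sequence shown in the text.

tree = {1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [], 5: [], 6: [], 7: []}

def dfs(node, visited=None):
    if visited is None:
        visited = []
    visited.append(node)          # visit the current node first
    for child in tree[node]:      # then explore each branch fully
        dfs(child, visited)       # recursion acts as the implicit stack
    return visited

print(dfs(1))  # [1, 2, 4, 5, 3, 6, 7]
```

The call stack does the backtracking automatically: when a leaf's loop finishes, control returns to the parent, which moves on to its next unexplored child.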

Uniform-cost Search Algorithm:

Uniform-cost search is a searching algorithm used for traversing a weighted tree or graph. This
algorithm comes into play when a different cost is available for each edge. The primary goal of
uniform-cost search is to find a path to the goal node which has the lowest cumulative cost.
Uniform-cost search expands nodes according to their path costs from the root node. It can be used
to solve any graph/tree where an optimal cost is in demand. The uniform-cost search algorithm is
implemented using a priority queue, which gives maximum priority to the lowest cumulative cost.
Uniform-cost search is equivalent to the BFS algorithm if the path cost of all edges is the same.



Example

Input: Let the graph be as below with source node being A and destination E.

Output:

The shortest path here is A−>B−>C−>E forming the cost 17.

Advantages:

o Uniform cost search is optimal because at every state the path with the least cost is chosen.

Disadvantages:

o It does not care about the number of steps involved in searching and is only concerned with path
cost, due to which this algorithm may get stuck in an infinite loop (for example, when zero-cost
edges exist).
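The priority-queue behaviour described above can be sketched as follows. The weighted graph is invented for illustration (it is not the graph from the figure).

```python
import heapq

# Uniform-cost search: the frontier is a priority queue keyed on the
# cumulative path cost, so the cheapest known path is always expanded first.

def ucs(graph, start, goal):
    frontier = [(0, start, [start])]      # (cumulative cost, node, path)
    best = {start: 0}                     # cheapest cost found per node
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:                  # popped => cheapest path to goal
            return cost, path
        for nbr, step in graph[node].items():
            new_cost = cost + step
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                heapq.heappush(frontier, (new_cost, nbr, path + [nbr]))
    return None

# Invented weighted graph: the direct edge A->C (5) loses to A->B->C (3).
graph = {"A": {"B": 2, "C": 5}, "B": {"C": 1, "D": 8},
         "C": {"D": 3}, "D": {}}
print(ucs(graph, "A", "D"))  # (6, ['A', 'B', 'C', 'D'])
```

With all step costs equal to 1, this reduces to breadth-first search, matching the equivalence noted above.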

Introduction to Depth Limited Search


Depth-limited search is another uninformed search algorithm. The unbounded-tree problem that
appears in the depth-first search algorithm can be fixed by imposing a boundary, or limit, on the
depth of the search domain. We call this limit the depth limit; it makes the DFS search strategy
more refined and organized. We denote this limit by l, and it provides the solution to the
infinite-path problem that arises in the DFS algorithm. Thus, depth-limited search can be called an
extended and refined version of the DFS algorithm. In a nutshell, to avoid looping forever while
executing the search, the depth-limited search algorithm explores only a finite depth called the
depth limit.

Algorithm
This algorithm essentially follows a similar set of steps as in the DFS algorithm.

1. The start node or node 1 is added to the beginning of the stack.

2. Then it is marked as visited, and if node 1 is not the goal node in the search,

then we push second node 2 on top of the stack.

3. Next, we mark it as visited and check if node 2 is the goal node or not.

4. If node 2 is not found to be the goal node, then we push node 4 on top of the stack.

5. Now we search within the same depth limit and move along depth-wise to check for the goal node.

6. If node 4 is also not found to be the goal node and the depth limit has been reached, then we
retrace back to the nearest nodes that remain unvisited or unexplored.

7. Then we push them into the stack and mark them visited.

8. We continue to perform these steps in iterative ways unless the goal node is

reached or until all nodes within depth limit have been explored for the goal.

Depth-limited search terminates under either of these two conditions:


1. When the goal node is found.

2. When there is no solution within the given depth limit.
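The depth-limit idea can be sketched recursively: ordinary DFS with a cutoff l. The small tree is invented for illustration, and the "cutoff" marker distinguishes "no solution within the limit" from "no solution at all".

```python
# Depth-limited search: DFS that refuses to descend below depth l.
# Returns the path if the goal is within the limit, "cutoff" if the
# limit was hit somewhere, or None if no solution exists at any depth.

def dls(graph, node, goal, limit, path=None):
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:                      # depth limit reached: cut off here
        return "cutoff"
    cutoff = False
    for child in graph.get(node, []):
        result = dls(graph, child, goal, limit - 1, path)
        if result == "cutoff":
            cutoff = True               # remember that depth was the blocker
        elif result is not None:
            return result               # goal found on this branch
    return "cutoff" if cutoff else None

graph = {1: [2, 3], 2: [4], 3: [], 4: []}
print(dls(graph, 1, 4, limit=1))  # 'cutoff'  (goal sits at depth 2)
print(dls(graph, 1, 4, limit=2))  # [1, 2, 4]
```

Iterative deepening simply calls this sketch with l = 0, 1, 2, ... until the result is no longer "cutoff".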


Understanding the Confusion Matrix in Machine Learning

Machine learning models are increasingly used in various applications to classify data into different
categories. However, evaluating the performance of these models is crucial to ensure their accuracy and
reliability. One essential tool in this evaluation process is the confusion matrix.

The confusion matrix is a matrix used to determine the performance of classification models for a given
set of test data. It can only be computed if the true values for the test data are known. Since it shows the
errors in the model's performance in the form of a matrix, it is also known as an error matrix. Some
features of the confusion matrix are given below:

o For 2 prediction classes of a classifier, the matrix is a 2*2 table; for 3 classes, it is a 3*3 table; and
so on.

o The matrix is divided into two dimensions, that are predicted values and actual values along with
the total number of predictions.

o Predicted values are those values, which are predicted by the model, and actual values are the true
values for the given observations.

o It looks like the table below (rows show predicted values, columns show actual values):

                     Actual: Yes        Actual: No
    Predicted: Yes   True Positive      False Positive
    Predicted: No    False Negative     True Negative

The above table has the following cases:

o True Negative: The model has predicted No, and the real or actual value was also No.

o True Positive: The model has predicted Yes, and the actual value was also Yes.

o False Negative: The model has predicted No, but the actual value was Yes. It is also called a
Type-II error.

o False Positive: The model has predicted Yes, but the actual value was No. It is also called a
Type-I error.


o With the help of the confusion matrix, we can calculate the different parameters for the model, such
as accuracy, precision, etc.

Example: We can understand the confusion matrix using an example.

Suppose we are trying to create a model that predicts whether or not a person has a particular disease.
The confusion matrix for this is given as (the cell values below follow from the totals listed afterwards):

                     Actual: Yes    Actual: No
    Predicted: Yes   TP = 24        FP = 8
    Predicted: No    FN = 3         TN = 65

From the above example, we can conclude that:

o The table is given for the two-class classifier, which has two predictions "Yes" and "No." Here,
Yes means that the patient has the disease, and No means that the patient does not have the disease.

o The classifier has made a total of 100 predictions. Out of 100 predictions, 89 are true predictions,
and 11 are incorrect predictions.

o The model has given prediction "yes" for 32 times, and "No" for 68 times. Whereas the actual
"Yes" was 27, and actual "No" was 73 times.

Calculations using Confusion Matrix:

We can perform various calculations for the model, such as the model's accuracy, using this matrix.
These calculations are given below:

o Classification Accuracy: It is one of the important parameters to determine the accuracy of
classification problems. It defines how often the model predicts the correct output. It can be
calculated as the ratio of the number of correct predictions made by the classifier to the total
number of predictions made by the classifier:

Accuracy = (TP + TN) / (TP + TN + FP + FN)


o Misclassification rate: It is also termed the error rate, and it defines how often the model gives
wrong predictions. The error rate can be calculated as the ratio of the number of incorrect
predictions to the total number of predictions made by the classifier:

Error rate = (FP + FN) / (TP + TN + FP + FN)

o Precision: Precision is a metric used to measure how well a machine learning model predicts
positive classes. It is the fraction of the model's positive predictions that were actually true:

Precision = TP / (TP + FP)

o Recall: Recall measures the effectiveness of a classification model in identifying all relevant
instances from a dataset; that is, how often the model correctly identifies positive instances (true
positives) out of all the actual positive samples. It is the ratio of the number of true positive (TP)
instances to the sum of true positive and false negative (FN) instances:

Recall = TP / (TP + FN)

o F-measure: If one model has low precision and high recall or vice versa, it is difficult to
compare models. For this purpose, we can use the F-score, which helps us evaluate recall and
precision at the same time. The F-score is maximum when recall equals precision:

F-measure = (2 * Recall * Precision) / (Recall + Precision)
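These metrics can be computed directly from the four counts. The cell values TP = 24, TN = 65, FP = 8, FN = 3 are the ones implied by the worked example's totals (100 predictions, 89 correct, 32 predicted Yes, 27 actual Yes).

```python
# Confusion-matrix metrics for the disease example in this section.
TP, TN, FP, FN = 24, 65, 8, 3
total = TP + TN + FP + FN              # 100 predictions in all

accuracy  = (TP + TN) / total          # how often the model is correct
error     = (FP + FN) / total          # misclassification rate
precision = TP / (TP + FP)             # correct among predicted positives
recall    = TP / (TP + FN)             # found among actual positives
f_measure = 2 * precision * recall / (precision + recall)

print(accuracy)             # 0.89
print(round(precision, 2))  # 0.75
print(round(recall, 2))     # 0.89
```

Note that accuracy and error always sum to 1, and the F-measure (about 0.81 here) sits between precision and recall, closer to the smaller of the two.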


Machine Learning
A rapidly developing field of technology, machine learning allows computers to
automatically learn from previous data. For building mathematical models and making
predictions based on historical data or information, machine learning employs a variety
of algorithms. It is currently being used for a variety of tasks, including speech recognition,
email filtering, auto-tagging on Facebook, a recommender system, and image
recognition.

What is Machine Learning?


In the real world, we are surrounded by humans who can learn everything from their
experiences with their learning capability, and we have computers or machines which
work on our instructions. But can a machine also learn from experiences or past data like
a human does? So here comes the role of Machine Learning.

Introduction to Machine Learning


A subset of artificial intelligence known as machine learning focuses primarily on the
creation of algorithms that enable a computer to independently learn from data and
previous experiences. Arthur Samuel first used the term "machine learning" in 1959. It
could be summarized as follows:


Without being explicitly programmed, machine learning enables a machine to


automatically learn from data, improve performance from experiences, and predict things.
Machine learning algorithms create a mathematical model that, without being explicitly
programmed, aids in making predictions or decisions with the assistance of sample
historical data, or training data. For the purpose of developing predictive models, machine
learning brings together statistics and computer science. Algorithms that learn from
historical data are either constructed or utilized in machine learning.
A machine can learn if it can gain more data to improve its performance.

How does Machine Learning work


A machine learning system builds prediction models, learns from previous data, and
predicts the output of new data whenever it receives it. The more data available, the better the
model that can be built, and the more accurate the predicted output will be.
Let's say we have a complex problem in which we need to make predictions. Instead of
writing code, we just need to feed the data to generic algorithms, which build the logic
based on the data and predict the output. Our perspective on the issue has changed as a
result of machine learning. The Machine Learning algorithm's operation is depicted in the
following block diagram:
Difference between Machine Learning and Traditional Programming
The difference between Machine Learning, Traditional Programming, and Artificial Intelligence is as follows:

Machine Learning:
o Machine Learning is a subset of artificial intelligence (AI) that focuses on learning from data to
develop an algorithm that can be used to make a prediction.
o Machine Learning uses a data-driven approach; it is typically trained on historical data and then
used to make predictions on new data.
o ML can find patterns and insights in large datasets that might be difficult for humans to discover.

Traditional Programming:
o In traditional programming, rule-based code is written by the developers depending on the
problem statements.
o Traditional programming is typically rule-based and deterministic; it has no self-learning
features like Machine Learning and AI.
o Traditional programming is totally dependent on the intelligence of the developers, so it has
very limited capability.

Artificial Intelligence:
o Artificial Intelligence involves making the machine as capable as possible, so that it can perform
tasks that typically require human intelligence.
o AI can involve many different techniques, including Machine Learning and Deep Learning, as
well as traditional rule-based programming.
o Sometimes AI uses a combination of both data and pre-defined rules, which gives it a great edge
in solving complex tasks with good accuracy that seem impossible to humans.
Machine Learning lifecycle:

The lifecycle of a machine learning project involves a series of steps that include:
1. Study the Problem: The first step is to study the problem. This step involves
understanding the business problem and defining the objectives of the model.
2. Data Collection: When the problem is well-defined, we can collect the relevant data
required for the model. The data could come from various sources such as databases,
APIs, or web scraping.

3. Data Preparation: Once our problem-related data is collected, it is a good idea
to check the data properly and put it into the desired format so that the model can
use it to find hidden patterns. This can be done in the following steps:
 Data cleaning
 Data transformation
 Exploratory data analysis and feature engineering
 Splitting the dataset for training and testing

4. Model Selection: The next step is to select the appropriate machine learning
algorithm that is suitable for our problem. This step requires knowledge of the strengths
and weaknesses of different algorithms. Sometimes we use multiple models and
compare their results and select the best model as per our requirements.

5. Model building and Training: After selecting the algorithm, we have to build the
model.


1. In the case of traditional machine learning, building the model is easy; it involves
just a few hyperparameter tunings.

2. In the case of deep learning, we have to define the layer-wise architecture along
with the input and output sizes, the number of nodes in each layer, the loss function,
the gradient descent optimizer, etc.

3. After that, the model is trained using the preprocessed dataset.

6. Model Evaluation: Once the model is trained, it can be evaluated on the test dataset
to determine its accuracy and performance using different techniques like classification
report, F1 score, precision, recall, ROC Curve, Mean Square error, absolute error, etc.
7. Model Tuning: Based on the evaluation results, the model may need to be tuned or
optimized to improve its performance. This involves tweaking the hyperparameters of
the model.

8. Deployment: Once the model is trained and tuned, it can be deployed in a production
environment to make predictions on new data. This step requires integrating the model
into an existing software system or creating a new system for the model.
9. Monitoring and Maintenance: Finally, it is essential to monitor the model’s
performance in the production environment and perform maintenance tasks as required.
This involves monitoring for data drift, retraining the model as needed, and updating
the model as new data becomes available.

Various Applications of Machine Learning

Now in this Machine learning tutorial, let’s learn the applications of Machine Learning:
• Automation: Machine learning systems can operate autonomously in many fields
without the need for human intervention. For example, robots perform the
essential process steps in manufacturing plants.

• Finance industry: Machine learning is growing in popularity in the finance industry.
Banks mainly use ML to find patterns in their data, and also to prevent fraud.
• Government organizations: Governments use ML to manage public
safety and utilities. China, for example, uses large-scale face recognition
to deter jaywalking.

• Healthcare industry: Healthcare was one of the first industries to use machine
learning with image detection.


• Marketing: AI is used broadly in marketing thanks to abundant access to data.
Before the age of mass data, researchers developed advanced mathematical tools like
Bayesian analysis to estimate the value of a customer. With the boom of data,
marketing departments rely on AI to optimize customer relationships and marketing
campaigns.
• Retail industry: Machine learning is used in the retail industry to analyze customer
behavior, predict demand, and manage inventory. It also helps retailers to personalize
the shopping experience for each customer by recommending products based on their
past purchases and preferences.

• Transportation: Machine learning is used in the transportation industry to optimize


routes, reduce fuel consumption, and improve the overall efficiency of transportation
systems. It also plays a role in autonomous vehicles, where ML algorithms are used to
make decisions about navigation and safety.

Classification of Machine Learning

At a broad level, machine learning can be classified into three types:

1. Supervised learning

2. Unsupervised learning

3. Reinforcement learning

Supervised Machine Learning


Supervised learning is a type of machine learning in which machines are trained using well-"labelled"
training data, and on the basis of that data, machines predict the output. Labelled data means input
data that is already tagged with the correct output.

In supervised learning, the training data provided to the machines works as the supervisor that teaches the
machines to predict the output correctly. It applies the same concept as a student learning under the
supervision of a teacher.

Supervised learning is a process of providing input data as well as correct output data to the machine
learning model. The aim of a supervised learning algorithm is to find a mapping function to map the
input variable(x) with the output variable(y).


In the real-world, supervised learning can be used for Risk Assessment, Image classification, Fraud
Detection, spam filtering, etc.

How Supervised Learning Works?


In supervised learning, models are trained using a labelled dataset, from which the model learns about each type of
data. Once the training process is completed, the model is tested on test data (a portion of the dataset held
out from training), and then it predicts the output.

The working of supervised learning can be easily understood by the following example:

Suppose we have a dataset of different types of shapes, including squares, rectangles, triangles, and
polygons. The first step is to train the model on each shape:

o If the given shape has four sides, and all the sides are equal, then it will be labelled as a Square.

o If the given shape has three sides, then it will be labelled as a triangle.

o If the given shape has six equal sides, then it will be labelled as a hexagon.

Now, after training, we test our model using the test set, and the task of the model is to identify the shape.

The machine is already trained on all types of shapes, and when it finds a new shape, it classifies the shape
on the basis of its number of sides and predicts the output.

Steps Involved in Supervised Learning:


o First, determine the type of training dataset.

o Collect/Gather the labelled training data.

o Split the dataset into a training set, a test set, and a validation set.

o Determine the input features of the training dataset, which should carry enough information for
the model to accurately predict the output.
o Determine the suitable algorithm for the model, such as support vector machine, decision tree, etc.

o Execute the algorithm on the training dataset. Sometimes we also need a validation set to tune
control parameters; it is a subset of the training data.
o Evaluate the accuracy of the model by providing the test set. If the model predicts the correct
outputs, it is accurate.
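The three-way split mentioned in the steps above can be sketched as two calls to scikit-learn's train_test_split: first hold out a test set, then carve a validation set out of the remaining data. The 60/20/20 proportions below are an illustrative choice, not a rule from the text.

```python
# Splitting one dataset into training, validation, and test sets.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)   # 50 toy samples, 2 features
y = np.arange(50) % 2               # toy labels

# Hold out 20% of the data as the test set.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# 25% of the remaining 80% = 20% of the whole dataset for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=1)

print(len(X_train), len(X_val), len(X_test))  # 30 10 10
```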

Types of supervised Machine learning Algorithms:


Supervised learning can be further divided into two types of problems:

1. Regression

Regression algorithms are used when there is a relationship between the input variable and a continuous
output variable. They are used for the prediction of continuous quantities, such as weather forecasting and market trends.
Below are some popular regression algorithms which come under supervised learning:

o Linear Regression

o Regression Trees

o Non-Linear Regression

o Bayesian Linear Regression

o Polynomial Regression
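A minimal sketch of the first algorithm in the list above, linear regression, fitting a continuous target. The four data points are made up for illustration and follow roughly y = 2x.

```python
# Fitting a simple linear regression on a tiny synthetic dataset.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # input variable (x)
y = np.array([2.1, 3.9, 6.0, 8.1])           # continuous output (roughly 2x)

model = LinearRegression().fit(X, y)
print(model.coef_[0])          # learned slope, close to 2
print(model.predict([[5.0]]))  # prediction for an unseen input
```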

2. Classification


Classification algorithms are used when the output variable is categorical, for example two
classes such as Yes/No, Male/Female, or True/False, as in spam filtering. Below are some popular
classification algorithms which come under supervised learning:

o Random Forest

o Decision Trees

o Logistic Regression

o Support vector Machines

Note: We will discuss these algorithms in detail in later chapters.
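As a small taste of the classification algorithms listed above, here is logistic regression on a two-class toy problem. The six data points are an illustrative assumption (think of the feature as a "spam score" and the classes as not-spam/spam).

```python
# Binary classification with logistic regression on toy data.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])   # two categories, e.g. not-spam / spam

clf = LogisticRegression().fit(X, y)
preds = clf.predict([[0.8], [3.8]])  # one sample from each side
print(preds)
```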

Advantages of Supervised learning:

o With the help of supervised learning, the model can predict the output on the basis of prior
experiences.
o In supervised learning, we can have an exact idea about the classes of objects.

o Supervised learning model helps us to solve various real-world problems such as fraud detection,
spam filtering, etc.

Disadvantages of supervised learning:

o Supervised learning models are not suitable for handling complex tasks.

o Supervised learning cannot predict the correct output if the test data differs substantially from the
training dataset.
o Training requires a lot of computation time.

o In supervised learning, we need enough knowledge about the classes of objects.

Unsupervised Machine Learning

In the previous topic, we learned supervised machine learning in which models are trained using labeled
data under the supervision of training data. But there may be many cases in which we do not have labeled
data and need to find the hidden patterns from the given dataset. So, to solve such types of cases in
machine learning, we need unsupervised learning techniques.

What is Unsupervised Learning?


As the name suggests, unsupervised learning is a machine learning technique in which models are not
supervised using a labelled training dataset. Instead, the model itself finds hidden patterns and insights in the given
data. It can be compared to the learning that takes place in the human brain when learning new things. It can
be defined as:

Unsupervised learning is a type of machine learning in which models are trained using an unlabeled dataset
and are allowed to act on that data without any supervision.

Unsupervised learning cannot be directly applied to a regression or classification problem because, unlike
supervised learning, we have the input data but no corresponding output data. The goal of unsupervised
learning is to find the underlying structure of the dataset, group the data according to similarities, and
represent the dataset in a compressed format.

Example: Suppose an unsupervised learning algorithm is given an input dataset containing images of
different types of cats and dogs. The algorithm is never trained on the given dataset, which means it has
no prior idea about the features of the dataset. The task of the unsupervised learning algorithm is to
identify the image features on its own. It will perform this task by clustering the image dataset into
groups according to the similarities between images.

Why use Unsupervised Learning?


Below are some main reasons which describe the importance of Unsupervised Learning:

o Unsupervised learning is helpful for finding useful insights from the data.

o Unsupervised learning is similar to how a human learns to think from their own experiences, which
makes it closer to true AI.


o Unsupervised learning works on unlabeled and uncategorized data, which makes it
especially important.
o In the real world, we do not always have input data with corresponding outputs; unsupervised
learning is needed to handle such cases.

Working of Unsupervised Learning

The working of unsupervised learning can be understood as follows:

Here, we take unlabeled input data, meaning it is not categorized and no corresponding outputs are
given. This unlabeled input data is fed to the machine learning model in order to train it.
The model first interprets the raw data to find the hidden patterns in it and then applies a suitable
algorithm, such as k-means clustering or hierarchical clustering.

Once it applies the suitable algorithm, the algorithm divides the data objects into groups according to the
similarities and differences between the objects.

Types of Unsupervised Learning Algorithm:

The unsupervised learning algorithm can be further categorized into two types of problems:


o Clustering: Clustering is a method of grouping the objects into clusters such that objects with most
similarities remains into a group and has less or no similarities with the objects of another group.
Cluster analysis finds the commonalities between the data objects and categorizes them as per the
presence and absence of those commonalities.
o Association: An association rule is an unsupervised learning method used for finding
relationships between variables in a large database. It determines the sets of items that occur
together in the dataset. Association rules make marketing strategies more effective; for example,
people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical application
of association rules is market basket analysis.
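The support and confidence behind an association rule like bread → butter can be computed directly from transaction counts. The tiny basket dataset below is made up purely for illustration.

```python
# Support and confidence of the rule "bread -> butter" from raw transactions.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk", "butter"},
    {"bread", "butter"},
]
n = len(transactions)
bread = sum("bread" in t for t in transactions)            # baskets with bread
both = sum({"bread", "butter"} <= t for t in transactions) # baskets with both

support = both / n          # fraction of baskets containing bread AND butter
confidence = both / bread   # of the baskets with bread, how many have butter
print(support, confidence)  # 0.6 0.75
```

Real association mining algorithms such as Apriori automate this counting over all candidate item sets, keeping only rules above chosen support/confidence thresholds.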

Note: We will learn these algorithms in later chapters.

Unsupervised Learning algorithms:

Below is the list of some popular unsupervised learning algorithms:

o K-means clustering

o Hierarchical clustering

o Anomaly detection

o Neural Networks

o Principal Component Analysis

o Independent Component Analysis


o Apriori algorithm

o Singular value decomposition

Clustering in Machine Learning

Clustering or cluster analysis is a machine learning technique, which groups the unlabelled dataset. It can
be defined as "A way of grouping the data points into different clusters, consisting of similar data
points. The objects with the possible similarities remain in a group that has less or no similarities with
another group."

It does it by finding some similar patterns in the unlabelled dataset such as shape, size, color, behavior, etc.,
and divides them as per the presence and absence of those similar patterns.

It is an unsupervised learning method, hence no supervision is provided to the algorithm, and it deals with
the unlabeled dataset.

After applying this clustering technique, each cluster or group is assigned a cluster ID, which the ML
system can use to simplify the processing of large and complex datasets.

The clustering technique is commonly used for statistical data analysis.

Note: Clustering is somewhere similar to the classification algorithm, but the difference is the type of
dataset that we are using. In classification, we work with the labeled data set, whereas in clustering, we
work with the unlabelled dataset.

Example: Let's understand the clustering technique with a real-world example of a mall. When we visit
any shopping mall, we can observe that things with similar usage are grouped together: t-shirts are
grouped in one section and trousers in another; similarly, in the produce section, apples, bananas,
mangoes, etc. are grouped separately, so that we can easily find things. The clustering technique works
in the same way. Another example of clustering is grouping documents according to topic.

The clustering technique can be widely used in various tasks. Some most common uses of this technique
are:

o Market Segmentation

o Statistical data analysis



o Social network analysis

o Image segmentation

o Anomaly detection, etc.

Apart from these general usages, Amazon uses clustering in its recommendation system to provide
recommendations based on past product searches, and Netflix uses it to recommend movies and
web series based on watch history.

As an illustration of a clustering algorithm at work, imagine different fruits being divided into several
groups with similar properties.

Applications of Clustering

Below are some commonly known applications of clustering technique in Machine Learning:


o In Identification of Cancer Cells: Clustering algorithms are widely used for the identification
of cancerous cells. They separate cancerous and non-cancerous samples into different groups.

o In Search Engines: Search engines also work on the clustering technique. The search result
appears based on the closest object to the search query. It does it by grouping similar data objects in
one group that is far from the other dissimilar objects. The accurate result of a query depends on the
quality of the clustering algorithm used.

o Customer Segmentation: It is used in market research to segment the customers based on their
choice and preferences.

o In Biology: It is used in the biology stream to classify different species of plants and animals using
the image recognition technique.

Types of Clustering Methods

The clustering methods are broadly divided into hard clustering (each data point belongs to only one group)
and soft clustering (a data point can belong to more than one group), but various other approaches
to clustering also exist. Below are the main clustering methods used in machine learning:

1. Partitioning Clustering
2. Density-Based Clustering
3. Distribution Model-Based Clustering
4. Hierarchical Clustering
5. Fuzzy Clustering

Partitioning Clustering

It is a type of clustering that divides the data into non-hierarchical groups. It is also known as the centroid-
based method. The most common example of partitioning clustering is the K-Means Clustering
algorithm.

In this type, the dataset is divided into a set of k groups, where K defines the number of pre-defined
groups. Cluster centers are placed so that each data point is closer to its own cluster's centroid than
to any other cluster's centroid.


K-Means Clustering Algorithm

K-Means Clustering is an unsupervised learning algorithm used to solve clustering problems in
machine learning and data science. In this topic, we will learn what the k-means clustering algorithm is
and how it works, along with a Python implementation of k-means clustering.

What is K-Means Algorithm?

K-Means Clustering is an Unsupervised Learning algorithm, which groups the unlabeled dataset into
different clusters. Here K defines the number of pre-defined clusters that need to be created in the process,
as if K=2, there will be two clusters, and for K=3, there will be three clusters, and so on.

It is an iterative algorithm that divides the unlabeled dataset into k different clusters in such a way that each
data point belongs to only one group, whose members have similar properties.

It allows us to cluster the data into different groups and is a convenient way to discover the categories of
groups in the unlabeled dataset on its own without the need for any training.

It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this
algorithm is to minimize the sum of distances between the data point and their corresponding clusters.


The algorithm takes the unlabeled dataset as input, divides the dataset into k clusters, and
repeats the process until the cluster assignments stop changing. The value of k must be predetermined in this
algorithm.

The k-means clustering algorithm mainly performs two tasks:

o Determines the best value for K center points or centroids by an iterative process.

o Assigns each data point to its closest center. The data points nearest to a particular center
form a cluster.

Hence each cluster contains data points with some commonalities and is distant from the other clusters.


How does the K-Means Algorithm Work?

The working of the K-Means algorithm is explained in the below steps:

Step-1: Select the number K to decide the number of clusters.

Step-2: Select K random points as centroids. (They need not be points from the input dataset.)

Step-3: Assign each data point to their closest centroid, which will form the predefined K clusters.

Step-4: Compute the new centroid of each cluster (the mean of its assigned points).


Step-5: Repeat step 3, i.e. reassign each data point to the new closest centroid of each
cluster.

Step-6: If any reassignment occurs, then go to step-4 else go to FINISH.

Step-7: The model is ready.
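The steps above can be sketched directly in NumPy. This is a minimal from-scratch version for illustration, not the scikit-learn implementation used in the next section, and the four toy points are made up.

```python
# Minimal from-scratch k-means following the steps above.
import numpy as np

def kmeans(x, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 2: pick k random data points as initial centroids
    centroids = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        # Step 3: assign each point to its closest centroid
        dists = np.linalg.norm(x[:, None] - centroids, axis=2)
        labels = np.argmin(dists, axis=1)
        # Step 4: recompute each centroid as the mean of its cluster
        new = np.array([x[labels == j].mean(axis=0) for j in range(k)])
        # Step 6: if nothing changed, we are done
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

x = np.array([[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [8.2, 7.9]])
labels, _ = kmeans(x, k=2)
print(labels)   # the two nearby pairs end up in the same cluster
```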

Python Implementation of the K-Means Clustering Algorithm

Before implementation, let's understand what type of problem we will solve here. So, we have a dataset
of Mall_Customers, which is the data of customers who visit the mall and spend there.

In the given dataset, we have Customer_Id, Gender, Age, Annual Income ($), and Spending
Score (a calculated value indicating how much a customer has spent in the mall; the higher the value, the
more they have spent). From this dataset, we need to find some patterns; since this is an unsupervised
method, we do not know the exact categories in advance.

The steps to be followed for the implementation are given below:

 Data pre-processing

 Finding the optimal number of clusters using the elbow method

 Training the K-Means algorithm on the training data set

 Visualizing the clusters

1. Data Pre-Processing. Import the libraries, load the dataset, and extract the
independent variables.

# importing libraries

import numpy as nm

import matplotlib.pyplot as mtp

import pandas as pd

# Importing the dataset

dataset = pd.read_csv('Mall_Customers_data.csv')

# Extract only the 3rd and 4th features (Annual Income, Spending Score):
# we need a 2-D plot to visualize the model, and features such as
# Customer_Id are not required.
x = dataset.iloc[:, [3, 4]].values

From the above dataset, we need to find some patterns in it.

2. Find the optimal number of clusters using the elbow method. Here’s the
code you use:

#finding optimal number of clusters using the elbow method


from sklearn.cluster import KMeans

wcss_list= [] #Initializing the list for the values of WCSS

#Using for loop for iterations from 1 to 10.

for i in range(1, 11):

kmeans = KMeans(n_clusters=i, init='k-means++', random_state= 42)

kmeans.fit(x)

wcss_list.append(kmeans.inertia_)

mtp.plot(range(1, 11), wcss_list)

mtp.title('The Elbow Method Graph')

mtp.xlabel('Number of clusters(k)')

mtp.ylabel('wcss_list')

mtp.show()

As we can see in the above code, we have used the KMeans class of the sklearn.cluster library to form the
clusters.

Next, we have created the wcss_list variable to initialize an empty list, which is used to contain the value
of wcss computed for different values of k ranging from 1 to 10.

After that, we have initialized the for loop for the iteration over different values of k ranging from 1 to 10;
since range() in Python excludes the upper bound, it is written as 11 to include the value 10.

The rest part of the code is similar as we did in earlier topics, as we have fitted the model on a matrix of
features and then plotted the graph between the number of clusters and WCSS.


Output: After executing the above code, we will get the below output:

From the above plot, we can see the elbow point is at 5. So the number of clusters here will be 5.

3. Train the K-means algorithm on the training dataset. However, instead of using i, use 5, because
there are 5 clusters that need to be formed. Here’s the code:

#training the K-means model on a dataset

kmeans = KMeans(n_clusters=5, init='k-means++', random_state= 42)

y_predict= kmeans.fit_predict(x)

The first line is the same as above, creating the object of the KMeans class. In the second line,
fit_predict both trains the model and returns the cluster label for each data point; these labels are
stored in the y_predict variable.

4. Visualize the Clusters. Since this model has five clusters, we need to
visualize each one.

#visualizing the clusters
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s = 100, c = 'blue', label = 'Cluster 1') #for first cluster
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s = 100, c = 'green', label = 'Cluster 2') #for second cluster
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s = 100, c = 'red', label = 'Cluster 3') #for third cluster
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4') #for fourth cluster
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5') #for fifth cluster
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 300, c = 'yellow', label = 'Centroid')
mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()

Output:


The output image clearly shows the five clusters in different colors. The clusters are formed
along two parameters of the dataset: annual income and spending score. The colors and labels can be
changed as required. We can also observe the following patterns from the clusters:

o Cluster 1 shows customers with average salary and average spending, so we can categorize them
as average.

o Cluster 2 shows customers with high income but low spending, so we can categorize them
as careful.

o Cluster 3 shows customers with low income and low spending, so they can be categorized as sensible.

o Cluster 4 shows customers with low income but very high spending, so they can be categorized
as careless.

o Cluster 5 shows customers with high income and high spending, so they can be categorized as
target customers; these can be the most profitable customers for the mall owner.

Advantages of Unsupervised Learning

o Unsupervised learning is used for more complex tasks as compared to supervised learning because,
in unsupervised learning, we don't have labeled input data.
o Unsupervised learning is preferable as it is easy to get unlabeled data in comparison to labeled data.


Disadvantages of Unsupervised Learning

o Unsupervised learning is intrinsically more difficult than supervised learning as it does not have
corresponding output.
o The result of the unsupervised learning algorithm might be less accurate as input data is not labeled,
and algorithms do not know the exact output in advance.

3) Reinforcement Learning
Reinforcement learning is a feedback-based learning method in which a learning agent gets a reward for
each right action and a penalty for each wrong action. The agent learns automatically from this
feedback and improves its performance. In reinforcement learning, the agent interacts with the
environment and explores it. The goal of the agent is to collect the maximum reward, and it improves
its performance accordingly.
• Model-based reinforcement learning: as the name suggests, the agent tries to understand its environment
and builds a model of it based on its interactions with that environment. In such a system, preferences
take priority over the consequences of the actions, i.e. a greedy agent will always try to perform the action
that gets the maximum reward, irrespective of what that action may cause.

• Model-free reinforcement learning: On the other hand, model-free algorithms seek to learn the
consequences of their actions through experience via algorithms such as Policy Gradient, Q-Learning, etc.
In other words, such an algorithm will carry out an action multiple times and will adjust the policy (the
strategy behind its actions) for optimal rewards, based on the outcomes.
Some popular model-free reinforcement learning algorithms include Q-Learning, SARSA, and Deep
Reinforcement Learning.

Decision Tree Classifier Building in Scikit-learn

Scikit-learn is an open-source library in Python that helps us implement machine learning models. It
provides a collection of handy tools, such as regression and classification algorithms, to simplify complex
machine learning problems.

Importing Required Libraries

Let's first load the required libraries.


# Load libraries



import pandas as pd
from sklearn.tree import DecisionTreeClassifier # Import Decision Tree Classifier
from sklearn.model_selection import train_test_split # Import train_test_split function
from sklearn import metrics # Import scikit-learn metrics module for accuracy calculation

Code Explanation:

This code imports necessary libraries for building a decision tree classifier model.
 pandas is a library used for data manipulation and analysis.
 DecisionTreeClassifier is a class from the sklearn.tree module that is used to build a
decision tree classifier model.
 train_test_split is a function from the sklearn.model_selection module that is used to
split the dataset into training and testing sets.
 metrics is a module from the sklearn library that provides various metrics for evaluating
the performance of a machine learning model.
By importing these libraries, the user can use their functions and classes to build and evaluate a
decision tree classifier model.

Loading Data
Let's first load the required Pima Indian Diabetes dataset using pandas' read_csv function. You can
download the Kaggle dataset to follow along.

col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']

# load dataset

pima = pd.read_csv("diabetes.csv", header=None, names=col_names)  # if your CSV already has a header row, also pass skiprows=1

Code Explanation:
This code creates a list of column names called col_names which will be used to label the columns of a
dataset. Then, it loads a dataset called "diabetes.csv" into a Pandas DataFrame called pima. The
header=None argument specifies that the dataset does not have a header row, and the names=col_names
argument assigns the column names from the col_names list to the DataFrame.

pima.head()

The pima.head() function is used to display the first few rows of the dataset pima. This is useful for getting
a quick overview of the data and checking if it has been loaded correctly. By default, the head() function
displays the first 5 rows of the dataset, but you can specify a different number of rows to display by passing
an integer argument to the function.


pregnant glucose bp skin insulin bmi pedigree age label

0 6 148 72 35 0 33.6 0.627 50 1

1 1 85 66 29 0 26.6 0.351 31 0

2 8 183 64 0 0 23.3 0.672 32 1

3 1 89 66 23 94 28.1 0.167 21 0

4 0 137 40 35 168 43.1 2.288 33 1

Feature Selection

Here, you need to divide given columns into two types of variables dependent (or target variable) and
independent variable (or feature variables).

#split dataset in features and target variable

feature_cols = ['pregnant', 'insulin', 'bmi', 'age','glucose','bp','pedigree']

X = pima[feature_cols] # Features

y = pima.label # Target variable

Code Explanation:
This code is written in Python, and it is used to split a dataset into features and target variable.
The first line defines a list of feature columns that will be used to create the feature matrix. The list contains the names of
the columns that will be used as features, which are 'pregnant', 'insulin', 'bmi', 'age', 'glucose', 'bp', and 'pedigree'.
The second line creates a feature matrix X by selecting the columns specified in the feature_cols list from the pima dataset.
The feature matrix X will contain the values of the selected columns for each observation in the dataset.



The third line creates a target variable y by selecting the 'label' column from the pima dataset. The target variable y will
contain the values of the 'label' column for each observation in the dataset.
Overall, this code is used to prepare the data for machine learning by separating the features and target variable into
separate variables.

Splitting Data

To understand model performance, dividing the dataset into a training set and a test
set is a good strategy.

Let's split the dataset by using the function train_test_split(). You need to pass three
parameters: features, target, and test set size.

# Split dataset into training set and test set

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1) # 70% training and 30% test

Code Explanation:

This code uses the train_test_split function from the sklearn.model_selection module to split a dataset into
a training set and a test set.

The X and y variables represent the features and target variable of the dataset, respectively. The test_size
parameter is set to 0.3, which means that 30% of the data will be used for testing and 70% will be used for
training. The random_state parameter is set to 1, which ensures that the same random split is generated
each time the code is run.
The function returns four arrays: X_train, X_test, y_train, and y_test. X_train and y_train represent the
training set, while X_test and y_test represent the test set. These arrays can be used to train and evaluate a
machine learning model.

Building Decision Tree Model

Let's create a decision tree model using Scikit-learn.

# Create Decision Tree classifier object

clf = DecisionTreeClassifier()

# Train Decision Tree classifier

clf = clf.fit(X_train,y_train)

#Predict the response for test dataset

y_pred = clf.predict(X_test)

Code Explanation:

This Python code creates a decision tree classifier object using the DecisionTreeClassifier() function. It then
trains the classifier using the fit() method with the training data X_train and y_train. Finally, it uses the
trained classifier to predict the response for the test dataset X_test and stores the predictions in y_pred.

Evaluating the Model


Let's estimate how accurately the classifier can predict the correct class for unseen
samples.

Accuracy can be computed by comparing actual test set values and predicted values.

# Model Accuracy, how often is the classifier correct?

print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

Code Explanation:

The metrics.accuracy_score() function is used to calculate the accuracy of a classification model. It takes
two arguments: y_test and y_pred. y_test is the true labels of the test set, and y_pred is the predicted labels
of the test set.
The print() function is used to display the accuracy score on the console. The output will be a string that
says "Accuracy:" followed by the actual accuracy score.

Accuracy: 0.6753246753246753

The value 0.6753246753246753 is the accuracy score, a metric used to evaluate the performance of a
classification model: the higher the score, the better the model is at making correct predictions.
We got a classification rate of 67.53%, which is reasonable accuracy. You can improve this accuracy by
tuning the parameters of the decision tree algorithm.
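The accuracy computation itself is simple enough to do by hand. The sketch below (with hypothetical labels) reproduces what metrics.accuracy_score computes: the fraction of predictions that match the true labels.

```python
# Hypothetical true and predicted labels (for illustration only)
y_true = [1, 0, 1, 1, 0]
y_hat = [1, 0, 0, 1, 0]

# Accuracy = number of correct predictions / total predictions
accuracy = sum(t == p for t, p in zip(y_true, y_hat)) / len(y_true)
print(accuracy)  # 0.8
```

Here 4 of the 5 predictions match, giving an accuracy of 0.8.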


Visualizing Decision Trees

You can use Scikit-learn's export_graphviz function to display the tree within a Jupyter notebook. For
plotting the tree, you also need to install graphviz and pydotplus.

pip install graphviz

pip install pydotplus

The export_graphviz function converts the decision tree classifier into a dot file, and pydotplus converts
this dot file to png or displayable form on Jupyter.

from sklearn.tree import export_graphviz

from io import StringIO # sklearn.externals.six was removed in recent scikit-learn versions

from IPython.display import Image

import pydotplus

dot_data = StringIO()

export_graphviz(clf, out_file=dot_data,
filled=True, rounded=True,
special_characters=True, feature_names=feature_cols, class_names=['0','1'])

graph = pydotplus.graph_from_dot_data(dot_data.getvalue())

graph.write_png('diabetes.png')

Image(graph.create_png())

Code Explanation:

This code snippet is used to visualize a decision tree model created using scikit-learn's clf object.
First, the necessary libraries are imported: export_graphviz from sklearn.tree, a StringIO buffer to hold
the dot data, Image from IPython.display, and pydotplus.


Then, a StringIO object is created to store the dot data generated by export_graphviz. The export_graphviz
function is called with the clf object as the first argument and various parameters to customize the
appearance of the tree. The dot data is then written to the StringIO object.
Next, pydotplus is used to create a graph from the dot data stored in the StringIO object. The resulting
graph is saved as a PNG file named "diabetes.png". Finally, the graph is displayed using Image from
IPython.display.
Overall, this code generates a visual representation of a decision tree model, which can be useful for
understanding how the model makes predictions and identifying areas for improvement.
The next step is optimizing decision tree performance, for example by tuning hyperparameters such as
criterion, max_depth, and min_samples_leaf.
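As a sketch of such tuning (using a synthetic dataset in place of the diabetes data, so the exact accuracy value will differ), one might restrict the depth of the tree and switch the split criterion:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn import metrics

# Synthetic stand-in for the real dataset (illustration only)
X, y = make_classification(n_samples=500, n_features=8, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

# Limiting max_depth and using entropy as the split criterion
# often reduces overfitting compared to an unpruned tree
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=1)
clf = clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print("Accuracy:", metrics.accuracy_score(y_test, y_pred))
```

A shallow tree is also easier to visualize and interpret than a fully grown one.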

What is an Expert System?

An expert system is a computer program that is designed to solve complex problems and to provide
decision-making ability like a human expert. It performs this by extracting knowledge from its knowledge
base using the reasoning and inference rules according to the user queries.

The expert system is a part of AI, and the first ES was developed in the year 1970; it was one of the first
successful applications of artificial intelligence. An expert system solves complex problems as an expert
would, by drawing on the knowledge stored in its knowledge base, and it supports decision making using
both facts and heuristics, like a human expert. It is called an expert system because it contains the expert
knowledge of a specific domain and can solve complex problems of that particular domain. These systems
are designed for a specific domain, such as medicine or science.

The performance of an expert system is based on the expert's knowledge stored in its knowledge base. The
more knowledge stored in the KB, the more that system improves its performance. One of the common
examples of an ES is a suggestion of spelling errors while typing in the Google search box.

Below is the block diagram that represents the working of an expert system:


Note: It is important to remember that an expert system is not used to replace the human experts; instead,
it is used to assist the human in making a complex decision. These systems do not have human capabilities
of thinking and work on the basis of the knowledge base of the particular domain.

Below are some popular examples of the Expert System:

o DENDRAL: It was an artificial intelligence project that was made as a chemical analysis expert
system. It was used in organic chemistry to detect unknown organic molecules with the help of their
mass spectra and knowledge base of chemistry.

o MYCIN: It was one of the earliest backward chaining expert systems that was designed to find the
bacteria causing infections like bacteraemia and meningitis. It was also used for the
recommendation of antibiotics and the diagnosis of blood clotting diseases.

o PXDES: It is an expert system used to determine the type and stage of lung cancer. To
determine the disease, it analyzes an image of the upper body that appears as a shadow; this
shadow pattern identifies the type and degree of harm.

o CaDeT: The CaDet expert system is a diagnostic support system that can detect cancer at early
stages.

Characteristics of Expert System

o High Performance: The expert system provides high performance for solving any type of complex
problem of a specific domain with high efficiency and accuracy.

o Understandable: It responds in a way that is easily understood by the user. It can take
input in human language and provides output in the same way.


o Reliable: It is highly reliable, generating efficient and accurate output.

o Highly responsive: ES provides the result for any complex query within a very short period of
time.

Components of Expert System

An expert system mainly consists of three components:

o User Interface

o Inference Engine

o Knowledge Base

1. User Interface

With the help of a user interface, the expert system interacts with the user, takes queries as an input in a
readable format, and passes it to the inference engine. After getting the response from the inference engine,
it displays the output to the user. In other words, it is an interface that helps a non-expert user to
communicate with the expert system to find a solution.


2. Inference Engine (Rules of Engine)

o The inference engine is known as the brain of the expert system as it is the main processing unit of
the system. It applies inference rules to the knowledge base to derive a conclusion or deduce new
information. It helps in deriving an error-free solution of queries asked by the user.

o With the help of an inference engine, the system extracts the knowledge from the knowledge base.

Inference engine uses the below modes to derive the solutions:

o Forward Chaining: Forward chaining is a data-driven strategy where the system starts with the
available data (initial facts) and applies rules to derive new conclusions or facts. It proceeds in a
step-by-step manner, applying rules iteratively until no further conclusions can be drawn.
o Process:
o Begin with the known facts.
o Apply applicable rules to these facts.
o Generate new facts or conclusions.
o Continue this process until no more new facts can be inferred.
o Use Cases: Commonly used in systems where data or facts are readily available and the goal is to
derive a conclusion or reach a goal state.
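The forward-chaining loop described above can be sketched in a few lines of Python. The rules and facts below are hypothetical, and this is a minimal illustration, not a real inference engine:

```python
# Each rule maps a set of premise facts to a single conclusion (hypothetical)
rules = [
    ({"has_fever", "has_cough"}, "flu_suspected"),
    ({"flu_suspected"}, "recommend_rest"),
]

def forward_chain(initial_facts, rules):
    """Apply rules repeatedly until no new facts can be derived."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all its premises are already known facts
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(forward_chain({"has_fever", "has_cough"}, rules))
```

Note how the second rule fires only after the first one has added "flu_suspected" to the fact set, which is exactly the step-by-step derivation described above.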

o Backward Chaining: Backward chaining is a goal-driven strategy where the system starts with a goal
or conclusion and works backward through the rules to find evidence (facts) that support the goal. It
determines what must be true in order for a given goal to be achieved.


o Process:
o Start with the goal or conclusion to be proven.
o Identify rules that could lead to this goal.
o Check if the conditions of these rules are satisfied using available facts or by further backward
chaining.
o If necessary, continue this process recursively until either a solution is found or it is determined that
the goal cannot be achieved.
o Use Cases: Often used in diagnostic systems, where the goal is to identify the causes of observed
symptoms or outcomes.
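A matching backward-chaining sketch, using the same hypothetical rule format, starts from the goal and recursively proves each premise (this assumes the rule set is acyclic; a real engine would also track visited goals to avoid infinite recursion):

```python
# Same hypothetical rule format: premises -> conclusion
rules = [
    ({"has_fever", "has_cough"}, "flu_suspected"),
    ({"flu_suspected"}, "recommend_rest"),
]

def backward_chain(goal, facts, rules):
    """Return True if the goal is a known fact or derivable from the rules."""
    if goal in facts:
        return True
    for premises, conclusion in rules:
        # Try every rule concluding the goal; prove its premises recursively
        if conclusion == goal and all(
                backward_chain(p, facts, rules) for p in premises):
            return True
    return False

print(backward_chain("recommend_rest", {"has_fever", "has_cough"}, rules))  # True
```

Unlike forward chaining, nothing is derived that does not contribute to the goal, which is why this strategy suits diagnostic systems.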

3. Knowledge Base

o The knowledge base is a store of knowledge acquired from different experts of the particular
domain. It can be considered a large repository of knowledge. The larger and more accurate the
knowledge base, the more precise the expert system.

o It is similar to a database that contains information and rules of a particular domain or subject.

o One can also view the knowledge base as collections of objects and their attributes. Such as a Lion
is an object and its attributes are it is a mammal, it is not a domestic animal, etc.

Components of Knowledge Base

o Factual Knowledge: The knowledge which is based on facts and accepted by knowledge engineers
comes under factual knowledge.


o Heuristic Knowledge: This knowledge is based on practice, the ability to guess, evaluation, and
experiences.

Knowledge Representation: It is used to formalize the knowledge stored in the knowledge base, for
example using if-else rules.

Knowledge Acquisition: It is the process of extracting, organizing, and structuring the domain
knowledge, specifying the rules to acquire knowledge from various experts, and storing that knowledge
in the knowledge base.
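The object/attribute view and the if-else rule formalization can be illustrated together in a short sketch (the animal facts and the rule are hypothetical examples, not a real knowledge base):

```python
# An object ("Lion") with its attributes, as in the example above
lion = {"name": "Lion", "class": "mammal", "domestic": False}

def is_wild_mammal(animal):
    # If-else rule over attributes: a non-domestic mammal is a wild mammal
    if animal["class"] == "mammal" and not animal["domestic"]:
        return True
    else:
        return False

print(is_wild_mammal(lion))  # True
```

A real knowledge base would hold many such objects and rules, with the inference engine deciding which rules to apply.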

Development of Expert System

Here, we will explain the working of an expert system by taking the example of the MYCIN ES. Below are
the steps in building MYCIN:

o Firstly, ES should be fed with expert knowledge. In the case of MYCIN, human experts specialized
in the medical field of bacterial infection, provide information about the causes, symptoms, and
other knowledge in that domain.

o Once the KB of MYCIN is updated, it can be tested. The doctor provides a new problem: to
identify the presence of the bacteria by inputting the details of a patient, including the
symptoms, current condition, and medical history.

o The ES will need a questionnaire to be filled by the patient to know the general information about
the patient, such as gender, age, etc.

o Now the system has collected all the information, so it will find the solution for the problem by
applying if-then rules using the inference engine and using the facts stored within the KB.

o In the end, it will provide a response to the patient by using the user interface.

Participants in the development of Expert System

There are three primary participants in the building of Expert System:

1. Expert: The success of an ES depends greatly on the knowledge provided by human experts. These
experts are persons specialized in the specific domain.
2. Knowledge Engineer: Knowledge engineer is the person who gathers the knowledge from the
domain experts and then codifies that knowledge to the system according to the formalism.

3. End-User: A particular person or group of people, not necessarily experts, who use the expert
system to obtain solutions or advice for their complex queries.
