Introduction - Final

The document outlines the objectives and outcomes of a Machine Learning course, covering basic concepts, supervised and unsupervised algorithms, ensemble techniques, and dimensionality reduction. It discusses the applications of machine learning in various fields, the types of learning, and the challenges faced in the domain, such as data labeling and the shortage of experts. Additionally, it provides insights into the steps involved in developing machine learning applications, including data collection and preprocessing techniques.

Uploaded by

neha.surti


Objectives

• To introduce the basic concepts and techniques of Machine Learning
• To introduce various supervised and unsupervised algorithms
• To introduce various ensemble techniques for combining ML models
• To introduce the concept of dimensionality reduction and its techniques

August 12, 2025 1

Outcomes

• Identify a Machine Learning technique for the given problem and understand the concepts of Training Error, Generalization Error, Overfitting and Underfitting
• Apply Regression and Decision Tree techniques to the given data and examine the performance of the model
• Compare and contrast Ensemble approaches for combining multiple Machine Learning techniques
• Determine which Support Vector Machine variant can be applied to the given data
• Apply Unsupervised Learning techniques to the given data to gain insights from unlabeled data
• Use Dimensionality Reduction techniques to deal with data that has a large number of attributes

Syllabus

[Slides 3 to 8: syllabus content not recoverable from the extracted text]
What is machine learning

• Machine learning is an area of computer science that involves teaching computers to do what comes naturally to people: learning through experience.
• A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P) if its performance at tasks in T, as measured by P, improves with experience E.

Features

• Class of task
• Performance measure
• Source of experience
• Example: robot navigation in a maze

Common definitions

• Machine learning is used to parse data, learn from it and then make a determination or prediction about something in the world.
• Machine learning lies at the intersection of computer science, engineering and statistics, and often appears in other disciplines.

Components of ML study

[Figure: Computer Science, Statistics, Engineering]
Examples of machine learning

• Facebook continuously tracks the friends in your list, the profiles you often visit, your interests, your workplace, the groups you are in and so on. Based on the information retrieved, Facebook gives you friend suggestions.
• Suppose you purchase an item from Amazon. If you buy a mobile phone online, the site where you bought it immediately recommends a cover for that phone.

How does an ML algorithm work?

• Machine learning uses algorithms to find patterns in data.
• It then uses a model that recognizes those patterns to make predictions on new data.

[Figure: Data → Training algorithm (finds the patterns) → Model (recognizes the patterns); New data → Model → Predictions]
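
The train-then-predict flow above can be sketched in a few lines. This is only an illustration, not part of the original slides: the least-squares line fit stands in for "the training algorithm", and the function names `train` and `predict` and the data are invented for the example.

```python
# Minimal sketch: "training" finds a pattern, the "model" reuses it on new data.
# The data and the function names are assumptions made for illustration.

def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the learned "model"

def predict(model, x_new):
    """Apply the learned pattern to a new input."""
    slope, intercept = model
    return slope * x_new + intercept

training_x = [1, 2, 3, 4]
training_y = [3, 5, 7, 9]          # hidden pattern: y = 2x + 1
model = train(training_x, training_y)
prediction = predict(model, 10)    # prediction on new data
print(prediction)
```

Here the "pattern" is just a slope and an intercept; real models are usually far richer, but the two phases stay the same.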
Machine learning can be implemented in

• the healthcare sector
• pharmaceutical companies

Where is machine learning used

• Marketing and sales
• Search engines
• Transportation: Based on travel history and the pattern of travel across various routes, machine learning can help transportation companies predict potential problems that could arise on certain routes and advise their customers to opt for a different route.
Types of machine learning

• Supervised learning: Suppose you have a fruit basket and your task is to arrange the fruit by type.
• You can group the fruits based on any physical characteristic, for example color and size.
• Rule 1: If the color of the fruit is red and its size is small, then the fruit is a cherry.
• Rule 2: If the color of the fruit is red and its size is big, then the fruit is an apple.
• Rule 3: If the color of the fruit is green and its size is small, then the fruit is a grape.
• Rule 4: If the color of the fruit is green and its size is big, then the fruit is a watermelon.
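
The fruit rules above can be written directly as a lookup. This is a hand-coded sketch for illustration only; in supervised learning such rules would be learned from labelled examples rather than typed in, and the function name `classify_fruit` is an assumption.

```python
# Hand-written version of the fruit rules; a learning algorithm would
# derive an equivalent mapping from labelled (color, size, fruit) examples.
RULES = {
    ("red", "small"): "cherry",
    ("red", "big"): "apple",
    ("green", "small"): "grape",
    ("green", "big"): "watermelon",
}

def classify_fruit(color, size):
    """Return the fruit type for a (color, size) pair, or 'unknown'."""
    return RULES.get((color.lower(), size.lower()), "unknown")

print(classify_fruit("Red", "small"))
print(classify_fruit("green", "Big"))
```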
Decision Tree Induction

Reinforcement Learning

• This learning is similar to supervised learning.
• In supervised learning, the correct target output values are known for each input pattern.
• In some cases, however, less information is available.
• For example, the network might be told only that its actual output is 50% correct.
• Thus only critic information is available, not the exact information.
• Learning based on this critic information is called reinforcement learning, and the feedback sent is called the reinforcement signal.
• In this framing, reinforcement learning is treated as a form of supervised learning because the network receives some feedback from its environment. The feedback is only evaluative, however, not the exact target output.
• The reinforcement signals are processed in the critic signal generator, and the resulting critic signals are sent to the network to adjust its weights.
• Reinforcement learning is also called learning with a critic, as opposed to learning with a teacher, which refers to supervised learning.
• Reinforcement learning works much like learning by yourself without any guidance, essentially through trial and error.
• When you get something right, you get a reward, you feel happy and you move ahead.
• When you get something wrong, you get a penalty. You take a step back and try to avoid the incorrect path while exploring another path.
• Example: Robots equipped with sensors learn about their surrounding environment.
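
The reward-and-penalty idea above can be sketched as a tiny trial-and-error loop. Everything here is an assumption made for illustration: the two actions, their rewards, the epsilon-greedy choice and the update rule are one common textbook setup, not something taken from the slides.

```python
import random

# Hypothetical environment: one action is rewarded, the other penalized.
REWARDS = {"left": -1.0, "right": +1.0}

def run_agent(steps=50, epsilon=0.2, alpha=0.5, seed=0):
    """Learn action values from reward/penalty feedback only."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}   # current estimates
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(REWARDS))     # explore: try something
        else:
            action = max(values, key=values.get)   # exploit: best so far
        reward = REWARDS[action]                   # the critic's feedback
        # Move the estimate a step toward the observed reward.
        values[action] += alpha * (reward - values[action])
    return values

values = run_agent()
best_action = max(values, key=values.get)
print(best_action)
```

The agent is never told which action is correct; it only sees a reward or a penalty after acting, which is the difference from learning with a teacher.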

Unsupervised Learning

• K-Means clustering

Issues in machine learning

• Data labelling: Today there is a large amount of data that is unlabelled and raw, whereas supervised machine learning works on labelled data.
• Without adequate data labels in the training dataset, it is not feasible to build a robust machine learning model.
• Companies are putting in thousands of man-hours to label data so that it can be used for machine learning.
• This is an active area of research, in which labels can be attached to the data as it is used.

Shortage of experts

• Machine learning is an emerging field and there are not many experts around the world. You require experts who can
• 1) Understand the wide variety of data
• 2) Model the data correctly so as to meet the desired objectives
• 3) Build and manage the software and hardware tools and techniques required for machine learning

Obtaining massive training datasets

• It is difficult to obtain massive training datasets for various areas of machine learning.
• You may lack historical data, and the quality of the data in the training dataset also matters.
• If the dataset obtained does not represent a fair sample size, the resulting machine learning model could be erroneous.
• For example, if you are trying to build a machine learning model to predict a particular type of cancer from a given set of symptoms, lifestyle and blood-related parameters, you may require quality data for thousands of patients who have had that type of cancer, with details of their symptoms, lifestyle and blood-related parameters.

Hard to explain problems and results

• Complex machine learning models, often built by experts, may not be self-explanatory when used by common people in the field.
• For example, if you tell a healthy person that she has an 80% chance of getting a particular disease, she may require additional details behind that statement.

Limited possibilities to reuse the model

• It is difficult to reuse an existing machine learning model for other use cases. Companies have to invest time and resources to build a new model for each new use case.

Steps in developing a machine learning application

• Collect data: Some of the popular, publicly available dataset resources are as follows
• 1) Kaggle datasets
• 2) Amazon Web Services
• 3) Machine learning repository
• 4) Google TensorFlow
• 5) Microsoft
• 6) OpenML
Prepare the input data

• Once you have the data, you need to ensure that it is in the right format so that it can be processed by the chosen algorithm and computer programs.

Data Preprocessing

• Data processing: processing that involves the transformation of raw data into useful information.
• Why is pre-processing required?

1. Real-world data are generally
• incomplete (for example, missing attribute values)
• noisy (for example, containing errors or outliers)

2. Tasks in Data Preprocessing

• Data cleaning: fill in missing values, smooth noisy data, identify or remove outliers
• Data integration: integrating data from multiple sources
• Data transformation: normalization
• Data reduction: obtains a representation that is reduced in volume but produces the same or similar analytical results


Data Cleaning

• Data cleaning tasks
• Fill in missing values
• Identify outliers and smooth out noisy data

How to Handle Missing Data?

• Ignore the tuple: usually done when the class label is missing (assuming the task is classification)
• Fill in the missing value manually: tedious and often infeasible
• Use a global constant to fill in the missing value, e.g. "unknown"
• Use the attribute mean or median to fill in the missing value
• Use the most probable value to fill in the missing value, found with techniques like regression, Bayesian classification, decision trees or clustering algorithms
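
Filling in a missing value with the attribute mean or median, as listed above, might look like the following sketch. The `fill_missing` helper and the humidity readings are invented for illustration, with `None` standing for a missing value.

```python
from statistics import mean, median

def fill_missing(values, strategy="mean"):
    """Replace None entries with the mean or median of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known) if strategy == "mean" else median(known)
    return [fill if v is None else v for v in values]

# Invented attribute column with two missing readings.
humidity = [85, 90, None, 70, 95, None]

filled_mean = fill_missing(humidity, "mean")
filled_median = fill_missing(humidity, "median")
print(filled_mean)
print(filled_median)
```

The median is usually the safer choice when the attribute contains outliers, since a single extreme value can pull the mean a long way.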

Example: Weather data

Outlook    Temperature  Humidity  Windy  Class
sunny      hot          high      false  N
sunny      hot          high      true   N
overcast   hot          high      false  P
rain       mild         high      false  P
rain       cool         normal    false  P
rain       cool         normal    true   N
overcast   cool         normal    true   P
sunny      mild         high      false  N
sunny      cool         normal    false  P
rain       mild         normal    false  P
sunny      mild         normal    true   P
overcast   mild         high      true   P
overcast   hot          normal    false  P
rain       mild         high      true   N

How to Handle Noisy Data?

• Binning method: first sort the data and partition it into (equi-depth) bins, then smooth by bin means, bin medians, bin boundaries, etc.
• Clustering: detect and remove outliers
• Regression

Binning

• Consider sorted data, for example prices in INR
• 4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34
• Number of bins N = 3
• Bin 1: 4, 8, 9, 15
• Bin 2: 21, 21, 24, 25
• Bin 3: 26, 28, 29, 34

Smoothing by bin means

• Replace each value in a bin with the bin's mean value (rounded here to the nearest integer)
• Bin 1: 9, 9, 9, 9
• Bin 2: 23, 23, 23, 23
• Bin 3: 29, 29, 29, 29

Smoothing by bin median

• Bin 1: 8.5, 8.5, 8.5, 8.5
• Bin 2: 22.5, 22.5, 22.5, 22.5
• Bin 3: 28.5, 28.5, 28.5, 28.5

Smoothing by bin boundaries

• Replace each value in a bin with the closer of the bin's minimum and maximum
• Bin 1: 4, 4, 4, 15
• Bin 2: 21, 21, 25, 25
• Bin 3: 26, 26, 26, 34
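
The three smoothing variants can be reproduced with a short sketch. The helper names are assumptions, the bin means are rounded to the nearest integer to match the slide values, and the split assumes the number of values divides evenly by the number of bins.

```python
from statistics import mean, median

def equal_depth_bins(sorted_values, n_bins):
    """Split sorted data into n_bins bins of equal size (assumes an even split)."""
    depth = len(sorted_values) // n_bins
    return [sorted_values[i * depth:(i + 1) * depth] for i in range(n_bins)]

def smooth_by_means(bins):
    # Rounded to the nearest integer, as in the slide example.
    return [[round(mean(b))] * len(b) for b in bins]

def smooth_by_medians(bins):
    return [[median(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    """Replace each value by the closer bin boundary (ties go to the lower one)."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed

prices = [4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34]
bins = equal_depth_bins(prices, 3)
print(smooth_by_means(bins))
print(smooth_by_medians(bins))
print(smooth_by_boundaries(bins))
```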

Data Integration

• Karl Pearson's correlation coefficient
• Covariance
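
The slides for this section did not survive extraction, so the following is only a sketch of how the two measures named above are commonly computed when checking whether two attributes from different sources are redundant. The population form (dividing by n) and the sample columns are assumptions.

```python
from math import sqrt

def covariance(xs, ys):
    """Population covariance of two equally long attribute columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Pearson correlation coefficient: covariance scaled to the range [-1, 1]."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Invented columns: b is exactly twice a, so the two are perfectly correlated.
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
print(covariance(a, b))
print(correlation(a, b))
```

A correlation close to +1 or -1 suggests that one of the two attributes carries little extra information and may be dropped during integration.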
Data Reduction

• Dimension reduction technique

Example of Decision Tree Induction

Initial attribute set: {A1, A2, A3, A4, A5, A6}

A4?
  A1?
    Class 1
    Class 2
  A6?
    Class 1
    Class 2

=> Reduced attribute set: {A1, A4, A6}

Numerosity Reduction

Numerosity reduction refers to reducing the volume of data by choosing smaller forms of data representation.

1. Histograms
• A popular data reduction technique
• Divide the data into buckets
• The range of a bucket is called its width

[Figure: histogram with x-axis from 10000 to 90000 and y-axis from 0 to 40]
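
Dividing data into buckets of a fixed width can be sketched as follows. The `bucket_counts` helper, the width and the price values are invented for illustration; each bucket is labelled by its lower bound.

```python
def bucket_counts(values, width):
    """Count how many values fall into each equal-width bucket."""
    counts = {}
    for v in values:
        start = (v // width) * width          # lower bound of v's bucket
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

# Invented prices; each bucket covers a range of 20000.
prices = [12000, 18000, 31000, 35000, 39000, 52000, 88000]
histogram = bucket_counts(prices, 20000)
print(histogram)
```

Storing only the bucket counts instead of every value is what makes the histogram a reduced representation of the data.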
Histogram

• D = [1,2,3,4,2,2,3,3,3,3,1,1,1,1,1,4,4,5,5,5,6,6,6,7,7,7,1]
• Frequency of each value:
• 1: 7 times
• 2: 3 times
• 3: 5 times
• 4: 3 times
• 5: 3 times
• 6: 3 times
• 7: 3 times
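
The frequency counts for D can be checked with a few lines. This is a sketch that uses the standard library's `Counter`; it is one convenient way to do the tally, not the method used on the slide.

```python
from collections import Counter

D = [1, 2, 3, 4, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 1,
     4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 1]

counts = Counter(D)                  # value -> number of occurrences
for value in sorted(counts):
    print(value, counts[value])
```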
Data Transformation

• Normalization

Z-score: $v' = \dfrac{v - \mathrm{mean}_A}{\mathrm{stand\_dev}_A}$

• Sample data: [10, 20, 30]
• Mean = 20
• Variance = ((10 - 20)^2 + (20 - 20)^2 + (30 - 20)^2) / 3 = 66.67
• Std dev = square root of variance = 8.16
• v' = -1.22, 0, 1.22
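
The z-score calculation for the sample data can be reproduced with a short sketch. It assumes the population standard deviation (dividing by the number of values), which is what yields 8.16 for this sample; the function name is invented.

```python
from math import sqrt

def z_scores(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean_a = sum(values) / n
    variance = sum((v - mean_a) ** 2 for v in values) / n   # divide by n
    std_a = sqrt(variance)
    return [(v - mean_a) / std_a for v in values]

normalized = [round(z, 2) for z in z_scores([10, 20, 30])]
print(normalized)
```

Dividing by n - 1 instead (the sample standard deviation) would give 10 for this data and z-scores of -1, 0 and 1.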
Analyze the input data

• You need to ensure that the examples are complete (there are no missing values).

Train the algorithm
Test the algorithm

August 12, 2025 Data Mining: Concepts and Techniques 63

Use the algorithm

• You have spent a lot of time collecting and cleaning the data and then building and testing the model.

Periodic revisit

• You should periodically review the results that the model is producing and evaluate whether there are opportunities to improve it in light of new data. You may make minor adjustments to the model or retrain it with the latest data to fine-tune it.
