10.Classification2022

Uploaded by

23310020

Classification and Prediction

Classification and prediction are two forms of data analysis that

- extract models describing important data classes, or predict future data trends

⮚ Such analysis can help us better understand the data at large

⮚ Classification predicts categorical (discrete, unordered) labels

⮚ Prediction models continuous-valued functions

⮚ Classification and prediction have numerous applications, including fraud detection, target marketing, performance prediction, manufacturing, and medical diagnosis

Classification and Prediction

Classification: the task of assigning objects to one of several predefined categories

E.g. spam mails, MRI scans, galaxies

Input: attribute set (x) → Classification model → Output: class label (y)

Classification is the task of mapping an input attribute set x into its class label y.

Input data for classification: a collection of records (instances)

Each record is characterized by a tuple (x, y)
x is the attribute set
y is a special attribute: the class label (also called the category or target attribute)
Sample Data Set - Classifying vertebrates

⮚ Attribute sets are mostly discrete, but can also contain continuous features
The class label, on the other hand, must be a discrete attribute

⮚ This key characteristic distinguishes classification from regression, a predictive modeling task in which y is a continuous attribute

Name           Body temp     Skin cover  Gives birth  Aquatic creature  Aerial creature  Has legs  Hibernates  Class
human          warm-blooded  hair        yes          no                no               yes       no          mammal
python         cold-blooded  scales      no           no                no               no        yes         reptile
salmon         cold-blooded  scales      no           yes               no               no        no          fish
whale          warm-blooded  hair        yes          yes               no               no        no          mammal
frog           cold-blooded  none        no           semi              no               yes       yes         amphibian
komodo dragon  cold-blooded  scales      no           no                no               yes       no          reptile
bat            warm-blooded  hair        yes          no                yes              yes       yes         mammal
pigeon         warm-blooded  feather     no           no                yes              yes       no          bird
cat            warm-blooded  fur         yes          no                no               yes       no          mammal
leopard shark  cold-blooded  scales      yes          yes               no               no        no          fish
turtle         cold-blooded  scales      no           semi              no               yes       no          reptile
penguin        warm-blooded  feather     no           semi              no               yes       no          bird
porcupine      warm-blooded  quills      yes          no                no               yes       yes         mammal
eel            cold-blooded  scales      no           yes               no               no        no          fish
salamander     cold-blooded  none        no           semi              no               yes       yes         amphibian
Classification vs. Prediction

⮚ Assumption
After data preparation, each record in the data set has attributes X1,…,Xn and Y

⮚ Goal
Learn a function f:(X1,…,Xn) → Y,
then use f to predict y for a given input record (x1,…,xn)

⮚ Classification: Y is a discrete attribute, called the class label

Usually a categorical attribute with a small domain

⮚ Prediction: Y is a continuous attribute

Both are called supervised learning, because the true labels (Y values) are known for the initially provided data
What is classification?

⮚ Classification: the task of learning a target function f that maps each attribute set x to one of the predefined class labels y

⮚ The target function is informally called a classification model

A classification model is used for

⮚ Descriptive modeling

⮚ Predictive modeling
Descriptive Modeling

⮚ A classification model can serve as an explanatory tool to distinguish between objects of different classes. The descriptive model summarizes the data given below

Vertebrate Data Set

Name           Body temp     Skin cover  Gives birth  Aquatic creature  Aerial creature  Has legs  Hibernates  Class
human          warm-blooded  hair        yes          no                no               yes       no          mammal
python         cold-blooded  scales      no           no                no               no        yes         reptile
salmon         cold-blooded  scales      no           yes               no               no        no          fish
whale          warm-blooded  hair        yes          yes               no               no        no          mammal
frog           cold-blooded  none        no           semi              no               yes       yes         amphibian
komodo dragon  cold-blooded  scales      no           no                no               yes       no          reptile
bat            warm-blooded  hair        yes          no                yes              yes       yes         mammal
pigeon         warm-blooded  feather     no           no                yes              yes       no          bird
cat            warm-blooded  fur         yes          no                no               yes       no          mammal
leopard shark  cold-blooded  scales      yes          yes               no               no        no          fish
turtle         cold-blooded  scales      no           semi              no               yes       no          reptile
penguin        warm-blooded  feather     no           semi              no               yes       no          bird
porcupine      warm-blooded  quills      yes          no                no               yes       yes         mammal
eel            cold-blooded  scales      no           yes               no               no        no          fish
salamander     cold-blooded  none        no           semi              no               yes       yes         amphibian
Predictive Modelling

⮚ A classification model can also be used to predict the class label of unknown records

⮚ It can be treated as a black box that automatically assigns a class label when presented with the attribute set of an unknown record

Consider the following characteristics of a creature, the gila monster:

Name           Body temp     Skin cover  Gives birth  Aquatic creature  Aerial creature  Has legs  Hibernates  Class
gila monster   cold-blooded  scales      no           no                no               yes       yes         ?
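
One way to make the black-box idea concrete is the sketch below. It assumes a 1-nearest-neighbour classifier with a simple mismatch count (Hamming distance) as the model; the slides do not prescribe this technique, it is chosen here only because it is short. The names `TRAINING`, `hamming` and `predict_1nn` are illustrative. The training records are the fifteen rows of the vertebrate data set.

```python
# Attribute order: body temp, skin cover, gives birth, aquatic creature,
# aerial creature, has legs, hibernates. Each record is (attributes, class).
TRAINING = [
    (("warm-blooded", "hair", "yes", "no", "no", "yes", "no"), "mammal"),
    (("cold-blooded", "scales", "no", "no", "no", "no", "yes"), "reptile"),
    (("cold-blooded", "scales", "no", "yes", "no", "no", "no"), "fish"),
    (("warm-blooded", "hair", "yes", "yes", "no", "no", "no"), "mammal"),
    (("cold-blooded", "none", "no", "semi", "no", "yes", "yes"), "amphibian"),
    (("cold-blooded", "scales", "no", "no", "no", "yes", "no"), "reptile"),
    (("warm-blooded", "hair", "yes", "no", "yes", "yes", "yes"), "mammal"),
    (("warm-blooded", "feather", "no", "no", "yes", "yes", "no"), "bird"),
    (("warm-blooded", "fur", "yes", "no", "no", "yes", "no"), "mammal"),
    (("cold-blooded", "scales", "yes", "yes", "no", "no", "no"), "fish"),
    (("cold-blooded", "scales", "no", "semi", "no", "yes", "no"), "reptile"),
    (("warm-blooded", "feather", "no", "semi", "no", "yes", "no"), "bird"),
    (("warm-blooded", "quills", "yes", "no", "no", "yes", "yes"), "mammal"),
    (("cold-blooded", "scales", "no", "yes", "no", "no", "no"), "fish"),
    (("cold-blooded", "none", "no", "semi", "no", "yes", "yes"), "amphibian"),
]


def hamming(a, b):
    """Count the attribute positions at which two records differ."""
    return sum(1 for u, v in zip(a, b) if u != v)


def predict_1nn(x, training=TRAINING):
    """Return the class label of the training record closest to x.

    Ties are broken by training order, because min() keeps the first minimum.
    """
    _, label = min(training, key=lambda record: hamming(record[0], x))
    return label


gila_monster = ("cold-blooded", "scales", "no", "no", "no", "yes", "yes")
print(predict_1nn(gila_monster))  # prints "reptile"
```

The two closest training records each differ from the gila monster in a single attribute, and both are reptiles, so the predicted label does not depend on how the tie is broken.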

How Does Classification Work

Data classification is a two-step process

1. A classifier is built describing a predetermined set of data classes or concepts. This is the learning step, or training phase.

A classification algorithm builds the classifier by analyzing, or learning from, a training set (database tuples and their associated class labels)

The individual tuples making up the training set are called training tuples and are selected from the database under analysis. Because the class label of each training tuple is provided, this step is also known as supervised learning
How Does Classification Work

A tuple X is represented by an n-dimensional attribute vector

X = [x1, x2, x3, …, xn]

Each tuple X is assumed to belong to a predefined class, as determined by another attribute called the class label attribute.

Data tuples are also referred to as samples, examples, instances, data points, or objects.
How does classification work cont…

⮚ This step can be viewed as learning a mapping or function, y = f(x), that can predict the associated class label y of a given tuple x

⮚ The mapping is represented in the form of classification rules, decision trees, or mathematical formulae

⮚ Example: classification rules that identify loan applications as being either safe or risky

⮚ The rules can be used to categorize future data tuples, as well as to provide deeper insight into the database contents
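
A minimal sketch of what such classification rules might look like in code. The slides do not list the actual rules, so the attributes (`age`, `income`), their values, the three IF-THEN rules and the default are all hypothetical and chosen only for illustration.

```python
def classify_loan(applicant):
    """Apply hypothetical IF-THEN rules in order; the first matching rule wins."""
    age = applicant["age"]        # assumed values: "youth", "middle_aged", "senior"
    income = applicant["income"]  # assumed values: "low", "medium", "high"

    # Rule 1: IF age = youth THEN loan_decision = risky
    if age == "youth":
        return "risky"
    # Rule 2: IF income = high THEN loan_decision = safe
    if income == "high":
        return "safe"
    # Rule 3: IF age = middle_aged AND income = low THEN loan_decision = risky
    if age == "middle_aged" and income == "low":
        return "risky"
    # Default when no rule fires (also an assumption)
    return "safe"


new_applicant = {"age": "middle_aged", "income": "low"}
print(classify_loan(new_applicant))  # prints "risky"
```

Because the rules are readable, the same model both categorizes new applications and shows which attribute combinations drive the decision.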
Examples - Classification/ Prediction

⮚ Build a classification model to categorize bank loan applications as either safe or risky

⮚ A bank analyzes its loan data in order to learn which loan applicants are safe and which are risky

⮚ Build a prediction model to predict the expenditure in dollars of potential customers on computer equipment, given their income and occupation
What is Classification
Consider two examples:

A loan officer analyzes the data to learn which loan applicants are “safe” and which are “risky” for the bank.

A medical researcher analyzes breast cancer data to predict which one of three specific treatments a patient should receive.

In both cases the data analysis task is classification, where a model or classifier is constructed to predict categorical labels, such as

“safe” or “risky” for the loan application data

“treatment A”, “treatment B”, or “treatment C” for the medical data.

These categories can be represented by discrete values, where the ordering among values has no meaning.

E.g. the values 1, 2, and 3 may be used to represent treatments A, B, and C, where no ordering is implied among this group of treatments
What is Prediction

Suppose a marketing manager wants to predict how much a given customer will spend during a sale. This data analysis task is numeric prediction.

The model constructed predicts a continuous-valued function, or ordered value, as opposed to a categorical label. This model is called a predictor.

Hence classification and numeric prediction are the two major types of prediction problems

Numeric prediction is often simply called prediction.
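
As an illustration of a predictor, the sketch below fits a straight line by ordinary least squares. The slides do not name a method, so linear regression is an assumption here, and the income and spending figures are made up for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: income in thousands of dollars, spend in dollars.
incomes = [30.0, 40.0, 50.0, 60.0]
spend = [150.0, 200.0, 250.0, 300.0]

slope, intercept = fit_line(incomes, spend)


def predict_spend(income):
    """Predict a continuous value (dollars spent) for a given income."""
    return slope * income + intercept


print(predict_spend(45.0))  # about 225.0 for this toy data
```

Unlike a classifier, the output is a number on an ordered scale, so a prediction of 225 is closer to a true value of 230 than to one of 300.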

Data Classification Process

Learning: training data are analyzed by a classification algorithm

Here the class label attribute is loan decision, and the learned model or classifier is represented in the form of classification rules

Induction: Model Construction

Classification Accuracy

2. In the second step, the model is used for classification

⮚ First, the predictive accuracy of the classifier is estimated

⮚ If the training set were used to measure accuracy, the estimate would likely be optimistic, because the classifier tends to overfit the training data

⮚ Therefore, a test set is used, made up of test tuples and their associated class labels

⮚ These tuples are randomly selected from the general data set and are independent of the training tuples, meaning that they are not used to construct the classifier

⮚ The accuracy of a classifier on a given test set is the percentage of test tuples that are correctly classified by the classifier
Classification
⮚Test data are used to estimate the accuracy of the classification rules

⮚If accuracy is acceptable, rules can be applied to the classification of new data

Deduction: Using the Model

⮚ The associated class label of each test tuple is compared with the learned classifier's class prediction for that tuple

⮚ If the accuracy of the classifier is considered acceptable, the classifier can be used to classify future data tuples for which the class label is not known

⮚ For example, the classification rules obtained in the first step are used to approve or reject new or future loan applications
How is prediction different from classification
⮚ Data prediction is a two-step process, similar to data classification

⮚ For prediction, the term class label attribute is not used, because the attribute for which values are being predicted is continuous-valued (ordered) rather than categorical (unordered)

⮚ The attribute is simply referred to as the predicted attribute

⮚ Prediction can also be viewed as a mapping or function, y = f(X), where X is the input and the output y is a continuous or ordered value

⮚ Prediction and classification also differ in the methods that are used to build their respective models
How is prediction different from classification cont…

⮚ As with classification, the training set used to build a predictor should not be used to assess its accuracy

⮚ An independent test set should be used instead

⮚ The accuracy of a predictor is estimated by computing an error based on the difference between the predicted value and the actual known value of y for each of the test tuples, X
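
The slides do not say which error measure is meant. Two common choices, shown here as an assumption, are the mean absolute error and the root mean squared error, both computed from the per-tuple differences between predicted and actual values. The test values below are made up.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |predicted - actual| over the test tuples."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference; penalizes large errors more."""
    mse = sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical test set: actual and predicted spending in dollars.
actual = [200.0, 150.0, 300.0, 250.0]
predicted = [210.0, 140.0, 280.0, 250.0]

print(mean_absolute_error(actual, predicted))      # prints 10.0
print(root_mean_squared_error(actual, predicted))  # sqrt(150), about 12.25
```

Both functions assume the two lists are non-empty and of equal length, and both should be evaluated on an independent test set rather than on the training tuples.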
