Data Mining Techniques & Uses

This document discusses data mining and provides an overview of key concepts and techniques. It defines data mining as the process of analyzing large databases to find useful patterns. It describes common techniques such as classification, clustering, regression, and association rule mining. It outlines examples of how these techniques are used and the advantages of data mining in discovering new knowledge from existing data to improve products, services, and profits. However, it also notes privacy concerns can arise when linking multiple data sources to gain a wide spectrum of customer data.

Uploaded by

NikitaSomaiya

Data Mining

Nikita K Somaiya
MIM-14-08
Business Intelligence and Analysis
IES MCRC

Overview

Introduction
Explanation of Data Mining
Techniques
Advantages
Applications
Privacy

Data Mining

What is Data Mining?

The process of semi-automatically analyzing large databases to find useful patterns (Silberschatz)
KDD: Knowledge Discovery in Databases (3)
Attempts to discover rules and patterns from data
Discover rules → Make predictions

Areas of Use

Internet: Discover needs of customers
Economics: Predict stock prices
Science: Predict environmental change
Medicine: Match patients with similar problems → cure

Example of Data Mining

A credit card company wants to discover information about its clients from its databases. It wants to find:

Clients who respond to promotions in junk mail
Clients who are likely to switch to a competitor
Clients who are likely not to pay
Services that clients use, in order to promote services affiliated with the credit card company
Anything else that may help the company provide or promote services to help its clients and ultimately make more money.

Data Mining & Data Warehousing

Data Warehouse: a repository (or archive) of information gathered from multiple sources, stored under a unified schema, at a single site. (Silberschatz)

Collect data → Store in single repository

Allows for easier query development, as a single repository can be queried.

Data Mining:

Analyzing databases or data warehouses to discover patterns in the data and gain knowledge.
Knowledge is power.

Discovery of Knowledge

Data Mining Techniques

Classification
Clustering
Regression
Association Rules

Classification

Classification: Given a set of items that belong to several classes, and given past instances (training instances) with their associated classes, classification is the process of predicting the class of a new item.
The goal is therefore to classify the new item, that is, to identify the class to which it belongs.
Example: A bank wants to classify its home loan customers into groups according to their response to bank advertisements. The bank might use the classes Responds Rarely, Responds Sometimes, and Responds Frequently.
The bank will then attempt to find rules about the customers who respond Frequently and Sometimes.
The rules could be used to predict the needs of potential customers.

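Technique for Classification

Decision-Tree Classifiers
[Figure: decision tree predicting the credit risk (Good or Bad) of a person. The root splits on Job (Engineer, Carpenter, Doctor), and each job branch then splits on Income. The threshold pairs shown are <30K / >50K, <40K / >90K, and <50K / >100K, with the lower-income leaf labeled Bad and the higher-income leaf labeled Good.]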
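A decision tree of this kind can be read as nested if/else rules: split on job first, then on income. The sketch below is illustrative only. The pairing of income thresholds with jobs (Engineer <30K / >50K, Carpenter <40K / >90K, Doctor <50K / >100K) is an assumption, since the figure's layout is not fully recoverable, and the "Unknown" result for incomes that fall between the two thresholds is also an assumption.

```python
# Hypothetical thresholds in thousands: job -> (bad_below, good_above).
# The job-to-threshold pairing is assumed, not confirmed by the slide.
THRESHOLDS = {
    "Engineer": (30, 50),
    "Carpenter": (40, 90),
    "Doctor": (50, 100),
}

def credit_risk(job, income_k):
    """Walk the tree: split on job first, then on income (in thousands)."""
    if job not in THRESHOLDS:
        return "Unknown"          # job not covered by the tree
    bad_below, good_above = THRESHOLDS[job]
    if income_k < bad_below:
        return "Bad"
    if income_k > good_above:
        return "Good"
    return "Unknown"              # no leaf is given for this income range

print(credit_risk("Engineer", 25))
print(credit_risk("Doctor", 120))
```

In practice the splits would be learned from training instances rather than written by hand; the hand-coded table only shows how a finished tree is applied to a new item.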

Clustering

Clustering algorithms find groups of items that are similar. They divide a data set so that records with similar content are in the same group, and groups are as different as possible from each other. (2)

Example: An insurance company could use clustering to group clients by their age, location and types of insurance purchased.

The categories are not specified in advance; this is referred to as unsupervised learning.

Clustering

Group Data into Clusters

Similar data is grouped in the same cluster

Dissimilar data is grouped in different clusters

How is this achieved?

K-Nearest Neighbor

A classification method that classifies a point by calculating the distances between the point and the points in the training data set. It then assigns the point to the class that is most common among its k nearest neighbors (where k is a positive integer). (2)
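The distance-then-vote procedure described above can be sketched in a few lines. This is a minimal illustration, assuming two-dimensional points, Euclidean distance, and made-up training data; the function name `knn_classify` is hypothetical.

```python
from collections import Counter
import math

def knn_classify(train, point, k=3):
    """train: list of ((x, y), label) pairs; point: (x, y) tuple."""
    # Sort training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], point))
    # Majority vote among the labels of the k closest points.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Made-up training data: class "A" near (1, 1), class "B" near (8, 8).
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((8, 8), "B"), ((8, 9), "B"), ((9, 8), "B")]
print(knn_classify(train, (2, 2)))
```

Note that the class labels must already be known for the training points, which is what makes this a supervised method.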

Hierarchical

Group data into trees
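One common way to build such a tree is bottom-up (agglomerative) clustering: start with every item in its own cluster and repeatedly merge the two closest clusters. The sketch below is an assumption about how this could look, not the method from the slides. It uses one-dimensional points and single linkage (distance between the closest members) to stay short, and the name `single_linkage` is hypothetical.

```python
def single_linkage(points, num_clusters):
    """Merge the two closest clusters until num_clusters remain."""
    clusters = [[p] for p in points]          # start: one cluster per point
    merges = []                               # ordered merges form the tree
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merges.append((clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges

clusters, merges = single_linkage([1.0, 1.5, 2.0, 10.0, 10.5, 30.0], 3)
print(clusters)
```

The `merges` list records which clusters were joined at each step; read from last to first, it describes the tree from the root down.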

Regression

Regression deals with the prediction of a value, rather than a class. (1, p747)
Example: Find out if there is a relationship between smoking and cancer-related illness in patients.
Given values: X1, X2, ..., Xn
Objective: predict variable Y
One way is to estimate coefficients a0, a1, a2, ..., an such that

Y = a0 + a1X1 + a2X2 + ... + anXn

Linear Regression
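For the simplest case of a single predictor, Y = a0 + a1X1, the coefficients can be estimated by ordinary least squares. The sketch below assumes that one-variable case and uses made-up data; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = a0 + a1*X (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a0 = mean_y - a1 * mean_x
    return a0, a1

# Made-up data lying exactly on Y = 1 + 2X.
a0, a1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a0, a1)
```

With several predictors X1, ..., Xn the same idea applies, but the coefficients are usually found by solving a linear system rather than with these closed-form sums.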

Regression

Example graph:

Line of Best Fit

Curve Fitting

Association Rules

An association algorithm creates rules that describe how often events have occurred together. (2)

Example: When a customer buys a hammer, then 90% of the time they will also buy nails.

Association Rules

Support: a measure of what fraction of the population satisfies both the antecedent and the consequent of the rule. (1, p748)
Example:

People who buy hotdog buns also buy hotdog sausages in 99% of cases. = High support
People who buy hotdog buns buy hangers in 0.005% of cases. = Low support

Situations where there is high support for the antecedent are worth careful attention.

E.g. Hotdog sausages should be placed near hotdog buns in supermarkets if there is also high confidence.

Association Rules

Confidence: a measure of how often the consequent is true when the antecedent is true. (1, p748)
Example:

90% of hotdog bun purchases are accompanied by hotdog sausages.
High confidence is meaningful, as we can derive rules.

Hotdog bun → Hotdog sausage

Two rules may have different confidence levels and have the same support.
E.g. Hotdog sausage → Hotdog bun may have a much lower confidence than Hotdog bun → Hotdog sausage, yet they both can have the same support.
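Support and confidence can both be computed by counting baskets. The sketch below uses a small set of made-up shopping baskets (an assumption for illustration) and shows two rules with the same support but different confidence. Support is measured over all baskets, while confidence is measured only over baskets that contain the antecedent.

```python
# Made-up shopping baskets, for illustration only.
baskets = [
    {"bun", "sausage"},
    {"bun", "sausage", "mustard"},
    {"sausage"},
    {"sausage", "mustard"},
    {"hanger"},
]

def support(antecedent, consequent):
    """Fraction of all baskets containing both sides of the rule."""
    both = antecedent | consequent
    return sum(both <= b for b in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Among baskets containing the antecedent, fraction with the consequent."""
    with_antecedent = [b for b in baskets if antecedent <= b]
    return sum(consequent <= b for b in with_antecedent) / len(with_antecedent)

# bun -> sausage and sausage -> bun share the same support ...
print(support({"bun"}, {"sausage"}), support({"sausage"}, {"bun"}))
# ... but differ in confidence.
print(confidence({"bun"}, {"sausage"}), confidence({"sausage"}, {"bun"}))
```

Here every basket with buns also has sausages, but only half of the baskets with sausages have buns, so the two directions of the rule differ in confidence even though the set of baskets containing both items is the same.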

Advantages of Data Mining

Provides new knowledge from existing data

Public databases
Government sources
Company Databases

Old data can be used to develop new knowledge

New knowledge can be used to improve services or products

Improvements lead to:

Bigger profits
More efficient service

Uses of Data Mining

Sales/Marketing

Diversify target market
Identify clients' needs to increase response rates

Risk Assessment

Identify customers that pose high credit risk

Fraud Detection

Identify people misusing the system, e.g. people who have two Social Security Numbers

Customer Care

Identify customers likely to change providers
Identify customer needs

Privacy Concerns

Effective data mining requires large sources of data

To achieve a wide spectrum of data, multiple data sources are linked
Linking sources can be problematic for privacy. For example, if the following histories of a customer were linked:

Shopping History
Credit History
Bank History
Employment History

the user's life story could be painted from the collected data.
