Data Mining:
Association Rules Techniques
December 9, 2021 1
What Is Association Mining?
Association rule mining:
Finding frequent patterns, associations, correlations among
sets of items or objects in transaction databases, relational
databases, and other information repositories.
Applications:
Basket data analysis, cross-marketing, catalog design, loss-
leader analysis, clustering, classification, etc.
Examples.
Rule form: “Body → Head [support, confidence]”.
buys(x, “diapers”) → buys(x, “beers”) [0.5%, 60%]
major(x, “CS”) ^ takes(x, “DB”) → grade(x, “A”) [1%, 75%]
Association Rule: Basic Concepts
Given: (1) database of transactions, (2) each transaction is a
list of items (purchased by a customer in a visit)
Find: all rules that correlate the presence of one set of items
with that of another set of items
E.g., 98% of people who purchase tires and auto
accessories also get automotive services done
Rule Measures: Support and Confidence
(Venn diagram: customers who buy beer, customers who buy diapers, and the overlap of customers who buy both.)
Find all the rules X & Y → Z with minimum support and confidence
support, s: probability that a transaction contains {X, Y, Z}
confidence, c: conditional probability that a transaction having {X, Y} also contains Z

Transaction ID  Items Bought
2000            A,B,C
1000            A,C
4000            A,D
5000            B,E,F

Let minimum support = 50% and minimum confidence = 50%; then we have
A → C (50%, 66.6%)
C → A (50%, 100%)
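These definitions can be checked directly against the slide's four-transaction database. Below is a minimal Python sketch (the helper names `support` and `confidence` are my own, not from the slides):

```python
transactions = [
    {"A", "B", "C"},  # TID 2000
    {"A", "C"},       # TID 1000
    {"A", "D"},       # TID 4000
    {"B", "E", "F"},  # TID 5000
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Conditional probability that a transaction with the antecedent also contains the consequent."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)

print(support({"A", "C"}, transactions))       # 0.5 -> rule A → C has 50% support
print(confidence({"A"}, {"C"}, transactions))  # 0.666... -> 66.6% confidence
print(confidence({"C"}, {"A"}, transactions))  # 1.0 -> 100% confidence
```

The printed values match the two rules on the slide: A → C (50%, 66.6%) and C → A (50%, 100%).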
Association Rule Mining: A Road Map
Boolean vs. quantitative associations (Based on the
types of values handled)
buys(x, “SQLServer”) ^ buys(x, “DMBook”) → buys(x, “DBMiner”) [0.2%, 60%]
age(x, “30..39”) ^ income(x, “42..48K”) → buys(x, “PC”) [1%, 75%]
Single-dimensional vs. multi-dimensional associations
(see the examples above)
Single level vs. multiple-level analysis
What brands of beers are associated with what
brands of diapers?
Mining Association Rules—An Example
Min. support 50%; Min. confidence 50%

Transaction ID  Items Bought
2000            A,B,C
1000            A,C
4000            A,D
5000            B,E,F

Frequent Itemset  Support
{A}               75%
{B}               50%
{C}               50%
{A,C}             50%

For rule A → C:
support = support({A, C}) = 50%
confidence = support({A, C}) / support({A}) = 66.6%
The Apriori principle:
Any subset of a frequent itemset must be frequent
Mining Frequent Itemsets: the Key Step
Find the frequent itemsets: the sets of items that have
minimum support
A subset of a frequent itemset must also be a frequent itemset;
i.e., if {A, B} is a frequent itemset, both {A} and {B} must also be frequent itemsets
Iteratively find frequent itemsets with cardinality from 1 to k
(k-itemset)
Use the frequent itemsets to generate association rules.
The Apriori Algorithm
Join Step: Ck is generated by joining Lk-1 with itself
Prune Step: Any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset
Pseudo-code:
Ck: candidate itemsets of size k
Lk: frequent itemsets of size k

L1 = {frequent items};
for (k = 1; Lk != ∅; k++) do begin
    Ck+1 = candidates generated from Lk;
    for each transaction t in database do
        increment the count of all candidates in Ck+1 that are contained in t
    Lk+1 = candidates in Ck+1 with min_support
end
return ∪k Lk;
The Apriori Algorithm — Example
Database D (TID: items):
100: 1 3 4
200: 2 3 5
300: 1 2 3 5
400: 2 5

Scan D → C1: {1}: 2, {2}: 3, {3}: 3, {4}: 1, {5}: 3
L1: {1}: 2, {2}: 3, {3}: 3, {5}: 3

C2 (from L1): {1 2}, {1 3}, {1 5}, {2 3}, {2 5}, {3 5}
Scan D → counts: {1 2}: 1, {1 3}: 2, {1 5}: 1, {2 3}: 2, {2 5}: 3, {3 5}: 2
L2: {1 3}: 2, {2 3}: 2, {2 5}: 3, {3 5}: 2

C3: {2 3 5}
Scan D → L3: {2 3 5}: 2
Interestingness Measurements
Objective measures
Two popular measurements:
support; and
confidence
Subjective measures (Silberschatz & Tuzhilin,
KDD95)
A rule (pattern) is interesting if
it is unexpected (surprising to the user); and/or
actionable (the user can do something with it)
Criticism of Support and Confidence
Example 1: (Aggarwal & Yu, PODS98)
Among 5000 students
3000 play basketball
3750 eat cereal
2000 both play basketball and eat cereal
play basketball → eat cereal [40%, 66.7%] is misleading,
because the overall percentage of students eating cereal is 75%,
which is higher than 66.7%.
play basketball → not eat cereal [20%, 33.3%] is far more
accurate, although with lower support and confidence

            basketball  not basketball  sum(row)
cereal      2000        1750            3750
not cereal  1000        250             1250
sum(col.)   3000        2000            5000
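The contingency table is enough to check this criticism numerically: the rule's confidence must be compared against the overall rate of cereal eating. A quick sanity check (not part of the original slide):

```python
n = 5000           # students
basketball = 3000  # play basketball
cereal = 3750      # eat cereal
both = 2000        # both play basketball and eat cereal

support = both / n              # 0.40 -> the rule's 40% support
confidence = both / basketball  # 0.666... -> the rule's 66.7% confidence
baseline = cereal / n           # 0.75 -> overall P(eat cereal)
lift = confidence / baseline    # < 1: basketball players eat cereal LESS often

print(support, confidence, baseline, lift)
```

The lift below 1 makes the negative association explicit, which support and confidence alone hide.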
Criticism of Support and Confidence (Cont.)
Example 2:

X 1 1 1 1 0 0 0 0
Y 1 1 0 0 0 0 0 0
Z 0 1 1 1 1 1 1 1

X and Y are positively correlated; X and Z are negatively correlated,
yet the support and confidence of X => Z dominate.
We need a measure of dependent or correlated events:

corr(A, B) = P(A ∧ B) / (P(A) P(B))

Rule    Support  Confidence
X => Y  25%      50%
X => Z  37.5%    75%

P(B|A)/P(B) is also called the lift of rule A => B
Other Interestingness Measures: Interest
Interest (correlation, lift):

Interest(A, B) = P(A ∧ B) / (P(A) P(B))

takes both P(A) and P(B) into consideration
P(A ∧ B) = P(A) P(B) if A and B are independent events
A and B are negatively correlated if the value is less than 1;
otherwise A and B are positively correlated

X 1 1 1 1 0 0 0 0
Y 1 1 0 0 0 0 0 0
Z 0 1 1 1 1 1 1 1

Itemset  Support  Interest
X,Y      25%      2
X,Z      37.5%    0.9
Y,Z      12.5%    0.57
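Computing interest directly from the binary rows reproduces the table (the helper names are my own; note that the table's 0.9 for X,Z is a one-decimal rounding of 6/7 ≈ 0.857):

```python
X = [1, 1, 1, 1, 0, 0, 0, 0]
Y = [1, 1, 0, 0, 0, 0, 0, 0]
Z = [0, 1, 1, 1, 1, 1, 1, 1]

def p(row):
    """Marginal probability: fraction of transactions containing the item."""
    return sum(row) / len(row)

def p_joint(a, b):
    """Joint probability: fraction of transactions containing both items."""
    return sum(x and y for x, y in zip(a, b)) / len(a)

def interest(a, b):
    """Interest (lift): P(A ∧ B) / (P(A) P(B)); 1 means independence."""
    return p_joint(a, b) / (p(a) * p(b))

print(interest(X, Y))  # 2.0 -> positively correlated
print(interest(X, Z))  # 0.857... (≈ 0.9 in the table) -> negatively correlated
print(interest(Y, Z))  # 0.571... (≈ 0.57) -> negatively correlated
```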
Summary
Association rule mining
probably the most significant contribution from the
database community to KDD
A large number of papers have been published
Many interesting issues have been explored
An interesting research direction
Association analysis in other types of data: spatial
data, multimedia data, time series data, etc.