Module 3 Mining frequent patterns and associations

Market Basket Analysis (MBA) is a data mining technique that identifies patterns in customer purchasing behavior to optimize marketing strategies and store layouts. It uses metrics like support, confidence, and lift to analyze item relationships and is applicable in various sectors such as retail, healthcare, and fraud detection. Algorithms like Apriori and FP-Growth are employed to efficiently mine frequent item sets and association rules.

Uploaded by

shalom.lane

EXAMPLE 2

Market Basket Analysis

• Market Basket Analysis (MBA) is a data mining technique used in retail and e-commerce to identify patterns in
customer purchasing behavior. It helps businesses understand which products are frequently bought together, allowing
them to optimize marketing strategies, cross-selling, and store layouts.
How Market Basket Analysis Works
Market Basket Analysis is based on association rule learning, a technique that discovers relationships between items in a
dataset. It primarily uses three key metrics:
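1. Support
• Measures how frequently an item or item set appears in transactions: the fraction of all transactions that contain it.
2. Confidence
• Measures the likelihood that if a customer buys item A, they will also buy item B: support(A and B together) divided by support(A).
3. Lift
• Measures how much more likely item B is to be purchased when item A is bought, compared to random chance: confidence(A ⇒ B) divided by support(B). A lift above 1 indicates a positive association, a lift of 1 indicates independence, and a lift below 1 indicates a negative association.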
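The three metrics above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the toy transactions and the function names are assumptions made for the example.

```python
# Toy transactions; the items and counts are illustrative, not taken from the slides.
transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(A and B together) / support(A), an estimate of P(B | A)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

def lift(antecedent, consequent, transactions):
    """confidence(A => B) / support(B); values above 1 mean positive association."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)

print(support({"bread", "butter"}, transactions))
print(confidence({"bread"}, {"butter"}, transactions))
print(lift({"bread"}, {"butter"}, transactions))
```

With these toy transactions, bread and butter appear together in 3 of 5 baskets, the rule bread ⇒ butter has a confidence of about 0.75, and its lift is about 1.25, so butter is bought somewhat more often alongside bread than on its own.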
Applications of Market Basket Analysis
1. Retail & E-commerce
• Recommending products based on frequently bought-together items.
• Example: Amazon's "Customers who bought this also bought..."
2. Supermarket & Grocery Stores
• Designing store layouts to place related products together.
• Example: Placing bread next to butter and jam.
3. Fraud Detection
• Identifying unusual purchasing patterns that may indicate fraud.
4. Healthcare & Medical Diagnosis
• Analyzing patient records to identify common disease patterns.
5. Web & Content Recommendations
• Suggesting related articles, videos, or online courses.
Algorithms for Market Basket Analysis
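• Apriori Algorithm – Finds frequent item sets using a bottom-up, level-wise approach: frequent k-item sets are used to generate candidate (k+1)-item sets, which are then counted against the data.
• FP-Growth Algorithm – Compresses the transactions into a prefix tree (the FP-tree) and mines frequent patterns from it without generating candidates, which is usually more efficient.
• Eclat Algorithm – Uses depth-first search over a vertical data layout (each item mapped to the IDs of the transactions containing it), intersecting these ID lists to find frequent item sets.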
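As a rough illustration of the bottom-up search Apriori performs, the sketch below grows item sets one item at a time and keeps only those that meet a minimum support. The function name, the toy data, and the 0.6 threshold are assumptions for the example; real implementations use far more efficient counting.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support} for all item sets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent_among(candidates):
        # One pass over the data per level: count each candidate, keep the frequent ones.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    items = {item for t in transactions for item in t}
    current = frequent_among({frozenset([i]) for i in items})
    result = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: union pairs of frequent (k-1)-item sets that form a k-item set.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        current = frequent_among(candidates)
        result.update(current)
        k += 1
    return result

# Toy data (illustrative only).
transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "milk"},
]
frequent = apriori(transactions, min_support=0.6)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is what makes the approach workable: any item set with an infrequent subset cannot be frequent, so it is discarded before the data is scanned.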
Frequent Item Sets

• A frequent item set is a collection of items that appear together in a dataset with a frequency (support) of at least a predefined
threshold, called the minimum support. Frequent item sets are a key concept in Market Basket Analysis (MBA) and are used to
derive association rules that help in decision-making.
Closed item sets

• A closed item set is a frequent item set where none of its proper supersets (larger sets containing it) have the same
support. In simpler terms, the closed item sets capture the support of every frequent item set without
redundancy: the support of any frequent item set equals the largest support among the closed item sets that contain it.
Why Use Closed Item Sets?
• Reduces the number of item sets without losing meaningful information.
• Speeds up association rule mining by eliminating unnecessary rules.
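A small sketch of how closed item sets could be filtered out of a collection of frequent item sets. The support counts below are illustrative assumptions (they match five toy baskets in which butter is always bought with bread), and the function name is hypothetical.

```python
def closed_itemsets(frequent):
    """Keep item sets that have no proper superset with the same support count.

    `frequent` maps frozenset -> support count. Checking only frequent supersets
    is enough: a superset with the same count as a frequent set is itself frequent.
    """
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(itemset < other and count == other_count
                   for other, other_count in frequent.items())
    }

# Illustrative support counts out of 5 transactions (assumed for the example).
frequent = {
    frozenset({"bread"}): 4,
    frozenset({"butter"}): 3,
    frozenset({"milk"}): 3,
    frozenset({"bread", "butter"}): 3,
}
closed = closed_itemsets(frequent)
for itemset, count in closed.items():
    print(sorted(itemset), count)
```

Here {butter} is dropped because {bread, butter} has the same count, so nothing is lost: the support of {butter} can be read off the closed set {bread, butter}.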
Association Rule
• Association Rule Mining is a machine learning technique used to discover relationships between items in large
datasets. It helps identify patterns, such as which products are often purchased together in retail or which
symptoms frequently co-occur in medical diagnosis.
An association rule is written in the form:
A⇒B where:
• A (Antecedent): The item(s) on the left side of the rule (e.g., "If a customer buys Milk").
• B (Consequent): The item(s) on the right side of the rule (e.g., "Then they also buy Bread").
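Rules of the form A⇒B are typically derived by splitting each frequent item set into an antecedent and a consequent and keeping the splits whose confidence is high enough. The sketch below assumes the support counts are already known; the counts, the 0.8 confidence threshold, and the function name are illustrative assumptions.

```python
from itertools import combinations

def generate_rules(supports, min_confidence):
    """Return (antecedent, consequent, confidence) for every rule that passes.

    `supports` maps frozenset -> support count and must contain every subset of
    each item set it lists (true for the output of a frequent item set miner).
    """
    rules = []
    for itemset, count in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                conf = count / supports[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules

# Illustrative support counts out of 5 transactions (assumed for the example).
supports = {
    frozenset({"bread"}): 4,
    frozenset({"butter"}): 3,
    frozenset({"milk"}): 3,
    frozenset({"bread", "butter"}): 3,
}
rules = generate_rules(supports, min_confidence=0.8)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "=>", sorted(consequent), conf)
```

With these counts only butter ⇒ bread passes (confidence 3/3); bread ⇒ butter reaches 3/4 and is filtered out, which shows that a rule and its reverse can have different confidence.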
Improving the efficiency of Apriori

To improve Apriori’s efficiency, consider:
• Reducing database scans (hashing, transaction reduction).
• Reducing candidate sets (closed/maximal item sets).
• Using alternative algorithms (FP-Growth).
• Using parallel/distributed computing (Spark, Hadoop).
• Dynamically adjusting support to optimize performance.
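As one example from this list, transaction reduction relies on the observation that a transaction containing no frequent k-item set cannot contain any frequent (k+1)-item set, so it can be skipped in later scans. A minimal sketch, with toy data and a hypothetical function name:

```python
def reduce_transactions(transactions, frequent_k):
    """Drop transactions that contain none of the frequent k-item sets.

    Such transactions cannot support any (k+1)-item candidate, so later scans
    can ignore them. Support must still be measured against the ORIGINAL
    number of transactions, not the reduced one.
    """
    return [t for t in transactions if any(f <= t for f in frequent_k)]

# Toy data (illustrative only).
transactions = [
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "butter", "milk"}),
]
# Assumed result of the 2-item pass at a minimum support of 0.6.
frequent_2 = [frozenset({"bread", "butter"})]

reduced = reduce_transactions(transactions, frequent_2)
print(len(transactions), "->", len(reduced))
```

In this toy case two of the five transactions are dropped before the 3-item pass; on large, sparse datasets the saving per scan can be much larger.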
Mining various kinds of association rules – Multilevel and Multidimensional
1. Multilevel Association Rules
Definition:
• Multilevel association rules are rules derived from data items that are organized in a hierarchy or taxonomy.
These hierarchies allow mining of rules at different levels of abstraction.
Example hierarchy:
Beverages
├── Soft Drinks
│   ├── Coca-Cola
│   └── Pepsi
└── Juices
    ├── Orange Juice
    └── Apple Juice
Example Rules:
• Level 1 (general):
Beverages → Snacks
• Level 2 (intermediate):
Soft Drinks → Potato Chips
• Level 3 (specific):
Coca-Cola → Lays Chips
Techniques Used:
• Use of concept hierarchies or taxonomies.
• Lower support thresholds at deeper levels (specific items are rarer).
• Either top-down or bottom-up approaches:
• Top-down: Start from general rules and drill down.
• Bottom-up: Start from specific rules and generalize.
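The idea of mining at several levels with lower thresholds at deeper levels can be sketched as follows. The beverage hierarchy and the three example rules come from the slides; the snack branch (Snacks → Potato Chips → Lays Chips) is inferred from those rules, and the transactions, thresholds, and function names are assumptions for illustration.

```python
# Child -> parent links of the concept hierarchy.
parent = {
    "Soft Drinks": "Beverages", "Juices": "Beverages",
    "Coca-Cola": "Soft Drinks", "Pepsi": "Soft Drinks",
    "Orange Juice": "Juices", "Apple Juice": "Juices",
    "Potato Chips": "Snacks", "Lays Chips": "Potato Chips",  # inferred branch
}

def path_from_root(item, parent):
    """Return [level-1 ancestor, ..., item] by following parent links upward."""
    path = [item]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]

def generalize(transaction, parent, level):
    """Replace each item by its ancestor at `level` (1 = most general)."""
    return {path_from_root(item, parent)[:level][-1] for item in transaction}

def support_at_level(itemset, transactions, parent, level):
    """Support of `itemset` after lifting every transaction to `level`."""
    lifted = [generalize(t, parent, level) for t in transactions]
    return sum(set(itemset) <= t for t in lifted) / len(lifted)

# Toy transactions and per-level thresholds (both assumed for the example).
transactions = [
    {"Coca-Cola", "Lays Chips"},
    {"Pepsi", "Lays Chips"},
    {"Orange Juice"},
    {"Coca-Cola", "Lays Chips"},
    {"Apple Juice", "Lays Chips"},
]
min_support_by_level = {1: 0.7, 2: 0.5, 3: 0.3}  # lower thresholds at deeper levels
itemsets_by_level = {
    1: {"Beverages", "Snacks"},
    2: {"Soft Drinks", "Potato Chips"},
    3: {"Coca-Cola", "Lays Chips"},
}
for level, itemset in itemsets_by_level.items():
    sup = support_at_level(itemset, transactions, parent, level)
    print(level, sorted(itemset), sup, sup >= min_support_by_level[level])
```

With a single uniform threshold of 0.7, only the level-1 pattern would survive; the reduced thresholds are what let the more specific patterns through.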
Applications:
• Retail market basket analysis
• Product recommendation systems
• Inventory management
2. Multidimensional Association Rules
Definition:
Multidimensional association rules involve multiple attributes (dimensions) in the rule, not just item sets. These could
include demographic, temporal, geographic, or behavioral attributes.

Example Rules:
• [Age=20–30] ∧ [Gender=Male] → [Buys=Laptop]
• [Occupation=Teacher] ∧ [Buys=Books] → [Visits=Library]
Types of Multidimensional Rules:
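• Intradimensional (single-dimensional): All attributes come from the same dimension (e.g., all products), so one predicate such as Buys appears several times.
• Interdimensional: Attributes come from different dimensions (e.g., age and product), with no dimension repeated.
• Hybrid-dimensional: Several dimensions are involved and at least one is repeated (e.g., [Age=20–30] ∧ [Buys=Laptop] → [Buys=Printer]).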
Techniques Used:
• Transformation of non-item data into item-like format (e.g., "Age=20-30" becomes an item).
• Use of cube-based mining or relational databases.
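The transformation of attribute values into item-like form can be sketched as below. The records, the 10-year age buckets, and the function names are assumptions for illustration; once each record is a set of "Attribute=value" items, any frequent item set miner can be applied to it.

```python
def age_bucket(age, width=10):
    """Map a numeric age to a range label, e.g. 25 -> '20-30' (assumed bucketing)."""
    low = (age // width) * width
    return f"{low}-{low + width}"

def record_to_items(record):
    """Turn one record into a set of 'Attribute=value' items."""
    items = set()
    for attribute, value in record.items():
        if attribute == "Age":
            value = age_bucket(value)  # discretize the numeric attribute
        items.add(f"{attribute}={value}")
    return items

# Illustrative records; not data from the slides.
records = [
    {"Age": 25, "Gender": "Male", "Buys": "Laptop"},
    {"Age": 28, "Gender": "Male", "Buys": "Laptop"},
    {"Age": 23, "Gender": "Male", "Buys": "Phone"},
    {"Age": 41, "Gender": "Female", "Buys": "Books"},
]
transactions = [record_to_items(r) for r in records]

# Confidence of [Age=20-30] AND [Gender=Male] -> [Buys=Laptop] on the toy data.
antecedent = {"Age=20-30", "Gender=Male"}
matches = [t for t in transactions if antecedent <= t]
conf = sum("Buys=Laptop" in t for t in matches) / len(matches)
print(sorted(transactions[0]), conf)
```

Numeric attributes such as age need to be discretized first; the bucket boundaries are a modelling choice and affect which rules appear.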
Applications:
• Customer segmentation
• Targeted marketing
• Fraud detection
• Healthcare analytics
