dwm module 1 (1.2)

The document compares different data warehouse schemas, focusing on Star Schema, Snowflake Schema, Factless Fact Tables, and Fact Constellation Schema. Star Schema features a central Fact Table with surrounding Dimension Tables for fast queries, while Snowflake Schema normalizes dimensions to reduce redundancy. Factless Fact Tables track events without numeric data, and Fact Constellation Schema combines multiple fact tables sharing dimensions for complex analysis.

Uploaded by

shalom.lane

1.2 DWM MODULE 1

ER modelling Vs Dimensional modelling

Data warehouse schemas

⭐ Star Schema – In-Depth Explanation in Short


🔷 What is Star Schema?
● Star Schema is a simple and fast data warehouse schema.

● It has one central Fact Table, which stores numerical data (like sales, quantity), surrounded by multiple Dimension Tables (like Product, Customer, Date), which hold descriptive information.

● The design looks like a ⭐ star, hence the name "Star Schema".

🔸 Fact Table
This table stores:
● Numerical values (like sales amount, quantity sold)

● Foreign keys pointing to dimension tables

Example: Sales_Fact
Date_ID | Product_ID | Customer_ID | Store_ID | Revenue | Quantity

🔹 Dimension Tables
These tables store descriptive details:

●​ Product_Dim: Product_ID, Product_Name, Category​

●​ Customer_Dim: Customer_ID, Name, Gender, Age​

●​ Date_Dim: Date_ID, Date, Month, Year​

●​ Store_Dim: Store_ID, Store_Name, Location


✅ Example – Grocery Store Analysis
Suppose Reliance Smart wants to know:
"Which product sold the most at the Mumbai store in January 2025?"

→ With a star schema, this can be queried quickly and easily.
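
As a rough illustration of the grocery-store question above, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the notes; the sample rows, the ID values and the choice of SQLite are assumptions made only for this example, and Customer_Dim is left out to keep it short.

```python
import sqlite3

# In-memory database; all sample rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product_Dim (Product_ID INTEGER PRIMARY KEY, Product_Name TEXT, Category TEXT);
CREATE TABLE Store_Dim   (Store_ID INTEGER PRIMARY KEY, Store_Name TEXT, Location TEXT);
CREATE TABLE Date_Dim    (Date_ID INTEGER PRIMARY KEY, Date TEXT, Month INTEGER, Year INTEGER);

-- Central fact table: foreign keys to each dimension plus numeric measures.
CREATE TABLE Sales_Fact (
    Date_ID    INTEGER REFERENCES Date_Dim(Date_ID),
    Product_ID INTEGER REFERENCES Product_Dim(Product_ID),
    Store_ID   INTEGER REFERENCES Store_Dim(Store_ID),
    Revenue    REAL,
    Quantity   INTEGER
);

INSERT INTO Product_Dim VALUES (1, 'Rice', 'Grocery'), (2, 'Milk', 'Dairy');
INSERT INTO Store_Dim   VALUES (10, 'Mumbai Store', 'Mumbai'), (11, 'Pune Store', 'Pune');
INSERT INTO Date_Dim    VALUES (20250105, '2025-01-05', 1, 2025),
                               (20250210, '2025-02-10', 2, 2025);
INSERT INTO Sales_Fact  VALUES (20250105, 1, 10,  500.0,  5),
                               (20250105, 2, 10,  360.0, 12),
                               (20250105, 1, 11,  900.0,  9),
                               (20250210, 1, 10, 2000.0, 20);
""")

# Each dimension is exactly one join away from the fact table.
top_product = conn.execute("""
    SELECT p.Product_Name, SUM(f.Quantity) AS total_qty
    FROM Sales_Fact f
    JOIN Product_Dim p ON p.Product_ID = f.Product_ID
    JOIN Store_Dim   s ON s.Store_ID   = f.Store_ID
    JOIN Date_Dim    d ON d.Date_ID    = f.Date_ID
    WHERE d.Year = 2025 AND d.Month = 1 AND s.Location = 'Mumbai'
    GROUP BY p.Product_Name
    ORDER BY total_qty DESC
    LIMIT 1
""").fetchone()

print(top_product)  # ('Milk', 12)
```

The filters (month, year, location) all come from dimension tables, while the number being summed comes from the fact table. That split is the usual shape of a star schema query.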

🔼 Advantages of Star Schema


✅ Advantage | 🔍 Explanation
Simple Design | Easy to understand for developers and analysts.
Fast Query Performance | Optimized for OLAP (analytical) queries.
Easy Joins | Foreign keys directly link to dimensions – fewer joins needed.
Denormalized Dimensions | Data is repeated but faster to access.
Better Reporting | Useful for tools like Power BI, Tableau, etc.
Good for Aggregation & Summarization | Ideal for dashboards and KPIs.

🔽 Disadvantages of Star Schema


❌ Disadvantage | 🔍 Explanation
Redundant Data | Dimension tables are denormalized → more storage.
Not Ideal for Complex Relationships | Cannot handle many-to-many relationships well.
Less Flexible for Modifications | Updating dimension info can be harder due to repetition.
Scalability Issues with Large Dimensions | Huge dimensions can affect performance and maintenance.

❄️ Snowflake Schema – In-Depth Explanation (Short & Clear)


🔷 What is Snowflake Schema?
● Snowflake schema is a data warehouse design in which the dimension tables are normalized, i.e. broken down into smaller sub-tables.

● This schema reduces data redundancy and keeps the data organized.

● Its structure looks like a snowflake, hence the name "snowflake schema".
🔢 1. Fact Table (Same as Star Schema):
● This is the central table.

● Contains: Foreign Keys (to link to the dimension tables) + Measurable Data (like revenue, sales).

● Example: Sales_Fact table with columns:

○ Product_ID, Customer_ID, Date_ID, Store_ID, Quantity, Revenue
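
To make the idea of a normalized dimension concrete, the following is a small sketch in Python with sqlite3. Splitting Product_Dim into a separate Category_Dim table is an assumed example of normalization (the notes do not list the sub-tables), and every row of data here is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sub-table: the category text is stored once, not per product.
CREATE TABLE Category_Dim (Category_ID INTEGER PRIMARY KEY, Category_Name TEXT);
CREATE TABLE Product_Dim (
    Product_ID   INTEGER PRIMARY KEY,
    Product_Name TEXT,
    Category_ID  INTEGER REFERENCES Category_Dim(Category_ID)
);
-- Fact table trimmed to one dimension key to keep the sketch short.
CREATE TABLE Sales_Fact (
    Product_ID INTEGER REFERENCES Product_Dim(Product_ID),
    Quantity   INTEGER,
    Revenue    REAL
);

INSERT INTO Category_Dim VALUES (1, 'Dairy'), (2, 'Grocery');
INSERT INTO Product_Dim  VALUES (1, 'Milk', 1), (2, 'Curd', 1), (3, 'Rice', 2);
INSERT INTO Sales_Fact   VALUES (1, 10, 300.0), (2, 4, 160.0), (3, 5, 500.0);
""")

# Reaching the category now takes two joins: fact -> product -> category.
revenue_by_category = conn.execute("""
    SELECT c.Category_Name, SUM(f.Revenue)
    FROM Sales_Fact f
    JOIN Product_Dim  p ON p.Product_ID  = f.Product_ID
    JOIN Category_Dim c ON c.Category_ID = p.Category_ID
    GROUP BY c.Category_Name
    ORDER BY c.Category_Name
""").fetchall()
print(revenue_by_category)  # [('Dairy', 460.0), ('Grocery', 500.0)]

# Renaming a category touches a single row, yet every product sees the change.
conn.execute("UPDATE Category_Dim SET Category_Name = 'Dairy Products' WHERE Category_ID = 1")
renamed = conn.execute("""
    SELECT p.Product_Name, c.Category_Name
    FROM Product_Dim p
    JOIN Category_Dim c ON c.Category_ID = p.Category_ID
    WHERE c.Category_ID = 1
    ORDER BY p.Product_Name
""").fetchall()
print(renamed)  # [('Curd', 'Dairy Products'), ('Milk', 'Dairy Products')]
```

The sketch shows both sides of the trade: less repeated text in the dimension, at the price of one more join in every query that needs the category.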
✅ Advantages of Snowflake Schema:
✅ Advantage | 🔍 Explanation
Less Data Duplication | Normalization removes repeated data.
Storage Efficient | Uses less space because data is not stored redundantly.
Easy to Update | A change made in one place is reflected everywhere.
Supports Hierarchical Data | Suits data that has a tree structure (e.g. Category → Subcategory).
❌ Disadvantages of Snowflake Schema:
❌ Disadvantage | 🔍 Explanation
Complex Joins | Queries need more joins, which makes them harder to write.
Slow Performance | The extra joins can make queries slow.
Not Beginner Friendly | Somewhat hard for business users and analysts to understand.

Factless fact table

📘 Factless Fact Table – In-depth but Short Explanation


🔷 What is a Factless Fact Table?
This is a fact table that contains no numeric or measurable data (no sales, revenue, etc.).
Its job is to track events or to represent a business condition.

🧠 Simply:
"It tells you what happened or did not happen, but not how much."

🔑 Key Characteristics
Feature | Description
❌ No Measures | Only foreign keys to dimensions, no numbers like sales amount.
🎯 Tracks Events or Conditions | E.g., student attendance, product eligible for promotion.
🔗 Foreign Keys Only | Connects multiple dimensions like Student, Class, Date, etc.
📋 Business Rule Enforcement | Helpful in validating or enforcing certain business policies.

🧩 Types of Factless Fact Tables


1️⃣ Event-Tracking Factless Fact Table

Used when the occurrence of an event is recorded.


✅ Example: Student Attendance
●​ Dimensions:​

○​ Student_Dim (Student_ID)​

○​ Class_Dim (Class_ID)​

○​ Date_Dim (Date_ID)​

●​ Fact Table: Attendance_Fact

🧠 Meaning: Student 101 attended Class 501 on 1st May 2024.
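
A minimal sketch of the attendance example in Python with sqlite3 follows. Attendance_Fact and its three keys come from the notes; the composite primary key, the integer Date_ID format and the sample rows are assumptions, and the dimension tables are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Only foreign keys, no measure column: one row means "this student attended".
CREATE TABLE Attendance_Fact (
    Student_ID INTEGER,
    Class_ID   INTEGER,
    Date_ID    INTEGER,
    PRIMARY KEY (Student_ID, Class_ID, Date_ID)
);

INSERT INTO Attendance_Fact VALUES
    (101, 501, 20240501),
    (102, 501, 20240501),
    (101, 501, 20240502);
""")

# With no numeric column to sum, analysis is done by counting rows.
attendance_counts = conn.execute("""
    SELECT Date_ID, COUNT(*)
    FROM Attendance_Fact
    WHERE Class_ID = 501
    GROUP BY Date_ID
    ORDER BY Date_ID
""").fetchall()
print(attendance_counts)  # [(20240501, 2), (20240502, 1)]
```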
2️⃣ Coverage Factless Fact Table

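Used to capture events that did not happen. The table records what could have happened (for example, which products were eligible for a promotion), so that comparing it with the actual activity shows cases where a product was eligible but was not purchased.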
✅ Example: Promotion Eligibility
●​ Dimensions:​

○​ Product_Dim (Product_ID)​

○​ Store_Dim (Store_ID)​

○​ Promotion_Dim (Promotion_ID)​

●​ Fact Table: Promotion_Coverage_Fact
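
One possible way to find "what did not happen" with a coverage table is an anti-join against the sales data, sketched below in Python with sqlite3. Promotion_Coverage_Fact and its keys come from the notes; the trimmed Sales_Fact layout (including a Promotion_ID column) and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Coverage table: every product that was on promotion in a store.
CREATE TABLE Promotion_Coverage_Fact (Product_ID INTEGER, Store_ID INTEGER, Promotion_ID INTEGER);
-- Hypothetical, trimmed sales fact table used only for the comparison.
CREATE TABLE Sales_Fact (Product_ID INTEGER, Store_ID INTEGER, Promotion_ID INTEGER, Quantity INTEGER);

INSERT INTO Promotion_Coverage_Fact VALUES (1, 10, 900), (2, 10, 900), (3, 10, 900);
INSERT INTO Sales_Fact VALUES (1, 10, 900, 4), (3, 10, 900, 2);
""")

# LEFT JOIN keeps every covered product; rows with no matching sale have NULLs.
unsold = conn.execute("""
    SELECT c.Product_ID
    FROM Promotion_Coverage_Fact c
    LEFT JOIN Sales_Fact s
           ON s.Product_ID   = c.Product_ID
          AND s.Store_ID     = c.Store_ID
          AND s.Promotion_ID = c.Promotion_ID
    WHERE s.Product_ID IS NULL
    ORDER BY c.Product_ID
""").fetchall()
print(unsold)  # [(2,)]
```

Product 2 was eligible for the promotion but has no sales row, so it is the only result.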

✅ Advantages of Factless Fact Tables


Advantage | Explanation
🔍 Simple Design | Just foreign keys – easy to design & understand.
📅 Event/Activity Tracking | Ideal for tracking attendance, logins, views, etc.
📐 Business Rule Enforcement | Helps validate business conditions like "who didn't purchase."
🧮 Can Derive Counts | You can count how many events occurred, e.g., COUNT(*) = number of attendances.

❌ Disadvantages of Factless Fact Tables


Disadvantage | Explanation
❌ No Quantitative Analysis | Can’t analyze sales, revenue, etc. directly.
❗ Confusing for Beginners | Absence of numeric data may confuse new users.
🔗 Heavily Relies on Joins | Need to join with dimensions for meaningful insights.
❓ Hard to Track Negatives | Coverage type tables need special queries to find “what didn’t happen.”

🎓 Real-Life Scenarios:
Use Case | Factless Table Type
📚 School attendance tracking | Event-tracking
🛒 Products eligible for promotion | Coverage
👨‍⚕️ Patient visit to clinic | Event-tracking
👔 Employees eligible for bonus but didn’t get | Coverage

Fact constellation schema


🌌 What is Fact Constellation Schema? (aka Galaxy Schema)
This schema is a collection of multiple fact tables that share common dimension tables.

🧠 Simply:
"When a single data warehouse holds multiple star schemas that use the same dimension tables, the design is called a Fact Constellation or Galaxy schema."

🧩 Key Characteristics
Feature | Description
🌟 Multiple Fact Tables | 2 or more fact tables present
🔁 Shared Dimension Tables | Dimensions like Date, Product, Customer are shared
🧱 Combines Multiple Star Schemas | Each fact table with its own star schema
🧠 Complex Schema | More detailed and realistic than Star or Snowflake
📊 Used in Large DW | Used where business has many subject areas

✅ Example of Fact Constellation Schema


🎯 Use Case: Retail Chain Business
💡 Facts:
1.​ Sales_Fact​

2.​ Shipping_Fact​

🧱 Common Dimensions:
●​ Product_Dim​

●​ Customer_Dim​

●​ Store_Dim​
●​ Date_Dim​

📊 Tables
Sales_Fact
Sale_ID | Product_ID | Customer_ID | Store_ID | Date_ID | Amount

Shipping_Fact
Shipment_ID | Product_ID | Store_ID | Date_ID | Shipping_Cost

Shared Dimensions:

●​ Product_Dim – info about product​

●​ Customer_Dim – info about buyer​

●​ Store_Dim – info about store location​

●​ Date_Dim – calendar info​

🧠 Explanation:
● Both fact tables (Sales + Shipping) share Product, Store and Date.

● Sales_Fact tracks sales transactions.

● Shipping_Fact tracks shipping cost and logistics info.
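
The sharing of dimensions described above can be sketched in Python with sqlite3 as follows. The fact table columns match the notes; the sample rows are invented, only Product_Dim is created to keep the example short, and aggregating each fact table separately before joining is one common approach rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One shared dimension (the other shared dimensions are omitted for brevity).
CREATE TABLE Product_Dim (Product_ID INTEGER PRIMARY KEY, Product_Name TEXT);

-- Two fact tables that both point at the same Product_Dim.
CREATE TABLE Sales_Fact (
    Sale_ID INTEGER PRIMARY KEY, Product_ID INTEGER, Customer_ID INTEGER,
    Store_ID INTEGER, Date_ID INTEGER, Amount REAL
);
CREATE TABLE Shipping_Fact (
    Shipment_ID INTEGER PRIMARY KEY, Product_ID INTEGER,
    Store_ID INTEGER, Date_ID INTEGER, Shipping_Cost REAL
);

INSERT INTO Product_Dim VALUES (1, 'Rice'), (2, 'Milk');
INSERT INTO Sales_Fact VALUES
    (1, 1, 7, 10, 20250105, 500.0),
    (2, 1, 8, 10, 20250106, 300.0),
    (3, 2, 7, 10, 20250105, 120.0);
INSERT INTO Shipping_Fact VALUES
    (1, 1, 10, 20250104, 50.0),
    (2, 2, 10, 20250104, 20.0),
    (3, 2, 10, 20250105, 10.0);
""")

# Summarize each fact table on its own, then combine via the shared dimension.
# Joining the two raw fact tables directly would multiply rows and inflate sums.
sales_vs_shipping = conn.execute("""
    SELECT p.Product_Name, s.total_sales, sh.total_shipping
    FROM Product_Dim p
    JOIN (SELECT Product_ID, SUM(Amount) AS total_sales
          FROM Sales_Fact GROUP BY Product_ID) s
      ON s.Product_ID = p.Product_ID
    JOIN (SELECT Product_ID, SUM(Shipping_Cost) AS total_shipping
          FROM Shipping_Fact GROUP BY Product_ID) sh
      ON sh.Product_ID = p.Product_ID
    ORDER BY p.Product_Name
""").fetchall()
print(sales_vs_shipping)  # [('Milk', 120.0, 30.0), ('Rice', 800.0, 50.0)]
```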


✅ Advantages of Fact Constellation Schema
Advantage | Explanation
🔄 Shared Dimensions Save Space | Reusing dimensions avoids duplication
🧠 Reflects Real Business Scenarios | Large businesses need multiple facts
📈 Supports Complex Analysis | You can analyze data across multiple domains (sales + shipping)
🔗 Flexible Schema | Easy to add more fact tables

❌ Disadvantages of Fact Constellation Schema


Disadvantage | Explanation
😵 Complex Design | More difficult to design and maintain
🐢 Slow Query Performance | Complex joins can slow down analysis
🧑‍💻 Requires Skilled Users | Not beginner-friendly for analysts
📐 Tool Support May Vary | Some BI tools prefer simpler star schemas

🧠 Simple Analogy to Remember


Star Schema – One star
Snowflake Schema – Branches inside the star
🌌 Fact Constellation – Multiple stars = Galaxy
