DWM Module 1
TE – D – DWM
Prof. Smruti Vyavahare
Assistant Professor
Dept. of Computer Engineering,
SIES Graduate School of Technology
Contents
Course Structure & Syllabus
Study Material
Course Objectives & Course Outcomes
Feature and Real-Time Application Example
• Database designed for analytical tasks: Amazon uses its data warehouse to analyze customer buying patterns to recommend products.
• Data from multiple applications: Flipkart pulls data from orders, customer reviews, logistics, and support systems into one warehouse.
• Easy to use and supports long interactive sessions: Marketing managers at Swiggy explore customer order trends during campaigns to measure success.
1. Running of simple queries and reports against current and historical data:
Users should be able to easily retrieve both recent and old data to generate insights or reports.
Example: Sales trends over the last 5 years.
2. Makes the enterprise’s current and historical information easily available for strategic decision-making:
Stores both recent and past data to help leaders make informed choices.
Example: Nestlé uses historical sales data to decide which products to promote during festive seasons.
3. Ability to query, step back, analyze, and then continue the process to any desired length:
The system supports interactive exploration of data — users can drill down, go back, and
explore further.
Useful for deep data analysis.
4. Ability to spot historical trends and apply them in future interactive processes:
The system identifies patterns over time (e.g., seasonal trends) and uses them in future
planning or decision models.
Definition of Data Warehouse
"A Data Warehouse is a subject oriented, integrated, nonvolatile, and time variant collection of data in
support of management’s decisions."
— Bill Inmon (Father of Data Warehousing)
1. Subject-Oriented Data
Data is organized around key business subjects or areas such as sales, finance, customer, and product; a data
warehouse focuses on analyzing specific topics for decision-making.
Example: All data related to customers — their purchases, feedback, history — is grouped for analysis.
2. Integrated Data
Data from multiple sources (like ERP, CRM, spreadsheets) is cleaned, transformed, and stored in a
unified format.
Example: Customer IDs from different departments are standardized into one format.
3. Time-Variant Data
Data is stored with a time dimension (daily, monthly, yearly) to allow analysis over a long period.
It enables trend analysis, forecasting, and comparisons.
Example: Sales reports over the past 5 years can be compared to identify seasonal trends.
4. Non-Volatile Data
Once data enters the warehouse, it is not frequently updated or deleted.
Example: A sales transaction from 2022 remains unchanged and available for future analysis.
5. Data Granularity
It refers to the level of detail of the data — from highly detailed (transaction-level) to summarized
(monthly or yearly reports).
Data warehouses usually store data at multiple levels of granularity.
Example: You can view both individual customer purchases and total monthly revenue.
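The two granularity levels above can be sketched in Python: detailed transaction-level rows are kept, and a coarser monthly summary is derived from them. The sample data is hypothetical.

```python
from collections import defaultdict

# Transaction-level (fine-grained) sales records: hypothetical sample data.
transactions = [
    {"customer": "C1", "month": "2024-01", "amount": 120.0},
    {"customer": "C2", "month": "2024-01", "amount": 80.0},
    {"customer": "C1", "month": "2024-02", "amount": 200.0},
]

# Coarser granularity: total revenue per month, derived from the detail rows.
monthly_revenue = defaultdict(float)
for t in transactions:
    monthly_revenue[t["month"]] += t["amount"]

print(monthly_revenue["2024-01"])  # 200.0 (120.0 + 80.0)
print(monthly_revenue["2024-02"])  # 200.0
```

Because the warehouse keeps the detailed rows, any coarser level (monthly, yearly) can always be recomputed; the reverse is not true.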
1. Data Sources
Data sources include:
Operational Systems: Databases used in day-to-day business operations like ERP, CRM, billing systems, etc.
Flat Files: Data stored in spreadsheets, CSV files, logs, etc.
These systems are typically OLTP (Online Transaction Processing) systems.
2. Staging Area
This is where ETL (Extract, Transform, Load) happens:
Extract data from sources.
Transform data (cleaning, formatting, merging, etc.).
Load the processed data into the data warehouse.
This stage ensures data quality and consistency before it's stored permanently.
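The Extract, Transform, and Load steps above can be sketched in Python. The CSV contents, table name, and column names are illustrative, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Extract: read rows from a flat file exported by an operational system
# (hypothetical sample data, deliberately messy).
raw = "cust_id,name,amount\n 101 ,Alice,250\n102, bob ,100\n101,Alice,50\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: trim whitespace, normalize case, convert types.
clean = [
    {"cust_id": int(r["cust_id"]),
     "name": r["name"].strip().title(),
     "amount": float(r["amount"])}
    for r in rows
]

# Load: insert the cleaned rows into a warehouse table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales_fact (cust_id INT, name TEXT, amount REAL)")
db.executemany("INSERT INTO sales_fact VALUES (:cust_id, :name, :amount)", clean)
total = db.execute("SELECT SUM(amount) FROM sales_fact").fetchone()[0]
print(total)  # 400.0
```

In a real pipeline each step would be far more elaborate (schedules, error handling, refresh logic), but the shape is the same: extract raw rows, transform them into a consistent format, then load them.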
3. Warehouse
The central component where data is stored and managed. It includes:
Metadata: Data about the data (e.g., source, time stamp, format).
Summary Data: Aggregated or pre-calculated data (e.g., monthly sales totals).
Raw Data: Detailed data from the source systems (e.g., individual transactions).
This enables both high-level analysis and deep dives into raw data.
4. Users
These are the end-users who consume the data using different tools for various purposes:
Analytics: Advanced analysis, dashboards, data visualization.
Reporting: Standardized reports for business operations.
Mining: Data mining techniques to discover patterns, trends, and insights.
These outputs support decision-making at tactical and strategic levels.
1. Bottom Tier: Data Source Layer
This is where the data originates from.
• Operational Databases: These are the existing databases used for day-to-day business operations
(e.g., ERP, CRM).
• External Sources: These can be flat files, cloud data, or third-party data providers.
• ETL Process (Extract, Clean, Transform, Load, Refresh):
Extract: Pulls raw data from operational and external sources.
Clean: Removes errors and inconsistencies.
Transform: Converts data into a usable format.
Load: Transfers the transformed data into the data warehouse.
Refresh: Updates the data periodically.
A Data Mart is essentially a smaller, more focused version of a data warehouse. It’s tailored to meet the
needs of a specific group within an organization.
Key Characteristics:
Subset of a Data Warehouse: It contains a portion of the data warehouse, usually relevant to a specific
business area.
Department-Specific: Designed for use by a particular department like Marketing, Sales, HR, or
Finance.
Department-Controlled: Typically managed and maintained by the department that uses it.
Fewer Data Sources: Pulls data from a limited number of sources, unlike a full data warehouse which
integrates data from across the organization.
Smaller and More Agile: Because of its limited scope, it’s easier to manage and adapt to changes.
Data warehousing design strategies or Approaches for building data warehouse
Top-Down Approach
The Top-Down Approach, introduced by Bill Inmon, is a method for designing data warehouses that starts
by building a centralized, company-wide data warehouse. It ensures data consistency and provides a strong
foundation for decision-making.
Working of Top-Down Approach
Central Data Warehouse: The process begins with creating a comprehensive data warehouse where data
from various sources is collected, integrated, and stored. This involves the ETL (Extract, Transform, Load)
process to clean and transform the data.
Specialized Data Marts: Once the central warehouse is established, smaller, department-specific data marts
(e.g., for finance or marketing) are built. These data marts pull information from the main data warehouse,
ensuring consistency across departments.
Bottom-Up Approach
The Bottom-Up Approach, popularized by Ralph Kimball, takes a more flexible and incremental path to
designing data warehouses. Instead of starting with a central data warehouse, it begins by building small,
department-specific data marts that cater to the immediate needs of individual teams, such as sales or finance.
These data marts are later integrated to form a larger, unified data warehouse.
• Department-Specific Data Marts: The process starts with creating data marts for individual departments
or specific business functions. These data marts are designed to meet immediate data analysis and reporting
needs, allowing departments to gain quick insights.
• Integration into a Data Warehouse: Over time, these data marts are connected and consolidated to create
a unified data warehouse. The integration ensures consistency and provides a comprehensive view of the
organization’s data.
Difference between Top-Down Approach & Bottom-Up Approach
Metadata
Metadata is "data about the data". In the context of a data warehouse, it provides descriptive information
about the warehouse's data contents, structure, source, and processes.
1. Metadata means "data about the data":
Metadata gives context to raw data. It explains things like:
• What each column means (e.g., “DOB” = Date of Birth)
• Where the data comes from
• What format it is in (e.g., date, number, text)
• How often it is updated
3. Directory Function
• Metadata helps navigate, access, and manage the contents of a data warehouse, especially useful
when data volume is large and complex.
4. Architectural Role:
• Metadata is a core architectural component of a data warehouse—it supports data integration, access,
governance, and quality control.
Types of Metadata
1. Operational Metadata
It contains information about the operational data sources, the systems that provide data to the warehouse.
The examples are:
• File names and formats (CSV, Excel, etc.)
• Data refresh schedules
• Load times and history logs
Helps in tracking and managing the source data and its movement into the warehouse.
2. Extraction and Transformation Metadata
It includes:
• Data extraction from source systems: how, when, and how often data is pulled.
• Business rules applied during extraction.
• Data transformations performed before storing in the warehouse.
The examples are:
• Extraction methods (API, ETL tools)
• Frequency (daily, hourly, real-time)
• Rules like converting formats, removing duplicates, or merging tables
It helps in documenting the entire ETL (Extract, Transform, Load) process for auditing,
debugging, and ensuring data accuracy.
3. End-user Metadata
This is the user-friendly metadata that acts like a "navigational map" of the data warehouse for business users.
The examples are:
• Data dictionary (table/column names with business descriptions)
• Report labels and definitions
• Tooltips in dashboards
It enables end users (like analysts or managers) to easily find, understand, and use data for decision-
making.
ER-Modelling VS Dimensional-Modelling
1. Usage
• E-R Modelling:
1. Supports OLTP – used for daily transactions like banking, bookings, etc.
2. It focuses on data integrity and efficiency in updating data.
• Dimensional Modelling:
1. Supports OLAP – used for data analysis, business intelligence, reporting.
2. Optimized for data retrieval and queries.
2. Structure
• E-R Modelling:
1. Entities (tables) are connected using multiple joins (often normalized).
• Dimensional Modelling:
Still uses joins, but mostly between fact and dimension tables in a simpler way
(star/snowflake schema).
3. Data Organization
• E-R Modelling:
Normalized: Data is split into smaller tables to reduce redundancy.
• Dimensional Modelling:
Denormalized: Data is often repeated to simplify queries and improve performance.
4. Redundancy
• E-R Modelling:
Tries to remove redundancy to ensure consistency.
• Dimensional Modelling:
Allows redundancy for faster querying and simplicity.
5. Modification Impact
• E-R Modelling:
If you change the model, it often affects the applications using it.
• Dimensional Modelling:
More flexible and extensible: it can handle new data elements easily without breaking
existing designs.
6. Adaptability
• E-R Modelling:
Can be fragile if query patterns change—structure is complex.
• Dimensional Modelling:
More robust—design is stable even if reporting/query needs evolve.
Dimensional-Modelling
Dimensional modelling is a data structure technique used to design databases that support efficient
querying and reporting. It’s especially used in data warehouses.
4. Developed by Ralph Kimball
• Developed by Ralph Kimball and it consists of 'fact' and 'dimension' tables.
• Fact Tables: Store measurable data (e.g., sales, profit, quantity).
• Dimension Tables: Store context about facts (e.g., customer, time, product, region).
Example dimension tables: Product_Dim, Date_Dim, Customer_Dim.
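The fact/dimension split can be sketched in Python with plain dicts; the table and column names here are illustrative, not from the slides. Fact rows hold measures plus foreign keys, and joining to the dimensions resolves those keys into readable context.

```python
# Dimension tables: descriptive context, keyed by surrogate keys
# (hypothetical sample data).
product_dim = {1: {"name": "Laptop", "category": "Electronics"},
               2: {"name": "Desk", "category": "Furniture"}}
date_dim = {10: {"month": "Jan", "year": 2024}}

# Fact table: measurable values plus foreign keys into the dimensions.
sales_fact = [
    {"product_id": 1, "date_id": 10, "amount": 900.0},
    {"product_id": 2, "date_id": 10, "amount": 150.0},
]

# Resolving a fact row's context is a join from the fact table
# to its dimension tables.
row = sales_fact[0]
report = (product_dim[row["product_id"]]["name"],
          date_dim[row["date_id"]]["month"],
          row["amount"])
print(report)  # ('Laptop', 'Jan', 900.0)
```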
Information Package Diagram (IPD)
1. An Information Package Diagram (IPD) is a logical design tool used in dimensional modeling
to:
• Identify business dimensions (descriptive data)
• Identify facts or metrics (numerical data)
It is the foundation step in designing a data warehouse schema like star schema or snowflake
schema.
2. It helps structure the dimensions (e.g., time, customer, product) and the metrics/facts (e.g., sales,
quantity, profit) that are to be analyzed.
5. Every IPD includes measurable metrics (facts) alongside their corresponding dimensions.
1. Star Schema
• Central fact table connects directly to several dimension tables.
• Dimension tables are denormalized (flat, no sub-tables).
• Simple and fast to query.
Mainly used for: Quick report generation.
Examples:
• Fact table: Sales_Fact (sale_id, date_id, product_id, amount)
• Dimensions: Product_Dim, Customer_Dim, Date_Dim
2. Snowflake Schema
• Extension of the star schema.
• Dimension tables are normalized (split into sub-tables).
• Uses less storage but is slower to query.
Mainly used for: Saving space and reducing redundancy.
Examples:
• Product_Dim might link to Category_Dim and Supplier_Dim.
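The difference between the two schemas can be sketched in Python with plain dicts; all table, column, and key names are illustrative. The star schema stores the category name inside Product_Dim (denormalized), while the snowflake schema splits it into a normalized Category_Dim sub-table, costing one extra lookup per query.

```python
# Star schema: flat, denormalized dimension (hypothetical sample data).
product_dim_star = {
    1: {"name": "Laptop", "category_name": "Electronics"},
}

# Snowflake schema: Product_Dim references a normalized Category_Dim.
category_dim = {100: {"category_name": "Electronics"}}
product_dim_snow = {1: {"name": "Laptop", "category_id": 100}}

# Star resolves the category in one lookup; snowflake needs a second
# lookup (an extra join in SQL terms).
star_cat = product_dim_star[1]["category_name"]
snow_cat = category_dim[product_dim_snow[1]["category_id"]]["category_name"]
print(star_cat)  # Electronics
print(snow_cat)  # Electronics
```

Both forms answer the same question; the trade-off is query simplicity and speed (star) versus storage and redundancy (snowflake).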
Major Steps in ETL Process
OLAP Operations
OLAP stands for Online Analytical Processing. It is a software technology that allows users to
analyze information from multiple database systems at the same time. It is based on a multidimensional data
model and allows the user to query multi-dimensional data (e.g., Delhi -> 2018 -> Sales data). OLAP
databases are divided into one or more cubes, and these cubes are known as hypercubes.
1. Drill down: In the drill-down operation, less detailed data is converted into highly detailed data. It can
be done by stepping down a concept hierarchy for a dimension or by introducing a new dimension.
In the cube given in the overview section, the drill-down operation is performed by moving down in the
concept hierarchy of the Time dimension (Quarter -> Month).
2. Roll up: It is just the opposite of the drill-down operation. It performs aggregation on the OLAP cube. It can
be done by climbing up a concept hierarchy for a dimension or by dimension reduction.
In the cube given in the overview section, the roll-up operation is performed by climbing up in the concept
hierarchy of the Location dimension (City -> Country).
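Both operations above can be sketched in Python as aggregations over cube cells; the city, month, and sales figures below are hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical cube cells: sales by (city, country, month, quarter).
cells = [
    {"city": "Delhi", "country": "India", "month": "Jan", "quarter": "Q1", "sales": 10},
    {"city": "Delhi", "country": "India", "month": "Feb", "quarter": "Q1", "sales": 20},
    {"city": "Mumbai", "country": "India", "month": "Jan", "quarter": "Q1", "sales": 5},
]

# Roll-up: climb the Location hierarchy (City -> Country) by aggregating.
by_country = defaultdict(int)
for c in cells:
    by_country[c["country"]] += c["sales"]
print(by_country["India"])  # 35

# Drill-down: step down the Time hierarchy (Quarter -> Month) for more detail.
by_month = defaultdict(int)
for c in cells:
    by_month[(c["quarter"], c["month"])] += c["sales"]
print(by_month[("Q1", "Jan")])  # 15
```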
3. Dice: It selects a sub-cube from the OLAP cube by selecting two or more dimensions. In the cube
given in the overview section, a sub-cube is selected by applying criteria on the following dimension:
Location = "Delhi" or "Kolkata"
4. Slice: It selects a single dimension from the OLAP cube, which results in the creation of a new sub-cube. In
the cube given in the overview section, slice is performed on the dimension Time = "Q1".
5. Pivot: It is also known as the rotation operation, as it rotates the current view to get a new view of the
representation. Performing a pivot on the sub-cube obtained after the slice operation gives a new view of it.
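The slice, dice, and pivot operations above can be sketched in Python over a small list of cube cells; the cities, items, and sales figures are hypothetical sample data.

```python
# Hypothetical cube cells: sales by location, time, and item.
cells = [
    {"city": "Delhi", "quarter": "Q1", "item": "Mobile", "sales": 10},
    {"city": "Delhi", "quarter": "Q2", "item": "Mobile", "sales": 12},
    {"city": "Kolkata", "quarter": "Q1", "item": "Modem", "sales": 7},
    {"city": "Mumbai", "quarter": "Q1", "item": "Mobile", "sales": 9},
]

# Slice: fix one dimension (Time = "Q1") to get a lower-dimensional sub-cube.
slice_q1 = [c for c in cells if c["quarter"] == "Q1"]

# Dice: select a sub-cube using criteria on two or more dimensions.
dice = [c for c in slice_q1 if c["city"] in ("Delhi", "Kolkata")]

# Pivot: rotate the view, e.g. from city-major to item-major grouping.
pivot = {}
for c in dice:
    pivot.setdefault(c["item"], {})[c["city"]] = c["sales"]

print(len(slice_q1))             # 3
print(pivot["Mobile"]["Delhi"])  # 10
```

Note that slice and dice select data, while pivot only rearranges how the same data is presented.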
Thank You!
([email protected])