
By - Shaik Yasir Ahmed

1. A data warehouse is a collection of data from various sources organized for analysis and reporting. It integrates data across an organization for decision making. 2. Data warehouses help businesses gain insights into areas like top customers, products, sales channels and the impact of promotions through analysis of historical data. 3. Data mining works with warehouse data to extract patterns and intelligence that help answer questions like predicting customer loyalty and behavior.

Uploaded by

venusuri
Copyright
© Attribution Non-Commercial (BY-NC)

By - Shaik Yasir Ahmed

Ralph Kimball –
In the simplest terms, a data warehouse can be defined as a collection of data marts.
- Data marts: subject-specific collections of data.
Bill Inmon –
A data warehouse is a “subject-oriented, integrated, time-variant, and nonvolatile
collection of data in support of management’s decision-making process.”

ERP
runs the business
- like how the tyres run the car
BI (reports, data mining, dashboards, KPIs)
helps you take business decisions based on your historical data
- like the steering, mirrors, brakes and dashboard, which help you run the car smoothly and reach the destination.
In what way does a data warehouse help a business?
Let’s say a producer wants to know ...

• Which are our lowest/highest margin customers?
• Who are my customers and what products are they buying?
• What is the most effective distribution channel?
• What product promotions have the biggest impact on revenue?
• Which customers are most likely to go to the competition?
• What impact will new products/services have on revenue and margins?
Data, data everywhere, yet ...
• I can’t find the data I need
  – data is scattered over the network
  – many versions, subtle differences
• I can’t get the data I need
  – need an expert to get the data
• I can’t understand the data I found
  – available data poorly documented
• I can’t use the data I found
  – results are unexpected
  – data needs to be transformed from one form to another
A single, complete and consistent store of data obtained from a variety of
different sources, made available to end users in a way they can
understand and use in a business context.

[Barry Devlin]

What are the users saying...
• Data should be integrated across the enterprise
• Summary data has a real value to the organization
• Historical data holds the key to understanding data over time
• What-if capabilities are required

A process of transforming data into information and
making it available to users in a timely enough manner
to make a difference

[Forrester Research, April 1996]

Data → Information
Data Warehousing -- It is a process
• A technique for assembling and managing data from various sources
for the purpose of answering business questions, thus enabling
decisions that were not previously possible
• A decision support database maintained separately from the
organization’s operational database

Data Mining works with Warehouse Data
Data Warehousing provides the Enterprise with a memory

Data Mining provides the Enterprise with intelligence

We want to know ...
• Given a database of 100,000 names, which persons are the least likely to default on their credit cards?
• Which types of transactions are likely to be fraudulent given the demographics and transactional history of a particular customer?
• If I raise the price of my product by Rs. 2, what is the effect on my ROI?
• If I offer only 2,500 airline miles as an incentive to purchase rather than 5,000, how many lost responses will result?
• If I emphasize ease-of-use of the product as opposed to its technical capabilities, what will be the net effect on my revenues?
• Which of my customers are likely to be the most loyal?

Data Mining helps to extract such information


Price comparison (cumulative, as built up across the slides)

Item                     (not named)   Oracle 10g           IBM DB2
Base Product             $25K          $40K                 $25K
Manageability            (included)    Tuning $3K           Performance Expert $10K
                                       Diagnostics $3K
                                       Partitioning $10K
Total so far             $25K          $56K                 $35K
Business Intelligence    (included)    OLAP $20K            DB2 OLAP $35K
                                       Mining $20K          DB2 Warehouse $75K
                                       BI Bundle $20K       Cube Views $9.5K
Total so far             $25K          $116K                $154.5K
High Availability                      Data Guard $116K     Recovery Expert $10K
Total so far             $25K          $232K                $164.5K
Multi-core                             $116K - $232K        $164.5K
Total                    $25K          $348K - $464K        $329K
Questions answered, in increasing order of benefit:
• What happened?
• Why did it happen?
• What happened, why and how?
• What will happen?
(Chart axes: Number of Users, Additional Benefit)
OLTP – Online Transaction Processing
OLAP – Online Analytical Processing
MOLAP – Multidimensional OLAP
ROLAP – Relational OLAP
HOLAP – Hybrid OLAP
Dimensions – De-normalized master tables
Attributes – Columns of dimensions
Hierarchies – Sequential order of attributes
Facts (Measure group) – Transaction tables in the DWH
Fact (Measures) – Numeric fields of a fact table
Cubes – Multidimensional storage of data
KPIs – Key performance indicators
Dashboards – Combination of reports, KPIs and charts
Data Marts – Subject-specific collections of data
SCDs – Slowly changing dimensions
Perspectives – Child cube
Data warehouse architecture (layers, top to bottom)
• Data Analysis: Reporting, OLAP, Data Mining
• Data Storage: Repository
• Data-Migration Middleware (Population Tools)
• Operational Data Sources

Components shown in the diagram
• OLTP
• Stage DB (optional)
• Data Marts
• OLAP cube (ROLAP, MOLAP)
• SSIS – Integration Services
• SSAS – Analysis Services
• SSRS – Reporting Services
OLTP vs. OLAP

1. OLTP: on-line transaction processing.
   OLAP: on-line analytical processing.
2. OLTP: day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
   OLAP: data analysis and decision making.
3. OLTP: the tables are in normalized form.
   OLAP: the tables are in de-normalized form.
4. OLTP: the storage objects are called tables, i.e., all the masters and the transactions are stored in tables.
   OLAP: the storage objects are called dimensions and facts, i.e., all the masters are dimensions and the transactions are facts.
5. OLTP: designed using data modeling.
   OLAP: designed using dimensional modeling.

OLAP is classified into two types: MOLAP and ROLAP.
Normalized Tables
Product: Prod_Id, Prod_Name, Base_Rate, Cat_Id
Category: Cat_Id, Cat_Name, Cat_Desc, Group_Id
Group: Group_Id, Group_Name, Group_Desc

De-Normalized Tables
Product_Dim: Prod_Id, Prod_Name, Base_Rate, Cat_Name, Cat_Desc, Group_Name, Group_Desc

Topics we will cover later
1. Types of Dimensions
2. Slowly changing Dimensions
3. Hierarchies
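The step from the normalized Product, Category and Group tables to the de-normalized Product_Dim can be sketched in Python. This is only an illustration: the column names follow the slide, while the sample rows and the helper `build_product_dim` are invented for the example.

```python
# Hypothetical sample rows; only the column names come from the slide.
products = [
    {"Prod_Id": 1, "Prod_Name": "Laptop", "Base_Rate": 900.0, "Cat_Id": 10},
    {"Prod_Id": 2, "Prod_Name": "Desk", "Base_Rate": 150.0, "Cat_Id": 20},
]
categories = {
    10: {"Cat_Name": "Computers", "Cat_Desc": "Portable PCs", "Group_Id": 100},
    20: {"Cat_Name": "Furniture", "Cat_Desc": "Office furniture", "Group_Id": 200},
}
groups = {
    100: {"Group_Name": "Electronics", "Group_Desc": "Electronic goods"},
    200: {"Group_Name": "Office", "Group_Desc": "Office supplies"},
}


def build_product_dim(products, categories, groups):
    """Flatten Product -> Category -> Group into one wide row per product."""
    dim_rows = []
    for prod in products:
        cat = categories[prod["Cat_Id"]]   # follow the Cat_Id reference
        grp = groups[cat["Group_Id"]]      # follow the Group_Id reference
        dim_rows.append({
            "Prod_Id": prod["Prod_Id"],
            "Prod_Name": prod["Prod_Name"],
            "Base_Rate": prod["Base_Rate"],
            "Cat_Name": cat["Cat_Name"],
            "Cat_Desc": cat["Cat_Desc"],
            "Group_Name": grp["Group_Name"],
            "Group_Desc": grp["Group_Desc"],
        })
    return dim_rows


product_dim = build_product_dim(products, categories, groups)
```

The category and group attributes are repeated on every product row, which is the storage cost of de-normalizing; in exchange, a query on Product_Dim needs no further joins.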
SalesOrderDetails (OLTP table)
Cust_Id, SalesPerson, Prod_Id, Order_Date, Booked_Date, Delivery_Date, Unit_Price, Qty, Tax, Created_By

SalesOrder_Fact
Reference keys of Dimensions: Cust_Id, Prod_Id, Order_Date, Delivery_Date
Numeric fields, called Facts or measures: Unit_Price, Qty, Total_Amount, Tax

Qty * Unit_Price + Tax = Total_Amount
Calculations are usually performed before the data is stored in OLAP.
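A minimal sketch of that pre-calculation, assuming the fact row is built from an OLTP order record during loading. The function name `to_fact_row` and the sample values are hypothetical; the formula is the one given on the slide.

```python
from decimal import Decimal

# Columns kept in SalesOrder_Fact: dimension keys plus numeric measures.
FACT_COLUMNS = ("Cust_Id", "Prod_Id", "Order_Date", "Delivery_Date",
                "Unit_Price", "Qty", "Tax")


def to_fact_row(order):
    """Build a fact row and store the pre-computed Total_Amount with it."""
    fact = {col: order[col] for col in FACT_COLUMNS}
    # Qty * Unit_Price + Tax = Total_Amount, computed once at load time
    fact["Total_Amount"] = order["Qty"] * order["Unit_Price"] + order["Tax"]
    return fact


# Hypothetical OLTP record; SalesPerson, Booked_Date and Created_By
# are operational columns that do not reach the fact table.
order = {
    "Cust_Id": 7, "SalesPerson": "A. Kumar", "Prod_Id": 1,
    "Order_Date": "2024-01-05", "Booked_Date": "2024-01-04",
    "Delivery_Date": "2024-01-09", "Unit_Price": Decimal("900.00"),
    "Qty": 2, "Tax": Decimal("90.00"), "Created_By": "clerk01",
}
fact_row = to_fact_row(order)
```

Storing Total_Amount in the fact table means reports can sum it directly instead of recomputing it for every query.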
STAR Schema

SalesOrder_Fact (centre): Cust_Id, Prod_Id, Order_Date, Delivery_Date, Org_Id, Unit_Price, Qty, Total_Amount, Tax

Dimensions linked to the fact:
Prod_Dim: Prod_Id, ………
Org_Dim: Org_Id, ………
Cust_Dim: Cust_Id, ………
Time_Dim: Date, Year, Month, ………
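A star schema is queried by joining the fact table to each dimension it references and aggregating the measures. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. Only Prod_Dim and Time_Dim are included to keep it short, and the Prod_Name column and all sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Prod_Dim (Prod_Id INTEGER PRIMARY KEY, Prod_Name TEXT);
CREATE TABLE Time_Dim ("Date" TEXT PRIMARY KEY, Year INTEGER, Month INTEGER);
CREATE TABLE SalesOrder_Fact (
    Prod_Id      INTEGER REFERENCES Prod_Dim (Prod_Id),
    Order_Date   TEXT    REFERENCES Time_Dim ("Date"),
    Qty          INTEGER,
    Total_Amount REAL
);
""")

# Hypothetical sample data
conn.executemany("INSERT INTO Prod_Dim VALUES (?, ?)",
                 [(1, "Laptop"), (2, "Desk")])
conn.executemany("INSERT INTO Time_Dim VALUES (?, ?, ?)",
                 [("2023-12-30", 2023, 12),
                  ("2024-01-05", 2024, 1),
                  ("2024-02-10", 2024, 2)])
conn.executemany("INSERT INTO SalesOrder_Fact VALUES (?, ?, ?, ?)",
                 [(1, "2023-12-30", 1, 945.0),
                  (1, "2024-01-05", 2, 1890.0),
                  (2, "2024-01-05", 1, 157.5),
                  (2, "2024-02-10", 2, 315.0)])

# Revenue per year and product: one join per dimension, then aggregate.
rows = conn.execute("""
    SELECT t.Year, p.Prod_Name, SUM(f.Total_Amount)
    FROM SalesOrder_Fact AS f
    JOIN Prod_Dim AS p ON p.Prod_Id = f.Prod_Id
    JOIN Time_Dim AS t ON t."Date" = f.Order_Date
    GROUP BY t.Year, p.Prod_Name
    ORDER BY t.Year, p.Prod_Name
""").fetchall()
```

Every dimension is one join away from the fact table, so adding Cust_Dim or Org_Dim to the query would only add one more `JOIN` clause each.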
Product_Dim: Prod_Id, Prod_Name, Base_Rate, Cat_Name, Cat_Desc, Group_Name, Group_Desc
SalesOrder_Fact: Cust_Id, Prod_Id, Order_Date, Delivery_Date, Unit_Price, Qty, Total_Amount, Tax
Star schema
1. Dimensions have a relation only with the Fact (de-normalized model).
2. One-to-many or one-to-one relations occur.
3. Performance is fast, but it requires more storage space.

Snowflake schema
1. A dimension has relations with tables other than the Fact (normalized model).
2. Used for many-to-many relations.
3. Performance is slower, but it requires less storage space.
