Lecture 13

1. A data warehouse stores large amounts of historical transaction data extracted from operational systems like ERP in a structured and integrated way to facilitate analysis. 2. It uses multi-dimensional models like star schemas to organize data by facts and dimensions. This allows for analysis across various metrics over time. 3. Tools like OLAP enable interactive analysis of the stored data through operations like slicing, dicing, pivoting, drill-down and roll-up.

Uploaded by

Rajpoot Baba
Lecture 13

Enterprise Systems
Email: [email protected]
Data Warehousing
➢ If a company wants to see the order transactions of an item for the last 5 to 10 years, it
needs to store the transaction entries for that period. Such huge volumes of data are bound to
slow the transaction system down. Data warehousing applications emerged to solve this: they
extract data from transaction systems (like ERP or CRM), clean it and store it.
➢ Storing data alone is not enough if it cannot provide the required information. Business
intelligence and OLAP applications help to analyze the data in multiple dimensions and provide
reports in different formats.

05/07/2023 2
Data warehouse
➢ A data warehouse sources data from various operational systems, then organizes and stores it in a form that is
standardized, structured, clean and integrated. It has tools to extract, transform and load (ETL)
data. Data can come from various sources, such as ERP, CRM and old (legacy) systems, and can be accessed by a
business intelligence system or other analytical tools.
➢ Four important characteristics of a data warehouse:
- Subject-oriented: A data warehouse is organized around subjects such as sales, product and customer.
It focuses on modelling and analysis of data for decision makers in a particular area.
- Integrated: A data warehouse is constructed by integrating multiple sources such as ERP, CRM and old
systems.
- Time-variant: It provides information from a historical perspective, e.g. the past 5-10 years.
- Non-volatile: Data once recorded is not updated in place; it changes only when the next refresh loads new data.
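
The extract, transform and load step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a real ETL tool: the two source lists (`erp_rows`, `crm_rows`), their field names and the `clean` helper are hypothetical, and a plain list stands in for the warehouse.

```python
# Hypothetical rows extracted from two operational systems.
# Each system uses its own field names and formats.
erp_rows = [
    {"cust": " alice ", "amount": "120.50", "date": "2023-01-05"},
    {"cust": "BOB", "amount": "80", "date": "2023-01-06"},
]
crm_rows = [
    {"customer_name": "Alice", "amount": 40.0, "date": "2023-01-07"},
    {"customer_name": "", "amount": 15.0, "date": "2023-01-08"},  # missing name
]


def clean(name, amount, date, source):
    """Transform one source row into a single standardized record."""
    name = name.strip().title()
    if not name:
        return None  # inconsistent row: rejected before loading
    return {"customer": name, "amount": float(amount), "date": date, "source": source}


def etl(erp, crm):
    """Extract from both sources, transform each row, load the clean ones."""
    warehouse = []
    for row in erp:
        rec = clean(row["cust"], row["amount"], row["date"], "ERP")
        if rec is not None:
            warehouse.append(rec)
    for row in crm:
        rec = clean(row["customer_name"], row["amount"], row["date"], "CRM")
        if rec is not None:
            warehouse.append(rec)
    return warehouse


warehouse = etl(erp_rows, crm_rows)
for rec in warehouse:
    print(rec)
```

After loading, every record has the same fields and formats regardless of which system it came from, which is what "integrated" and "clean" mean in practice.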

Advantages and Disadvantages of Data warehouse
➢ Advantages:
1. Clean data: The data in the data warehouse is clean and consistent. Before data is loaded into the data
warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis.
2. Long-term storage without slowing down operational systems:
A data warehouse can store data for a long time. Keeping huge volumes of data in transactional systems (like ERP or CRM) may hurt
the performance of those systems. The data can instead be stored safely in the data warehouse for extended periods of time.
3. Supplies analytics systems with the right data:
Data warehouses feed business intelligence (BI) and analytic applications. Based on the data available in the
data warehouse, these systems can produce different reports that show performance.
➢ Disadvantages:
1. Data latency: Because data must be extracted, transformed and loaded into the warehouse, there is always a delay
between when data becomes available in the transaction system and when it is available in the warehouse.
2. Storing unstructured data: Unstructured data (e.g. video files, audio files) is difficult to store in a data warehouse.
3. Non-volatile: Data cannot be changed or updated frequently.

Data Warehouse Components
1. Multi-dimensional Models:
Multi-dimensional data models are needed to build a data warehouse. Two multi-dimensional models are mainly
used in data warehouse modeling:
1.1 Data cube
A data cube enables data to be modeled and viewed in multiple dimensions. It is defined by dimensions and measures.
The dimensions are the perspectives or entities with respect to which an organization keeps records. For example, a shop may
create a sales data warehouse to keep records of the store’s sales for the dimensions date, product, and location. These
dimensions allow the store to keep track of things such as the monthly sales of products and the locations at which the
products were sold. Each dimension has a table related to it, called a dimension table, which describes the dimension
further. For example, a dimension table for product may contain the attributes product_id, name, color, size.
Measures are the numerical values related to the dimensions, such as units sold.
Example: Please follow the example mentioned in Lecture for data cube.
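
As a small supplement to the lecture example, a data cube over the dimensions date, product and location can be sketched as a dictionary keyed by one value per dimension. The sales figures, place names and the `total_by` helper below are hypothetical and chosen only for illustration; the measure is units sold.

```python
from collections import defaultdict

# Hypothetical facts: (date, product, location, units sold).
sales = [
    ("2023-01", "pen", "North", 10),
    ("2023-01", "pen", "South", 4),
    ("2023-01", "book", "North", 3),
    ("2023-02", "pen", "North", 7),
    ("2023-02", "book", "South", 5),
]

# The cube: each cell is addressed by (date, product, location)
# and holds the measure for that combination.
cube = defaultdict(int)
for date, product, location, units in sales:
    cube[(date, product, location)] += units


def total_by(cube, axis):
    """Sum the measure over all dimensions except the one at index `axis`."""
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[axis]] += units
    return dict(totals)


monthly = total_by(cube, 0)      # view by date
by_product = total_by(cube, 1)   # view by product
by_location = total_by(cube, 2)  # view by location

print(monthly)      # {'2023-01': 17, '2023-02': 12}
print(by_product)   # {'pen': 21, 'book': 8}
print(by_location)  # {'North': 20, 'South': 9}
```

The same cells answer three different questions depending on which dimension is kept, which is the sense in which a cube lets data be viewed in multiple dimensions.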

Data Cube

1.2 Star Schema:
The star schema is a frequently used multi-dimensional model for relational databases. This
database schema classifies data into two groups: facts and dimensions.
A star schema is composed of a single fact table and one dimension table for each
dimension. The fact table is related to each dimension table in a many-to-one
relationship: each foreign key in the central fact table references the primary key of one
dimension table.
➢ A user can get quick answers to questions like: Whom have we sold to?
What have we sold? How much have we sold? When did we sell it? These are answered by querying the fact table together with its dimension tables.
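
A star schema like the one described above can be tried out with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration: the table names (`fact_sales`, `dim_customer`, `dim_product`, `dim_date`), the columns and the sample rows are all hypothetical and are not taken from the lecture's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One central fact table plus one dimension table per dimension.
# REFERENCES declares the many-to-one link; SQLite only enforces it
# when "PRAGMA foreign_keys = ON" is set, which this sketch does not need.
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date     (date_id     INTEGER PRIMARY KEY, day  TEXT);
CREATE TABLE fact_sales (
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    product_id  INTEGER REFERENCES dim_product(product_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    quantity    INTEGER,
    amount      REAL
);
INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO dim_product  VALUES (1, 'pen'), (2, 'book');
INSERT INTO dim_date     VALUES (1, '2023-01-05'), (2, '2023-01-06');
INSERT INTO fact_sales   VALUES (1, 1, 1, 10, 15.0),
                                (2, 2, 1, 1, 12.0),
                                (1, 2, 2, 2, 24.0);
""")

# Whom did we sell to, what, when, and how much?
rows = conn.execute("""
    SELECT c.name, p.name, d.day, f.quantity, f.amount
    FROM fact_sales f
    JOIN dim_customer c ON c.customer_id = f.customer_id
    JOIN dim_product  p ON p.product_id  = f.product_id
    JOIN dim_date     d ON d.date_id     = f.date_id
    ORDER BY d.day, c.name
""").fetchall()

for row in rows:
    print(row)
```

The fact table holds only keys and numeric measures; the descriptive "who", "what" and "when" come from joining each key to its dimension table.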

Star Schema example for Sales record

2. Data Mart
A data mart is a subset of a data warehouse that supports the requirements of users associated with one
department or business function.
- Small
- Departmentally structured

3. Operational Data Store (ODS)
An ODS consolidates data from multiple source systems and provides a near-real-time
integrated view of current data. Its purpose is to provide integrated data for operational purposes,
and it supports add, change and delete operations. An ODS may contain 30 to 60 days of information,
while a data warehouse typically contains years of data.
Advantage:
Overwrite function (existing data can be changed in place)
An ODS is designed to quickly perform relatively simple queries on small amounts of data (such as
finding the status of a customer order). An ODS is similar to short-term memory in that it stores
only very recent information.
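
The difference between the overwrite behaviour of an ODS and the historical record kept by a data warehouse can be illustrated with a toy sketch. The structures and the `record_status` helper below are hypothetical: a dictionary stands in for the ODS and an append-only list for the warehouse.

```python
ods = {}        # order_id -> current status, overwritten in place
warehouse = []  # append-only history of every status change


def record_status(order_id, status, day):
    """Apply one status change to both stores."""
    ods[order_id] = status                     # ODS: old value is replaced
    warehouse.append((order_id, status, day))  # warehouse: history is kept


record_status(101, "placed", "2023-05-01")
record_status(101, "shipped", "2023-05-03")
record_status(102, "placed", "2023-05-04")

# Simple operational query: what is the status of order 101 right now?
print(ods[101])  # shipped

# Historical query: every status order 101 has ever had.
print([status for oid, status, day in warehouse if oid == 101])  # ['placed', 'shipped']
```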

Data Warehousing Architecture

1. Data comes from multiple external sources and operational databases.
These sources can be ERP or CRM systems, etc.
2. This data is extracted, transformed, cleaned and loaded into a data warehouse.
3. One data warehouse can be the source of data for multiple data marts.
4. From the data warehouse, the data is brought to OLAP servers or data analysis tools as per
the different reporting requirements.
5. Finally, different data analysis tools support different sorts of analysis of the data.
So a data warehouse follows a three-tier architecture, i.e. data source layer, data warehouse layer, and
data analysis layer.

Online Analytical Processing (OLAP)
➢ OLAP enables a user to easily and selectively extract and view data from different points of view. OLAP data is
stored in a multi-dimensional database, which provides access to data for analysis and quickly answers
multi-dimensional analytical queries.
OLAP Functionalities
1. Roll up and roll down: Roll up (or drill up) summarizes data by climbing up a hierarchy, e.g., sales of each office
can be rolled up to city, which can be further rolled up to state, region and country. Roll down (or drill down) is the
reverse of roll up, i.e., moving from a higher-level summary to a lower-level summary or to detailed data.
2. Slice and Dice: Slice: Selects a single value on one dimension, so the data in the warehouse can be viewed one part at a time. For
example, someone who wants to see only food sales data across different regions can run a quick OLAP query to
get the information.
Dice: Selects values on two or more dimensions, which results in the creation of a sub-cube.
3. Pivot: Rotates the axes to give a different view of the same dimensions.
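
The OLAP operations listed above can be sketched in plain Python over a small list of facts. Everything here is an illustrative assumption: the sample rows, the column layout and the helper names (`roll_up`, `slice_`, `dice`, `pivot`) are hypothetical, and a real OLAP server would run these operations against a cube or a database rather than a list.

```python
from collections import defaultdict

# Hypothetical facts: (city, state, category, quarter, sales).
facts = [
    ("Austin", "Texas", "food", "Q1", 100),
    ("Austin", "Texas", "toys", "Q1", 40),
    ("Dallas", "Texas", "food", "Q1", 60),
    ("Dallas", "Texas", "food", "Q2", 70),
    ("Miami", "Florida", "food", "Q1", 50),
    ("Miami", "Florida", "toys", "Q2", 30),
]
CITY, STATE, CATEGORY, QUARTER, SALES = range(5)


def roll_up(rows, level):
    """Summarize sales at one level of the hierarchy (e.g. CITY or STATE)."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[level]] += r[SALES]
    return dict(totals)


def slice_(rows, dim, value):
    """Slice: keep only the rows where one dimension has a single value."""
    return [r for r in rows if r[dim] == value]


def dice(rows, criteria):
    """Dice: keep rows matching allowed values on two or more dimensions."""
    return [r for r in rows if all(r[d] in ok for d, ok in criteria.items())]


def pivot(rows, row_dim, col_dim):
    """Pivot: lay the totals out as a table of row_dim by col_dim."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_dim]][r[col_dim]] += r[SALES]
    return {k: dict(v) for k, v in table.items()}


by_city = roll_up(facts, CITY)    # detailed level (drill down)
by_state = roll_up(facts, STATE)  # summarized level (roll up)
food = slice_(facts, CATEGORY, "food")
texas_q1 = dice(facts, {STATE: {"Texas"}, QUARTER: {"Q1"}})
cat_by_quarter = pivot(facts, CATEGORY, QUARTER)
quarter_by_cat = pivot(facts, QUARTER, CATEGORY)  # same data, axes rotated

print(by_state)        # {'Texas': 270, 'Florida': 80}
print(cat_by_quarter)  # {'food': {'Q1': 210, 'Q2': 70}, 'toys': {'Q1': 40, 'Q2': 30}}
```

Rolling up from `by_city` to `by_state` loses detail but keeps the total; pivoting changes only the layout, never the numbers.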

Slicing

DICE

Pivot

Types of OLAP
1. MOLAP
Multidimensional Online Analytical Processing (MOLAP): Uses a multidimensional database to
store and access data. It usually requires dedicated data access tools. Data is stored in multi-dimensional
arrays, and the dimensions are used to index the arrays. Because storage is array-based, cells can be
accessed directly through the array data structure.
2. ROLAP
Relational Online Analytical Processing (ROLAP): Uses a relational database as the OLAP
environment and typically involves a star schema to provide the multi-dimensional capabilities.
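
The storage difference between the two types can be illustrated with a toy comparison. This is a simplified sketch, not how any particular OLAP product works: the products, regions and lookup helpers are hypothetical, a nested list stands in for MOLAP array storage, and a scan over a list of rows stands in for a ROLAP query against a relational fact table.

```python
products = ["pen", "book"]
regions = ["North", "South"]

# Relational-style storage: one row per fact.
rows = [("pen", "North", 10), ("pen", "South", 4), ("book", "North", 3)]

# MOLAP-style storage: each dimension value maps to an array position,
# and the measure lives in a 2-D array indexed by those positions.
p_idx = {p: i for i, p in enumerate(products)}
r_idx = {r: i for i, r in enumerate(regions)}
array = [[0] * len(regions) for _ in products]
for product, region, units in rows:
    array[p_idx[product]][r_idx[region]] += units


def molap_lookup(product, region):
    """Direct cell access: the dimensions index straight into the array."""
    return array[p_idx[product]][r_idx[region]]


def rolap_lookup(product, region):
    """Relational-style access: filter the rows, then aggregate."""
    return sum(u for p, r, u in rows if p == product and r == region)


print(molap_lookup("pen", "South"), rolap_lookup("pen", "South"))    # 4 4
print(molap_lookup("book", "South"), rolap_lookup("book", "South"))  # 0 0
```

Both lookups return the same answers; the array reserves a cell for every combination (including the empty book/South cell), while the relational form stores only the facts that exist.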
