FD Unit 2
DATA WAREHOUSE
Introduction to Data Warehouse
A data warehouse is a relational database management system (RDBMS) construct designed for query and
analysis rather than for transaction processing.
The data warehouse environment includes an extraction, transformation, and loading (ETL) solution, an
online analytical processing (OLAP) engine, client analysis tools, and other applications that handle the
process of gathering information and delivering it to business users.
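The ETL-then-analyse flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the source rows, the `sales_fact` table name, and the `transform` and `load` helpers are all assumptions made for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical rows "extracted" from an operational source system.
source_rows = [
    {"order_id": 1, "region": " north ", "amount": "120.50"},
    {"order_id": 2, "region": "South", "amount": "80.00"},
    {"order_id": 3, "region": "NORTH", "amount": "45.25"},
]

def transform(row):
    """Clean one source row: normalise the region name and parse the amount."""
    return (row["order_id"], row["region"].strip().lower(), float(row["amount"]))

def load(conn, rows):
    """Load transformed rows into a warehouse table (table name is illustrative)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse database
load(conn, [transform(r) for r in source_rows])

# An analytical (OLAP-style) query: total sales per region.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales_fact GROUP BY region")
)
print(totals)
```

The transform step is where inconsistent source values (stray spaces, mixed case, numbers stored as text) are cleaned before loading, so the analytical query sees one spelling per region.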
Data Warehouse
o It is a database designed for analytical tasks, using data from various applications.
o It supports a relatively small number of clients with relatively long interactions.
o It includes current and historical data to provide a historical perspective of information.
o Its usage is read-intensive.
o It contains a few large tables.
Characteristics of Data Warehouse
Subject-Oriented
A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a
particular subject.
Integrated
A data warehouse integrates various heterogeneous data sources like RDBMS, flat files, and online
transaction records. It requires performing data cleaning and integration during data warehousing to
ensure consistency in naming conventions, attribute types, etc., among different data sources.
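As a rough sketch of what integration involves, the example below merges a flat file and a set of RDBMS-style rows that use different column names and gender codes into one agreed format. The column names, the `GENDER_CODES` mapping, and the `unify` helper are hypothetical and chosen only for illustration.

```python
import csv
import io

# Source A: a flat file (CSV) with its own column names and string-typed values.
flat_file = io.StringIO("cust_id,cust_name,gender\n101,Asha,F\n102,Ravi,M\n")

# Source B: rows as they might come from an RDBMS, with different names and codes.
rdbms_rows = [{"CustomerID": 201, "FullName": "Meena", "Sex": "female"}]

# Assumed mapping from each source's gender encoding to one warehouse code.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}

def unify(customer_id, name, gender):
    """Return one record in the warehouse's agreed naming and typing convention."""
    return {
        "customer_id": int(customer_id),
        "name": name.strip(),
        "gender": GENDER_CODES[gender.strip().lower()],
    }

integrated = [
    unify(r["cust_id"], r["cust_name"], r["gender"])
    for r in csv.DictReader(flat_file)
]
integrated += [
    unify(r["CustomerID"], r["FullName"], r["Sex"]) for r in rdbms_rows
]
print(integrated)
```

After this step every record has the same attribute names and types, regardless of which source it came from.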
Time-Variant
Historical information is kept in a data warehouse. For example, one can retrieve data from 3
months, 6 months, 12 months, or even earlier from a data warehouse. This contrasts
with a transaction system, where often only the most current data is kept.
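The idea of looking back over different time windows can be sketched as a query against dated snapshots. The `balance_history` table, its rows, and the `history_since` helper are made up for this example; the reference date is fixed so the result does not depend on when the code is run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balance_history (account TEXT, snapshot_date TEXT, balance REAL)"
)
# The warehouse keeps every dated snapshot, not only the latest balance.
conn.executemany(
    "INSERT INTO balance_history VALUES (?, ?, ?)",
    [
        ("A1", "2023-01-31", 500.0),
        ("A1", "2023-06-30", 650.0),
        ("A1", "2023-10-31", 700.0),
        ("A1", "2023-12-31", 820.0),
    ],
)

def history_since(conn, as_of, months):
    """Return snapshots from the `months` months before `as_of` (ISO dates)."""
    return conn.execute(
        "SELECT snapshot_date, balance FROM balance_history "
        "WHERE snapshot_date > date(?, ?) AND snapshot_date <= ? "
        "ORDER BY snapshot_date",
        (as_of, f"-{months} months", as_of),
    ).fetchall()

last_3 = history_since(conn, "2024-01-01", 3)
last_12 = history_since(conn, "2024-01-01", 12)
print(last_3)
print(len(last_12))
```

A transaction system would typically hold only the most recent balance for the account, so neither window could be answered from it.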
Non-Volatile
Once data is in the data warehouse, it will not change. So, historical data in a data
warehouse should never be altered.
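One possible way to enforce this property, shown here only as a sketch, is to reject updates and deletes at the database level. The example assumes SQLite and a hypothetical `sales_history` table; other warehouse platforms would use their own mechanisms, such as permissions or append-only storage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_history (sale_id INTEGER, amount REAL);
-- Triggers that abort any attempt to change or remove loaded rows.
CREATE TRIGGER sales_no_update BEFORE UPDATE ON sales_history
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
CREATE TRIGGER sales_no_delete BEFORE DELETE ON sales_history
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
""")

# Loading new rows is still allowed.
conn.execute("INSERT INTO sales_history VALUES (1, 99.0)")

try:
    conn.execute("UPDATE sales_history SET amount = 0 WHERE sale_id = 1")
    update_blocked = False
except sqlite3.Error:
    update_blocked = True  # the trigger rejected the change

amount = conn.execute(
    "SELECT amount FROM sales_history WHERE sale_id = 1"
).fetchone()[0]
print(update_blocked, amount)
```

New data can still be appended during each load, but rows that are already in the table keep their original values.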
Data Warehousing Objectives
1) Business users: Business users need access to a data warehouse to view summarised historical data.
Because these users may lack technical expertise, the information should be presented to them in a
simple format.
2) Archive historical data: A data warehouse must store historical, time-variant data, which is
then used for multiple purposes.
3) Make strategic choices: Strategies can be formulated from the information in the data warehouse.
Thus, data warehouses aid in the process of making strategic choices.
4) For data quality and consistency: By combining data from several sources into one location, the user
can efficiently work to improve the uniformity and consistency of data.
5) Quick response time: The data warehouse must be ready to handle occasional unforeseen loads while
still responding quickly.
Data Warehouse Requirement
Advantages of Data Warehouse
Data Warehouse Models
1. Enterprise warehouse
2. Data mart
3. Virtual warehouse
Enterprise warehouse:
• An enterprise warehouse collects all of the information about subjects spanning the entire
organization.
• It provides corporate-wide data integration, usually from one or more operational systems or
external information providers, and is cross-functional in scope.
• It typically contains detailed data as well as summarized data, and can range in size from a few
gigabytes to hundreds of gigabytes, terabytes, or beyond.
• An enterprise data warehouse may be implemented on traditional mainframes, computer
superservers, or parallel architecture platforms. It requires extensive business modelling and may
take years to design and build.
Data mart: