Lecture6 Three Tier Architecture 11052016

The document outlines a three-tier data warehouse architecture consisting of a bottom tier (warehouse database server), a middle tier (OLAP server), and a top tier (front-end client layer). It discusses various data warehouse models including enterprise warehouses, data marts, and virtual warehouses, along with their characteristics and purposes. Additionally, it highlights the role of back-end tools for data extraction, cleaning, transformation, and the importance of a metadata repository.

Uploaded by

Krik Sha

Three-Tier Data Warehouse Architecture

 The bottom tier is a warehouse database server
that is almost always a relational database
system.
 Back-end tools and utilities are used to feed data
into the bottom tier from operational databases or
other external sources (such as customer profile
information provided by external consultants).
 These tools and utilities perform data extraction,
cleaning, and transformation.
 The data are extracted using application program
interfaces known as gateways.
 The middle tier is an OLAP server that is
typically implemented using either
 (i) A relational OLAP (ROLAP) model, that is,
an extended relational DBMS that maps
operations on multidimensional data to standard
relational operations.
 (ii) A multidimensional OLAP (MOLAP) model,
that is, a special-purpose server that directly
implements multidimensional data and
operations.
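
The difference between the two models can be sketched in a few lines. This is only an illustration, not how any particular OLAP server works: the `sales` table, its columns, and the sample rows are made up. The ROLAP function expresses a roll-up over quarters as an ordinary SQL GROUP BY, while the MOLAP function keeps the same cells in an array addressed by dimension position and sums along one axis.

```python
import sqlite3

# Hypothetical fact rows: (city, quarter, amount). Illustrative values only.
ROWS = [
    ("Delhi", "Q1", 100.0),
    ("Delhi", "Q2", 150.0),
    ("Mumbai", "Q1", 200.0),
    ("Mumbai", "Q2", 50.0),
]


def rolap_rollup_by_city(rows):
    """ROLAP style: the roll-up over 'quarter' becomes a relational GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (city TEXT, quarter TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city"
    ).fetchall()
    conn.close()
    return dict(result)


def molap_rollup_by_city(rows):
    """MOLAP style: cells live in a dense array indexed by dimension position."""
    cities = sorted({r[0] for r in rows})
    quarters = sorted({r[1] for r in rows})
    cube = [[0.0] * len(quarters) for _ in cities]
    for city, quarter, amount in rows:
        cube[cities.index(city)][quarters.index(quarter)] += amount
    # Rolling up over 'quarter' sums along the second axis of the array.
    return {city: sum(cube[i]) for i, city in enumerate(cities)}


print(rolap_rollup_by_city(ROWS))
print(molap_rollup_by_city(ROWS))
```

Both functions return the same totals per city; only the storage and the way the operation is carried out differ.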

 The top tier is a front-end client layer, which
contains query and reporting tools, analysis tools,
and/or data mining tools (e.g., trend analysis,
prediction, and so on).

Metadata Repository
 Metadata are the data that define warehouse objects. The repository holds
the following kinds:
 Description of the structure of the warehouse
   schema, views, dimensions, hierarchies, derived data definitions, data mart
  locations and contents
 Operational metadata
   data lineage (history of migrated data and transformation path), currency
  of data (active, archived, or purged), monitoring information (warehouse
  usage statistics, error reports, audit trails)
 The algorithms used for summarization
 The mapping from the operational environment to the data warehouse
 Data related to system performance
   warehouse schema, view, and derived data definitions
 Business data
   business terms and definitions, ownership of data, charging policies
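
As a rough illustration of what one repository entry might hold, the sketch below models a single warehouse table's metadata. The class and field names (`TableMetadata`, `source_system`, `business_owner`, and so on) are assumptions chosen for this example, not part of any standard. It covers only data lineage, currency of data, and ownership from the list above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Currency(Enum):
    """Currency of data, using the three states named in the list above."""
    ACTIVE = "active"
    ARCHIVED = "archived"
    PURGED = "purged"


@dataclass
class TableMetadata:
    """Hypothetical metadata entry for one warehouse table."""
    name: str
    source_system: str
    transformation_path: list = field(default_factory=list)
    currency: Currency = Currency.ACTIVE
    business_owner: str = ""

    def record_step(self, step):
        """Append one transformation step to the lineage."""
        self.transformation_path.append(step)

    def lineage(self):
        """Render the path from the operational source to the warehouse table."""
        return " -> ".join([self.source_system, *self.transformation_path, self.name])


meta = TableMetadata("sales_fact", "orders_db", business_owner="Sales")
meta.record_step("clean")
meta.record_step("summarize_by_day")
print(meta.lineage())  # orders_db -> clean -> summarize_by_day -> sales_fact
```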
Data Warehouse Back-End Tools and
Utilities
 Data extraction:
   get data from multiple, heterogeneous, and external sources
 Data cleaning:
   detect errors in the data and rectify them when possible
 Data transformation:
   convert data from legacy or host format to warehouse format
 Load:
   sort, summarize, consolidate, compute views, check integrity, and build
  indices and partitions
 Refresh:
   propagate the updates from the data sources to the warehouse
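
The five steps above can be sketched as a small pipeline. This is a toy example under stated assumptions: the sources are in-memory lists, the "legacy format" is an amount stored as a string of cents, and the warehouse is a plain dictionary keyed by customer. All function and field names are hypothetical.

```python
def extract(sources):
    """Gather raw records from several (here: in-memory) sources."""
    return [record for source in sources for record in source]


def clean(records):
    """Drop records with a missing amount and trim stray whitespace in names."""
    cleaned = []
    for rec in records:
        if rec.get("amount") is None:
            continue
        cleaned.append({**rec, "customer": rec["customer"].strip()})
    return cleaned


def transform(records):
    """Convert the assumed legacy format (string of cents) to a numeric amount."""
    return [
        {"customer": rec["customer"], "amount": int(rec["amount"]) / 100}
        for rec in records
    ]


def load(warehouse, records):
    """Consolidate amounts per customer and return the table sorted by key."""
    for rec in records:
        warehouse[rec["customer"]] = warehouse.get(rec["customer"], 0.0) + rec["amount"]
    return dict(sorted(warehouse.items()))


def refresh(warehouse, new_sources):
    """Propagate new source records into a copy of the warehouse."""
    return load(dict(warehouse), transform(clean(extract(new_sources))))


crm = [{"customer": " Ann ", "amount": "1250"}, {"customer": "Bob", "amount": None}]
erp = [{"customer": "Ann", "amount": "750"}]

warehouse = load({}, transform(clean(extract([crm, erp]))))
print(warehouse)  # {'Ann': 20.0}
print(refresh(warehouse, [[{"customer": "Bob", "amount": "500"}]]))
```

Real back-end tools do far more at each step (for example, integrity checks and index building during load), but the order of the stages is the same.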
 From the architecture point of view, there are
three data warehouse models:
I. The Enterprise Warehouse
II. The Data Mart
III. The Virtual Warehouse.

Enterprise warehouse
 An enterprise warehouse collects all of the
information about subjects spanning the entire
organization. It provides corporate-wide data
integration, usually from one or more operational
systems or external information providers, and is
cross-functional in scope.
 It typically contains detailed data as well as
summarized data, and can range in size from a
few gigabytes to hundreds of gigabytes,
terabytes, or beyond.
 An enterprise data warehouse may be
implemented on traditional mainframes, computer
superservers, or parallel architecture platforms.
Data Mart
 A data mart contains a subset of corporate-wide data that
is of value to a specific group of users.
 Data marts are usually implemented on low-cost
departmental servers that are UNIX/LINUX- or Windows-
based.
 The implementation cycle of a data mart is more likely to
be measured in weeks rather than months or years.
 Depending on the source of data, data marts can be
categorized as independent or dependent.
 Independent data marts are sourced from data captured
from one or more operational systems or external
information providers, or from data generated locally within
a particular department or geographic area.
 Dependent data marts are sourced directly from enterprise
data warehouses.
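
A dependent data mart can be pictured as one department's slice copied out of the enterprise warehouse. The sketch below uses Python's built-in `sqlite3` module purely for illustration; the `enterprise_sales` table, its columns, and the `<department>_mart` naming are assumptions made for this example.

```python
import sqlite3


def make_enterprise_warehouse():
    """Build a tiny in-memory stand-in for the enterprise warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE enterprise_sales (department TEXT, product TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO enterprise_sales VALUES (?, ?, ?)",
        [
            ("marketing", "ads", 300.0),
            ("marketing", "events", 120.0),
            ("finance", "audit", 900.0),
        ],
    )
    return conn


def build_dependent_mart(conn, department):
    """Copy one department's rows from the enterprise table into its own mart."""
    if not department.isidentifier():
        raise ValueError("department must be a plain identifier")
    mart = f"{department}_mart"
    conn.execute(f"DROP TABLE IF EXISTS {mart}")
    conn.execute(f"CREATE TABLE {mart} (product TEXT, amount REAL)")
    conn.execute(
        f"INSERT INTO {mart} "
        "SELECT product, amount FROM enterprise_sales WHERE department = ?",
        (department,),
    )
    return mart


conn = make_enterprise_warehouse()
mart = build_dependent_mart(conn, "marketing")
print(conn.execute(f"SELECT * FROM {mart} ORDER BY product").fetchall())
```

An independent data mart would instead be loaded straight from operational systems or local data, without passing through `enterprise_sales`.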
Virtual Data Warehouse:
 A virtual warehouse is a set of views over
operational databases. For efficient query
processing, only some of the possible summary
views may be materialized.
 A virtual warehouse is easy to build but requires
excess capacity on operational database servers.
 It is popular because it enables businesses to
access and analyze data directly from operational systems.
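
The idea of a virtual warehouse can be sketched with views over an operational table. The example uses Python's built-in `sqlite3` module; the `orders` table and view names are made up. SQLite has no native materialized views, so materialization is imitated here by copying the view's result into a table, which is an assumption of this sketch rather than a general technique.

```python
import sqlite3


def make_operational_db():
    """A tiny in-memory stand-in for an operational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("north", 10.0), ("north", 5.0), ("south", 7.0)],
    )
    return conn


def define_virtual_warehouse(conn):
    """The 'warehouse' is just a summary view over the operational table."""
    conn.execute(
        "CREATE VIEW sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
    )


def materialize(conn, view):
    """Imitate a materialized view by snapshotting the view into a table."""
    conn.execute(f"DROP TABLE IF EXISTS mat_{view}")
    conn.execute(f"CREATE TABLE mat_{view} AS SELECT * FROM {view}")


conn = make_operational_db()
define_virtual_warehouse(conn)
materialize(conn, "sales_by_region")

# A new operational row is visible through the view at once,
# but the materialized snapshot stays stale until it is rebuilt.
conn.execute("INSERT INTO orders VALUES ('south', 3.0)")
print(conn.execute("SELECT * FROM sales_by_region ORDER BY region").fetchall())
print(conn.execute("SELECT * FROM mat_sales_by_region ORDER BY region").fetchall())
```

Every query against the plain view runs on the operational server, which is why a virtual warehouse needs spare capacity there.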

Distributed Data Warehouse
 Distributed data warehouses are those in which
certain components of the data warehouse are
distributed across a number of different physical
databases.
 It usually involves redundant data and, as a
consequence, more complex loading and
updating processes.

Data Warehouse Manager
 The warehouse manager is the system
component that performs all the operations
necessary to support the warehouse
management process.
 Operations performed by warehouse manager:
I. Analyze the data to perform consistency checks.
II. Create indexes, business views, and partition views against
the base data.
III. Generate new aggregations that may be required.
IV. Update all existing aggregations.
V. Transform the data into a starflake schema.
VI. Generate the summaries.
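
A few of these operations can be sketched with Python's built-in `sqlite3` module. The table names (`product_dim`, `sales_fact`, `sales_by_product`) and the sample rows are hypothetical, and the consistency check shown is just one possible check (fact rows that reference a missing dimension row).

```python
import sqlite3


def make_base_data():
    """Build a tiny in-memory fact table and dimension table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product_dim (product_id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE sales_fact (product_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO product_dim VALUES (?, ?)", [(1, "pen"), (2, "ink")])
    conn.executemany(
        "INSERT INTO sales_fact VALUES (?, ?)",
        [(1, 2.0), (1, 3.0), (2, 4.0), (9, 1.0)],
    )
    return conn


def find_orphan_facts(conn):
    """Consistency check: fact rows whose product_id has no dimension row."""
    return conn.execute(
        "SELECT f.rowid FROM sales_fact f "
        "LEFT JOIN product_dim d ON f.product_id = d.product_id "
        "WHERE d.product_id IS NULL"
    ).fetchall()


def create_indexes(conn):
    """Create an index against the base data to speed up lookups by product."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_sales_product ON sales_fact (product_id)"
    )


def rebuild_aggregation(conn):
    """Generate (or update by rebuilding) the per-product aggregation."""
    conn.execute("DROP TABLE IF EXISTS sales_by_product")
    conn.execute(
        "CREATE TABLE sales_by_product AS "
        "SELECT product_id, SUM(amount) AS total FROM sales_fact GROUP BY product_id"
    )


conn = make_base_data()
print(find_orphan_facts(conn))  # the row with product_id 9 has no dimension entry
create_indexes(conn)
rebuild_aggregation(conn)
print(conn.execute("SELECT * FROM sales_by_product ORDER BY product_id").fetchall())
```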
