CCS341_Data Warehousing_Unit 5 Notes
UNIT-V
SYSTEM & PROCESS MANAGERS
Data Warehousing System Managers: System Configuration Manager - System Scheduling
Manager - System Event Manager - System Database Manager - System Backup Recovery
Manager - Data Warehousing Process Managers: Load Manager - Warehouse Manager -
Query Manager - Tuning - Testing
Process managers:
Process managers are responsible for maintaining the flow of data both into and out of the
data warehouse. There are three different types of process managers −
Load manager
Warehouse manager
Query manager
DATA WAREHOUSE TUNING:
A data warehouse keeps evolving, and it is impossible to predict what queries users will submit in the
future. This makes a data warehouse system difficult to tune. This section discusses how to tune
different aspects of a data warehouse, such as performance, data load, and queries.
Difficulties in Data Warehouse Tuning
Tuning a data warehouse is difficult for the following reasons −
A data warehouse is dynamic; it never remains constant.
It is very difficult to predict what queries users will submit in the future.
Business requirements change with time.
Users and their profiles keep changing.
The user can switch from one group to another.
The data load on the warehouse also changes with time.
Performance Assessment
Here is a list of objective measures of performance −
Average query response time
Scan rates
Time used per query per day
Memory usage per process
I/O throughput rates
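As an illustration only, some of these measures could be derived from a query log along the following lines. This is a minimal sketch: the `QueryRun` record, its field names, and the sample figures are assumptions made for this example and do not correspond to any particular warehouse product.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class QueryRun:
    """One executed query. All fields are hypothetical log columns."""
    elapsed_s: float    # wall-clock response time in seconds
    rows_scanned: int   # rows read while answering the query
    bytes_read: int     # bytes read from storage


def summarize(runs):
    """Compute a few objective performance measures over a list of runs."""
    total_time = sum(r.elapsed_s for r in runs)
    return {
        "avg_response_s": mean(r.elapsed_s for r in runs),
        "scan_rate_rows_per_s": sum(r.rows_scanned for r in runs) / total_time,
        "io_throughput_bytes_per_s": sum(r.bytes_read for r in runs) / total_time,
    }


# Made-up sample data, used only to show the calculation.
runs = [
    QueryRun(elapsed_s=2.0, rows_scanned=1_000_000, bytes_read=80_000_000),
    QueryRun(elapsed_s=6.0, rows_scanned=3_000_000, bytes_read=240_000_000),
]
print(summarize(runs))
```

In practice these figures would come from the database's own monitoring views or from the query manager's logs rather than from hand-built records.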
The following points should be kept in mind.
The measures must be specified in the service level agreement (SLA).
There is no point in tuning response times that are already better than required.
It is essential to have realistic expectations when assessing performance.
It is equally essential that users have feasible expectations.
Aggregations and views should be used to hide the complexity of the system from the user.
A user may still write a query that the system was not tuned for.
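The point about not tuning what already meets the SLA can be sketched as a simple comparison. The measure names and limits below are hypothetical, and the sketch assumes that every measure is of the "lower is better" kind (for example a response time or a load duration).

```python
def needs_tuning(measured, sla):
    """Return the names of measures whose measured value exceeds the SLA limit.

    Measures that already meet the SLA are left out, since tuning them
    further brings no benefit. A measure with no recorded value is skipped.
    """
    return sorted(
        name
        for name, limit in sla.items()
        if name in measured and measured[name] > limit
    )


# Hypothetical SLA limits and measurements, for illustration only.
sla = {"avg_response_s": 5.0, "nightly_load_h": 4.0}
measured = {"avg_response_s": 3.2, "nightly_load_h": 4.5}

print(needs_tuning(measured, sla))  # ['nightly_load_h']
```

Here the average response time is already within its limit, so only the nightly load is flagged for tuning.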
Data Load Tuning
Data load is a critical part of overnight processing. It is the entry point into the system, and
nothing else can run until the load is complete.
Note − If the data is delayed in transfer or arrives late, the entire system is badly affected.
It is therefore very important to tune the data load first.
DATA WAREHOUSE TESTING:
Testing is essential to ensure that a data warehouse system works correctly and efficiently.
There are three basic levels of testing performed on a data warehouse −
Unit testing
Integration testing
System testing
Unit Testing
In unit testing, each component is tested separately.
Each module (procedure, program, SQL script, or Unix shell script) is tested.
This test is performed by the developer.
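As a rough illustration of unit testing a single SQL script in isolation, the sketch below runs a small aggregation against an in-memory SQLite database using Python's standard `sqlite3` and `unittest` modules. The `sales` and `daily_sales` tables and the `build_daily_sales` function are hypothetical and stand in for whatever component the developer is testing.

```python
import sqlite3
import unittest

# Hypothetical component under test: a small aggregation script.
DAILY_SALES_SQL = """
CREATE TABLE daily_sales AS
SELECT sale_date, SUM(amount) AS total
FROM sales
GROUP BY sale_date
"""


def build_daily_sales(conn):
    """Run the aggregation script on the given connection."""
    conn.execute(DAILY_SALES_SQL)


class DailySalesTest(unittest.TestCase):
    def test_totals_per_day(self):
        # An in-memory database keeps the test isolated from other components.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO sales VALUES (?, ?)",
            [("2024-01-01", 10.0), ("2024-01-01", 5.0), ("2024-01-02", 7.5)],
        )
        build_daily_sales(conn)
        rows = conn.execute(
            "SELECT sale_date, total FROM daily_sales ORDER BY sale_date"
        ).fetchall()
        self.assertEqual(rows, [("2024-01-01", 15.0), ("2024-01-02", 7.5)])
        conn.close()


# Run the test case programmatically and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DailySalesTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The same pattern (small known input, one component, expected output) applies to procedures and shell scripts, although the test harness would differ.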