System & Process Managers
Syllabus
Data Warehousing System Managers : System Configuration Manager - System Scheduling Manager
- System Event Manager - System Database Manager - System Backup Recovery Manager - Data
Warehousing Process Managers : Load Manager - Warehouse Manager - Query Manager - Tuning -
Testing.
Contents
5.1 Data Warehousing System Managers
5.2 System Configuration Manager
5.3 System Scheduling Manager
5.4 System Event Manager
5.5 System Database Manager
5.6 System Backup Recovery Manager
5.7 Data Warehousing Process Managers
5.8 Load Manager
5.9 Warehouse Manager
5.10 Query Manager
5.11 Tuning
5.12 Testing
5.13 Two Marks Questions with Answers
Data Warehousing : System & Process Managers
5.1 Data Warehousing System Managers
Data warehousing system managers are professionals responsible for overseeing the
design, implementation and maintenance of data warehousing systems within an
organization. They play a crucial role in ensuring that the data warehousing
infrastructure meets the business requirements and supports the analytical needs of the
organization.
Here are some key responsibilities of data warehousing system managers :
1. System design : Data warehousing system managers collaborate with business
stakeholders, data architects and database administrators to design the structure and
architecture of the data warehousing system. They analyze business requirements,
identify data sources and determine how to transform and integrate the data into the
data warehouse.
2. System implementation : Once the design is finalized, data warehousing system
managers oversee the implementation of the data warehousing system. They work
with a team of developers and administrators to ensure that the system is built
according to specifications, data is loaded correctly and the necessary ETL
(Extract, Transform, Load) processes are in place.
3. Performance optimization : Data warehousing system managers monitor the
performance of the data warehousing system and identify areas for improvement.
They optimize data retrieval, query execution and data loading processes to
enhance system performance. They may also tune indexing, partitioning and
caching strategies to improve query response times.
4. Data quality and governance : Ensuring the quality and integrity of data is a
critical aspect of data warehousing. System managers establish data quality
standards, implement data cleansing and validation processes and enforce data
governance policies. They work closely with data stewards to define and enforce
data standards, data access controls and data privacy regulations.
5. Security and compliance : Data warehousing system managers are responsible for
maintaining the security of the data warehousing environment. They implement
security measures such as access controls, encryption and data masking to protect
sensitive information. They also ensure compliance with relevant data protection
regulations, industry standards and internal policies.
6. Monitoring and maintenance : System managers continuously monitor the health
and performance of the data warehousing system. They proactively identify and
resolve issues, perform regular system maintenance activities and apply patches
and upgrades as necessary. They also conduct backup and recovery procedures to
safeguard data against loss or corruption.
7. User support and training : Data warehousing system managers provide support
to users of the data warehouse. They assist in troubleshooting issues, resolving
data-related problems and providing guidance on how to leverage the system for
reporting and analysis. They may also conduct training sessions to educate users on
the capabilities and best practices of the data warehousing system.
Overall, data warehousing system managers play a crucial role in ensuring that the
organization's data warehousing system is well-designed, efficiently implemented and
effectively maintained to support the organization's data analytics and reporting needs.
TECHNICAL PUBLICATIONS® - an up-thrust for knowledge
5.2 System Configuration Manager
System configuration manager is responsible for managing the configuration and
deployment of the data warehousing system. This role involves ensuring that the
hardware, software and network components of the data warehouse infrastructure are
properly configured and optimized to support the data warehousing environment.
Here are some key responsibilities of a system configuration manager in a data
warehouse :
1. Hardware configuration : The system configuration manager works with
hardware specialists to determine the appropriate hardware specifications for the
data warehouse servers, storage systems and networking components. They ensure
that the hardware is configured correctly, including setting up RAID
configurations, networking protocols and server clusters for high availability and
performance.
2. Software configuration : Data warehousing systems rely on a variety of software
components, such as database management systems (e.g., Oracle, SQL Server),
ETL tools and reporting and analytics software. The system configuration manager
ensures that these software components are properly installed, configured and
optimized for the data warehouse environment. This includes setting up database
[...]
3. Performance optimization : System configuration managers work closely with database
administrators and system administrators to monitor and optimize the performance
of the data warehouse system. They identify bottlenecks and adjust configurations
to improve query response times, data loading speeds and overall system throughput.
This may involve adjusting memory allocation, disk I/O settings and network
configurations.
4. Scalability and capacity planning : Data warehouses often need to handle large
volumes of data and support a growing number of users and queries over time. The
system configuration manager is responsible for planning and implementing
strategies to scale the data warehouse system as needed. This includes capacity
planning for storage, processing power and network bandwidth to accommodate
future growth and ensure optimal system performance.
5. Security and access control : Data warehousing systems contain sensitive and
valuable data, so security is a critical concern. The system configuration manager
works with security professionals to implement proper access controls,
authentication mechanisms and encryption measures to protect data within the data
warehouse. They also ensure that system configurations comply with relevant
security policies, regulations and best practices.
6. Change management : As the data warehouse evolves and grows, changes to the
system configuration are inevitable. The system configuration manager oversees
the change management process, ensuring that changes to hardware, software and
network configurations are properly documented, tested and implemented. They
coordinate with stakeholders, such as data architects, database administrators and
system administrators, to minimize the impact of configuration changes on the data
warehouse environment.
7. Documentation and documentation management : System configuration
managers maintain comprehensive documentation of the data warehouse system
configuration, including hardware specifications, software versions, system settings
and network configurations. They ensure that this documentation is up to date and
easily accessible to relevant stakeholders. They may also implement a
configuration management system to track changes, versions and dependencies of
system configurations.
• By effectively managing the configuration of the data warehouse system, the system
configuration manager helps ensure that the data warehouse operates efficiently,
performs optimally and meets the organization's data analytics and reporting
requirements.
5.3 System Scheduling Manager
• System scheduling manager is responsible for managing and overseeing the scheduling
of various processes and tasks within the data warehousing system. This role involves
designing, implementing and maintaining the scheduling framework that coordinates the
execution of data loading, transformation, indexing, backup and other relevant activities
in the data warehouse.
• Here are the key responsibilities of a system scheduling manager in a data
warehouse :
1. Job scheduling : The system scheduling manager is responsible for creating and
managing job schedules within the data warehouse. They define the timing,
frequency and dependencies of various jobs, ensuring that data extraction,
transformation and loading (ETL) processes are scheduled efficiently. They
coordinate the scheduling of jobs across different systems, servers and components
to optimize resource utilization and minimize conflicts.
2. Workflow management : Data warehousing systems often involve complex
workflows comprising multiple interdependent tasks and processes. The system
scheduling manager designs and manages these workflows, ensuring that tasks are
executed in the correct sequence and with the appropriate dependencies. They
define and configure workflow templates or job orchestration tools to streamline
and automate the execution of these workflows.
3. Monitoring and alerting : The system scheduling manager monitors the execution
of scheduled jobs and workflows within the data warehouse. They establish
monitoring mechanisms to track the status, progress and performance of scheduled
tasks. They also configure alerts and notifications to proactively identify and
address job failures, delays or other issues that may impact the data warehouse
operations.
4. Performance optimization : Efficient scheduling can significantly impact the
performance of a data warehouse. The system scheduling manager analyzes the
execution patterns and resource utilization of scheduled jobs and workflows to
identify areas for optimization. They may adjust job priorities, reorganize
schedules, or introduce parallel processing to improve overall system performance
and minimize job runtime.
5. Dependency management : In a data warehouse, certain tasks or processes may
have dependencies on specific data availability, completion of upstream tasks, or
external events. The system scheduling manager manages these dependencies and
ensures that jobs are scheduled in the correct order to meet these dependencies.
They may implement mechanisms to pause, resume, or reschedule jobs based on
the availability of required data or successful completion of prerequisite tasks.
6. Maintenance and upgrades : The system scheduling manager coordinates
scheduling activities during system maintenance windows or upgrades. They plan
and execute scheduling changes to accommodate maintenance tasks such as
backups, database reorganizations, or hardware maintenance. They work closely
with database administrators and other relevant stakeholders to minimize disruptions
to ongoing data warehouse operations.
7. Documentation and reporting : The system scheduling manager maintains
documentation related to job schedules, workflows, dependencies and scheduling
policies. They ensure that this documentation is up to date and readily available for
reference. They also generate reports and metrics on job execution, schedule
adherence and overall system performance, providing insights to stakeholders and
enabling continuous improvement.
• Overall, the system scheduling manager plays a critical role in optimizing the execution
and coordination of tasks within a data warehousing system. By effectively managing
job schedules, dependencies and workflows, they ensure the timely and accurate
availability of data for analysis and reporting, supporting the organization's data-driven
decision-making processes.
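The dependency management and parallel processing points above can be illustrated with a small sketch. It is not taken from any particular scheduling product: the job names and the `plan_batches` helper are hypothetical, and the ordering relies on the standard-library `graphlib` module (Python 3.9 or later). Jobs that land in the same batch have no unmet dependencies, so a scheduler could run them in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical nightly jobs: each key maps to the set of jobs it must wait for.
JOB_DEPENDENCIES = {
    "extract_sales": set(),
    "extract_customers": set(),
    "transform_sales": {"extract_sales", "extract_customers"},
    "load_fact_sales": {"transform_sales"},
    "rebuild_indexes": {"load_fact_sales"},
    "backup_warehouse": {"rebuild_indexes"},
}


def plan_batches(dependencies):
    """Group jobs into ordered batches.

    Every job in a batch has all of its prerequisites in earlier batches,
    so the jobs inside one batch could be executed in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock successors
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(plan_batches(JOB_DEPENDENCIES), start=1):
        print(f"batch {number}: {', '.join(batch)}")
```

In this sketch both extraction jobs fall into the first batch, while the backup runs last because it depends, indirectly, on every other job.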
5.4 System Event Manager
• System event manager is responsible for monitoring and managing system events that
occur within the data warehousing system. This role involves overseeing event logging,
alerting and incident management to ensure the smooth operation of the data warehouse
and timely resolution of any issues or anomalies.
• Here are the key responsibilities of a system event manager in a data warehouse :
1. Event monitoring and logging : The system event manager is responsible for
implementing event monitoring mechanisms within the data warehouse system.
They configure the system to capture and log events related to various components,
such as servers, databases, storage systems, ETL processes and other relevant
elements. This includes monitoring system logs, error messages and performance
metrics.
2. Alerting and notification : The system event manager sets up alerting
mechanisms to proactively notify relevant stakeholders about critical events or
anomalies. They configure thresholds and rules to trigger alerts based on
predefined conditions. These alerts may be sent via email, SMS, or other
communication channels, ensuring that the right individuals or teams are promptly
notified to take appropriate action.
3. Incident management : When system events indicate issues or anomalies, the
system event manager is responsible for incident management. They investigate the
events, assess their impact and initiate appropriate actions to resolve the incidents.
This may involve troubleshooting, coordinating with relevant teams (such as
database administrators, system administrators or ETL developers) and escalating
issues as necessary.
4. Problem identification and root cause analysis : The system event manager
performs root cause analysis to identify underlying issues contributing to recurring
or critical events. They analyze event patterns, system logs and performance
metrics to understand the cause of problems and implement preventive measures.
This includes identifying performance bottlenecks, configuration issues, or data
inconsistencies that may lead to system events.
5. Performance monitoring and optimization : System events can provide insights
into the performance of the data warehouse system. The system event manager
monitors performance-related events, such as query execution times, resource
utilization or data loading bottlenecks. They analyze these events to identify areas
for optimization and work with relevant teams to implement performance tuning
measures.
6. Documentation and reporting : The system event manager maintains
documentation related to system events, incident management processes and
resolutions. They ensure that event logs, alerts and incident reports are properly
documented and readily accessible for reference and future analysis. They generate
reports on event trends, incident resolution times and system stability to provide
insights to stakeholders and support decision-making.
7. Continuous improvement : The system event manager plays a proactive role in
continuously improving the data warehouse system. They analyze historical event
data, identify opportunities for automation and implement proactive measures to
prevent recurring incidents. They may also contribute to the implementation of
monitoring tools, event correlation mechanisms, or predictive analytics models to
enhance the system's stability and performance.
• By effectively managing system events within the data warehouse, the system event
manager helps ensure the reliability, stability and optimal performance of the data
warehousing environment. They facilitate timely incident resolution, provide insights
for performance optimization and contribute to the overall operational efficiency of the
data warehouse system.
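Threshold-based alerting, as described in the alerting and notification responsibility above, can be sketched in a few lines. Everything here is an illustrative assumption rather than a real monitoring tool's API: the `AlertRule` class, the metric names, the threshold values and the channel names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    metric: str        # name of the monitored metric
    threshold: float   # alert when the observed value exceeds this
    severity: str      # e.g. "warning" or "critical"
    channel: str       # e.g. "email" or "sms"


# Hypothetical rules; real thresholds depend on the installation.
RULES = [
    AlertRule("disk_used_pct", 80.0, "warning", "email"),
    AlertRule("disk_used_pct", 95.0, "critical", "sms"),
    AlertRule("etl_runtime_min", 120.0, "warning", "email"),
]


def evaluate(event, rules=RULES):
    """Return (severity, channel, message) for every rule the event violates."""
    alerts = []
    for rule in rules:
        if event["metric"] == rule.metric and event["value"] > rule.threshold:
            message = (
                f"{event['source']}: {rule.metric}={event['value']} "
                f"exceeds {rule.threshold}"
            )
            alerts.append((rule.severity, rule.channel, message))
    return alerts


if __name__ == "__main__":
    sample = {"source": "dw-db-01", "metric": "disk_used_pct", "value": 97.0}
    for severity, channel, message in evaluate(sample):
        print(f"[{severity}] via {channel}: {message}")
```

Keeping the rules as data rather than code means thresholds can be reviewed and adjusted without touching the evaluation logic, which suits the documentation and continuous improvement duties listed above.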
5.5 System Database Manager
• System database manager is responsible for overseeing the management and
administration of the databases within the data warehousing system. This role involves
ensuring the availability, integrity, security and performance of the database
infrastructure that supports the data warehouse.
• Here are the key responsibilities of a system database manager in a data
warehouse :
1. Database design and architecture : The system database manager collaborates
with data architects and system designers to define the database design and
architecture for the data warehouse. They work on the logical and physical design
of the databases, including table structures, indexes, partitioning and data
distribution strategies to optimize query performance and data retrieval.
2. Database installation and configuration : The system database manager is
responsible for installing and configuring the Database Management System
(DBMS) software required for the data warehouse. They ensure that the DBMS is
properly set up, including configuring storage parameters, security settings,
memory allocation and network connectivity.
3. Database security and access control : Data warehouses often contain sensitive
and valuable data. The system database manager is responsible for implementing
security measures to protect the database environment. This includes setting up
access controls, authentication mechanisms and encryption for data at rest and in
transit. They also ensure compliance with relevant data protection regulations and
internal security policies.
4. Database performance optimization : The system database manager monitors the
performance of the database system and takes measures to optimize its
performance. They analyze database performance metrics, identify bottlenecks and
tune database configurations, indexing strategies, query execution plans and
caching mechanisms to improve overall system performance and response times.
5. Backup and recovery : Data warehouses store critical business data and ensuring
its availability is paramount. The system database manager establishes backup and
recovery strategies for the database system. They schedule regular backups,
implement disaster recovery mechanisms and test the restore processes to ensure
the integrity and recoverability of data in case of system failures or data loss.
6. Database monitoring and maintenance : The system database manager monitors
the health and availability of the database system. They perform regular system
maintenance tasks such as applying patches, updates and bug fixes to the DBMS
software. They also monitor database storage, file systems and logs to proactively
identify and resolve issues related to database growth, performance degradation, or
system errors.
7. Database capacity planning : Data warehouses often handle large volumes of
data. The system database manager performs capacity planning to ensure that the
database system can accommodate the data growth and analytical requirements of
the organization. They analyze storage requirements, resource utilization and
system performance trends to predict future capacity needs and recommend
infrastructure upgrades or scaling strategies.
8. Data governance and compliance : The system database manager works closely
with data governance teams to ensure compliance with data governance policies,
data quality standards and regulatory requirements. They participate in data
governance initiatives, establish data governance controls within the database
environment and enforce data access controls, data retention policies and data
privacy regulations.
• Overall, the system database manager plays a crucial role in managing and maintaining
the databases that underpin the data warehouse. By ensuring database availability,
performance, security and compliance, they contribute to the overall success of the data
warehousing system and enable efficient and reliable data analysis and reporting.
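The capacity planning responsibility above mentions using trends to predict future capacity needs. One very simple way to do this, shown here only as a sketch, is to assume that storage keeps growing at the average monthly rate seen so far. The function name `months_until_full` and the sample figures are hypothetical, and real planning would normally also account for seasonality, retention policies and compression.

```python
import math


def months_until_full(monthly_sizes_gb, capacity_gb):
    """Estimate how many months remain before storage is exhausted.

    Assumes growth continues at the average monthly rate observed in
    monthly_sizes_gb (oldest measurement first, one value per month).
    Returns None when no growth was observed.
    """
    if len(monthly_sizes_gb) < 2:
        raise ValueError("need at least two observations")
    intervals = len(monthly_sizes_gb) - 1
    growth_per_month = (monthly_sizes_gb[-1] - monthly_sizes_gb[0]) / intervals
    if growth_per_month <= 0:
        return None  # flat or shrinking usage: nothing to project
    remaining_gb = capacity_gb - monthly_sizes_gb[-1]
    return max(0, math.ceil(remaining_gb / growth_per_month))


if __name__ == "__main__":
    history = [400, 450, 500, 550]  # hypothetical database size in GB
    print(months_until_full(history, capacity_gb=1000))
```

With the sample history the average growth is 50 GB per month and 450 GB remain, so the estimate is nine months.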
5.6 System Backup Recovery Manager
• System backup recovery manager is responsible for managing and overseeing the
backup and recovery processes of the data warehousing system. This role involves
designing and implementing backup and recovery strategies, ensuring data integrity and
minimizing downtime in the event of system failures or data loss.
• Here are the key responsibilities of a system backup recovery manager in a data
warehouse :
1. Backup strategy and planning : The system backup recovery manager develops
and implements a comprehensive backup strategy for the data warehouse system.
They assess the criticality and recovery objectives of different data components
and determine appropriate backup methods and frequencies. This includes
establishing full backups, incremental backups, differential backups or a
combination thereof to meet the Recovery Point Objectives (RPO) and Recovery
Time Objectives (RTO) of the organization.
2. Backup execution and verification : The system backup recovery manager
oversees the execution of regular backups for the data warehouse. They ensure that
backup schedules are adhered to and monitor the completion and success of backup
jobs. They also conduct regular backup verification checks to validate the integrity
of the backup data and ensure it is recoverable when needed.
3. Recovery strategy and planning : The system backup recovery manager develops
a recovery strategy for the data warehouse system. They identify potential failure
scenarios, such as hardware failures, software corruption, or data loss and define
appropriate recovery procedures and priorities. They establish recovery points and
recovery methods to restore the data warehouse to a consistent and usable state in
the event of a disaster or system failure.
4. Recovery execution and testing : In the event of a failure or data loss, the system
backup recovery manager leads the recovery process. They initiate the recovery
procedures, restore data from backups and coordinate with relevant teams, such as
database administrators or system administrators, to bring the data warehouse
system back online. They conduct recovery testing exercises periodically to
validate the effectiveness and reliability of the recovery procedures.
5. Backup storage and retention : The system backup recovery manager manages
the storage and retention of backup data. They determine appropriate storage
solutions, such as disk-based backup, tape backup, or cloud-based backup, based
on organizational requirements and cost considerations. They establish retention
policies to retain backups for a specified period, ensuring compliance with data
governance policies, regulatory requirements and business needs.
6. Disaster recovery planning : The system backup recovery manager participates in
the development and implementation of the data warehouse's disaster recovery
plan. They collaborate with relevant stakeholders to identify potential risks, define
recovery objectives, establish recovery site options and outline procedures for
recovering the data warehouse in the event of a major disaster or disruption.
7. Monitoring and reporting : The system backup recovery manager monitors
backup and recovery activities, tracks key metrics and generates reports on backup
success rates, recovery times and other performance indicators. They provide
regular status updates to management and stakeholders, ensuring transparency and
awareness of the backup and recovery processes.
8. Continuous improvement : The system backup recovery manager continually
assesses and improves the backup and recovery processes. They stay updated with
advancements in backup and recovery technologies, evaluate new tools and
methodologies and implement improvements to enhance the efficiency, reliability
and speed of backup and recovery operations.
• By effectively managing the backup and recovery processes, the system backup
recovery manager helps ensure the availability and integrity of the data warehouse
system. They minimize the risk of data loss, enable quick recovery from system failures
and support the organization's data-driven decision-making processes.
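To make the difference between full, differential and incremental backups concrete, the sketch below works out which backups would have to be restored, and in what order, to recover the warehouse as of a given day. It rests on stated assumptions: a differential backup holds every change since the last full backup, and an incremental backup holds the changes since the previous backup of any kind. Backup products define these terms slightly differently, and the `Backup` class and `restore_chain` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int    # day number on which the backup was taken
    kind: str   # "full", "differential" or "incremental"


def restore_chain(backups, target_day):
    """Return the backups to restore, in order, to recover the state as of
    the latest backup taken on or before target_day."""
    usable = sorted((b for b in backups if b.day <= target_day), key=lambda b: b.day)
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available on or before the target day")
    base = fulls[-1]
    later = [b for b in usable if b.day > base.day]
    chain = [base]
    # Assumption: a differential contains all changes since the last full
    # backup, so only the most recent differential is needed.
    differentials = [b for b in later if b.kind == "differential"]
    if differentials:
        chain.append(differentials[-1])
    # Assumption: an incremental contains the changes since the previous
    # backup of any kind, so every incremental after the last full or
    # differential in the chain must be applied in sequence.
    start_day = chain[-1].day
    chain.extend(b for b in later if b.kind == "incremental" and b.day > start_day)
    return chain


if __name__ == "__main__":
    schedule = [
        Backup(0, "full"), Backup(1, "incremental"), Backup(2, "incremental"),
        Backup(3, "differential"), Backup(4, "incremental"), Backup(5, "incremental"),
    ]
    for backup in restore_chain(schedule, target_day=5):
        print(backup.day, backup.kind)
```

The length of the chain is one input to the Recovery Time Objective: more frequent full or differential backups shorten the restore sequence at the cost of more backup storage and a longer backup window.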
5.7 Data Warehousing Process Managers
• In a data warehouse environment, there are several key process managers involved in
managing the data warehousing processes. These managers oversee various stages of the
data warehousing lifecycle, from data extraction and transformation to data loading and
reporting. Here are some of the key process managers in a data warehouse :
1. Extraction manager : The extraction manager is responsible for managing the
process of extracting data from various source systems or external data providers.
They coordinate the identification of relevant data sources, establish data extraction
methods and schedules and ensure the timely and accurate retrieval of data from
source systems into the data warehouse.
2. Transformation manager : The transformation manager oversees the data
transformation process, which involves cleaning, filtering, aggregating and
integrating the extracted data to align it with the data warehouse schema and
business rules. They design and implement data transformation logic, co-ordinate
the execution of transformation processes and ensure data quality and consistency
throughout the transformation stage.
3. Loading manager : The loading manager is responsible for managing the data
loading process into the data warehouse. They define and implement loading
procedures, establish data loading schedules and priorities and co-ordinate the
movement of transformed data into the appropriate tables or data structures within
the data warehouse. They also monitor and optimize the loading process to ensure
efficient and reliable data ingestion.
4. Metadata manager : The metadata manager oversees the management of
metadata within the data warehouse. They define and maintain metadata standards,
coordinate the capture and storage of metadata related to data sources, data
transformations, data models and data lineage. They ensure that metadata is
accurate, up to date and readily accessible for data analysis, reporting and
governance purposes.
5. Security manager : The security manager is responsible for managing the security
of the data warehouse environment. They define and implement access controls,
authentication mechanisms and data privacy measures to protect sensitive data
within the data warehouse. They collaborate with stakeholders to establish security
policies, conduct security audits and ensure compliance with relevant regulations
and organizational requirements.
6. Reporting manager : The reporting manager oversees the process of generating
reports and delivering analytical insights from the data warehouse. They work
closely with business users to understand their reporting requirements, design and
develop reporting solutions and ensure the availability and accuracy of data for
reporting purposes. They may also be responsible for managing report scheduling,
distribution and performance optimization.
7. Data quality manager : The data quality manager is responsible for ensuring the
integrity and quality of data within the data warehouse. They define and enforce
data quality standards, implement data quality checks and validation processes and
establish data cleansing and data enrichment strategies. They monitor data quality
metrics, identify data quality issues and collaborate with data stewards and data
owners to address data quality problems.
8. Change management manager : The change management manager oversees the
process of implementing changes to the data warehouse environment. They
coordinate the planning, testing and deployment of changes, such as schema
modifications, ETL process enhancements, or infrastructure upgrades. They ensure
that changes are properly documented, communicated and implemented,
minimizing disruptions to ongoing data warehouse operations.
• These process managers play crucial roles in managing the various stages and
components of the data warehousing process. They collaborate with technical teams,
business users and stakeholders to ensure the successful implementation, operation and
maintenance of the data warehouse system, supporting the organization's data-driven
decision-making and analytical requirements.
• Process managers are accountable for maintaining the flow of data both into and out of
the data warehouse. There are three different types of process managers :
1. Load manager 2. Warehouse manager 3. Query manager.
5.8 Load Manager
• The actions necessary to extract and load the data into the database are carried out by the
load manager. A load manager's size and complexity vary depending on the specific
solutions used by each data warehouse.
• Load manager is responsible for overseeing the process of loading data into a data
warehouse. Their primary focus is on managing the ETL (Extract, Transform, Load)
processes and ensuring the successful integration of data from various sources into the
data warehouse. Here are some key responsibilities and tasks of a load manager in a data
warehouse :
1. ETL process management : The load manager oversees the ETL processes, which
involve extracting data from source systems, transforming it to conform to the data
warehouse schema and loading it into the data warehouse. They ensure that the
ETL processes run smoothly, efficiently and meet the defined schedules.
2. Data mapping and transformation : The load manager collaborates with data
analysts and data engineers to define the mapping rules and transformations
required to convert the source data into the target data warehouse format. They
ensure that data transformations, such as data cleansing, aggregation, or
consolidation, are accurately applied during the loading process.
3. Data loading optimization : The load manager works closely with the database
administrators and data engineers to optimize the data loading process. They
monitor system performance, identify bottlenecks and fine-tune the loading
operations for optimal efficiency, including optimizing data transfer rates and
minimizing load times.
4. Error handling and data quality : Managing data quality is a critical aspect of
the load manager's role. They implement error handling mechanisms to capture and
resolve data loading errors, perform data quality checks and ensure the integrity
and accuracy of the loaded data.
5. Metadata management : The load manager is responsible for managing metadata
associated with the data loading process. They maintain documentation of data
sources, data mappings, transformation rules and other relevant information to
censure a clear understanding of the data lineage and its transformations within the
data warchouse.
Data validation and testing : The load manager performs data validation and
testing to ensure the accuracy and consistency of the loaded data. They verify the
loaded data against predefined business rules, conduct data reconciliation and
ensure that the data warehouse meets the required quality standards.
7. Monitoring and performance reporting : The load manager monitors the data
loading processes and generates performance reports and metrics. They track data
volumes, loading times, success rates and error rates to evaluate the effectiveness
and efficiency of the loading operations.
8. Collaboration and communication : The load manager collaborates with various
stakeholders, including data architects, data analysts, business users and IT teams.
They work closely with these stakeholders to understand their data requirements,
communicate progress and issues and ensure that the data loading processes align
with the business needs.
• Overall, the load manager in a data warehouse plays a crucial role in managing the ETL
processes, ensuring the successful integration of data from multiple sources into the data
warehouse and maintaining data quality and integrity throughout the loading operations.
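The error handling and monitoring responsibilities above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the `LoadReport` class, the `validate` rules (an id must be present, an amount must be non-negative) and the sample rows are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LoadReport:
    """Metrics a load manager might track for one load run (hypothetical)."""
    loaded: int = 0
    rejected: int = 0
    errors: list = field(default_factory=list)

    @property
    def error_rate(self) -> float:
        total = self.loaded + self.rejected
        return self.rejected / total if total else 0.0


def validate(row: dict) -> None:
    """Assumed business rules: id must be present, amount must be non-negative."""
    if row.get("id") is None:
        raise ValueError("missing id")
    if float(row["amount"]) < 0:
        raise ValueError("negative amount")


def load_rows(rows, target: list) -> LoadReport:
    """Append valid rows to the target; capture invalid rows instead of aborting."""
    report = LoadReport()
    for row in rows:
        try:
            validate(row)
        except (ValueError, KeyError, TypeError) as exc:
            report.rejected += 1
            report.errors.append((row, str(exc)))
        else:
            target.append(row)
            report.loaded += 1
    return report


source_rows = [
    {"id": 1, "amount": "10.50"},
    {"id": None, "amount": "3.00"},   # rejected: missing id
    {"id": 3, "amount": "-4.00"},     # rejected: negative amount
    {"id": 4, "amount": "7.25"},
]
warehouse_table = []
report = load_rows(source_rows, warehouse_table)
print(report.loaded, report.rejected, report.error_rate)
```

Rejected rows are kept together with the reason for rejection, so they can be corrected and reloaded later rather than silently lost.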
5.8.1 Load Manager Architecture
[Figure : load manager comprising a controlling process, stored procedures, a temporary
data store, a copy management tool and a fast loader, feeding the warehouse structure]
Fig. 5.8.1 Load manager architecture
• The load manager carries out the following tasks :
1. Extract information from the source system.
2. Fast-load the extracted data into a temporary data store.
3. Execute basic transformations to create a structure that resembles the data
warehouse.
• Fig. 5.8.1 shows the load manager architecture.
Extract data from source
• The information is taken from operational databases or third-party information sources.
The application programmes used to extract data are called gateways. A gateway is
supported by the underlying DBMS and enables the client programme to generate SQL
that will be run at a server. Examples of gateways include Open Database Connectivity
(ODBC) and Java Database Connectivity (JDBC).
Fast load
1. The data must be put into the warehouse as soon as feasible in order to reduce the
overall load window.
2. Data processing speed is impacted by transformations.
3. It is more efficient to conduct transformations and checks after loading the data
into a relational database.
4. Since gateway technology is inefficient when dealing with large data volumes, it is
not appropriate.
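The load-first, transform-later approach described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table names `staging_sales` and `sales`, the sample rows and the numeric-format check are hypothetical, and a production warehouse would use its own bulk loader rather than `executemany`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Staging table: every column is TEXT so the load step never fails on bad values.
conn.execute("CREATE TABLE staging_sales (sale_id TEXT, store TEXT, amount TEXT)")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, store TEXT, amount REAL)")

# Data as it might arrive from a source system (hypothetical sample).
extracted = [("1", " north ", "10.50"), ("2", "south", "abc"), ("3", "north", "4.25")]

# Step 1: fast load into the temporary store, with no per-row transformation.
with conn:
    conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", extracted)

# Step 2: transformations and checks run inside the database, after loading.
with conn:
    conn.execute("""
        INSERT INTO sales (sale_id, store, amount)
        SELECT CAST(sale_id AS INTEGER), TRIM(store), CAST(amount AS REAL)
        FROM staging_sales
        WHERE amount GLOB '[0-9]*.[0-9]*'   -- assumed check: keep numeric-looking amounts
    """)

rows = conn.execute("SELECT sale_id, store, amount FROM sales ORDER BY sale_id").fetchall()
print(rows)
```

Keeping step 1 free of transformations shortens the load window; the set-based SQL in step 2 then cleans the data in one pass.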
Warehouse manager
• It may be necessary to carry out simple transformations while loading. Once these
simple transformations are finished, more complicated checks can be performed. For
example, when an EPOS sales transaction is loaded, the following checks should be made :
1. Remove all the columns from the warehouse that are not necessary.
2. Convert each value to the appropriate data type.
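The two checks above can be sketched as follows. The column names, the converters in `WAREHOUSE_COLUMNS` and the sample EPOS record are hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical warehouse schema: column name -> converter to the target data type.
WAREHOUSE_COLUMNS = {
    "transaction_id": int,
    "store_id": int,
    "sale_date": date.fromisoformat,
    "amount": float,
}


def simple_transform(raw: dict) -> dict:
    """Keep only the warehouse columns and convert each value to its target type."""
    return {col: convert(raw[col]) for col, convert in WAREHOUSE_COLUMNS.items()}


raw_epos = {
    "transaction_id": "1001",
    "store_id": "17",
    "sale_date": "2023-04-05",
    "amount": "19.99",
    "till_operator": "A. Smith",    # not needed in the warehouse, so dropped
    "receipt_footer": "Thank you",  # not needed in the warehouse, so dropped
}
clean = simple_transform(raw_epos)
print(clean)
```

A value that cannot be converted raises an exception, which the caller can route to its error handling.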
5.9 Warehouse Manager
• The warehouse manager is responsible for the warehouse management process. It is
made up of shell scripts, C programmes and third-party system software. A
warehouse manager's size and complexity vary depending on the particular solution.
• A warehouse manager in the context of a data warehouse is responsible for overseeing
the operations and management of the physical infrastructure and resources that support
the data warehousing environment. Their primary focus is on ensuring the smooth
functioning of the data warehouse and optimizing its performance.
Here are some key responsibilities and tasks of a warehouse manager in a data
warehouse :
1. Infrastructure management : The warehouse manager is responsible for
managing the physical infrastructure of the data warehouse, which includes servers,
storage systems, networking equipment and other hardware components. They
ensure the availability, reliability and scalability of the infrastructure to support
the data warehousing environment.
2. Performance optimization : The warehouse manager works closely with database
administrators to optimize the performance of the data
warehouse. They monitor system performance, identify bottlenecks and implement
strategies to improve query response times, data loading speeds and overall system
efficiency.
3. Capacity planning : The warehouse manager forecasts future storage and
processing requirements and performs capacity planning for the data warehouse.
They assess the growth of data volumes, analyze usage patterns and collaborate
with IT teams to ensure that the data warehouse has adequate resources to handle
increasing data demands.
4. Backup and recovery : The warehouse manager is responsible for implementing
backup and recovery strategies to protect the data warehouse from data loss or
system failures. They establish regular backup schedules, test data recovery
procedures and ensure the integrity and availability of backup data.
5. Security and access control : Data security is a critical aspect of a data warehouse
and the warehouse manager plays a key role in ensuring the security of data assets.
They establish and enforce security policies, implement access controls, monitor
user privileges and work closely with IT teams to address any security
vulnerabilities or threats.
6. Data retention and archiving : The warehouse manager collaborates with data
governance teams to define data retention policies and implement data archiving
strategies. They ensure that data is retained for the required periods, manage data
purging processes and oversee the archival of historical data for long-term storage.
7. Disaster recovery planning : In the event of a disaster or system failure, the
warehouse manager is responsible for disaster recovery planning. They work with
IT teams to develop and maintain disaster recovery plans, conduct regular testing
and ensure the data warehouse can be quickly restored in case of emergencies.
8. Vendor and contract management : The warehouse manager may be involved in
managing relationships with hardware and software vendors. They evaluate and
select vendors, negotiate contracts and ensure the timely procurement of necessary
equipment or software licenses for the data warehouse.
9. Documentation and reporting : The warehouse manager maintains
documentation related to the data warehouse infrastructure, configurations and
operational procedures. They generate reports on system performance, capacity
utilization and operational metrics to provide insights and support decision-making
processes.
10. Team management : In some cases, the warehouse manager may oversee a team
of administrators and technicians responsible for the day-to-day operations of the
data warehouse. They provide guidance, mentorship and performance management
to ensure the team's effectiveness.
• Overall, the warehouse manager in a data warehouse plays a crucial role in managing
the physical infrastructure, optimizing performance, ensuring data security and
maintaining the availability and reliability of the data warehouse environment.
5.9.1 Warehouse Manager Architecture
[Figure : warehouse manager comprising a controlling process, stored procedures or
C with SQL, a backup / recovery tool and SQL scripts, working on the temporary data store]
Fig. 5.9.1 Warehouse manager architecture
• A warehouse manager includes the following :
1. The controlling process
2. Stored procedures or C with SQL
3. Backup / recovery tool
4. SQL scripts.
5.9.2 Functions of Warehouse Manager
• A warehouse manager performs the following functions :
1. Analyzes the data to perform consistency and referential integrity checks.
2. Creates indexes, business views, partition views against the base data.
3. Generates new aggregations and updates the existing aggregations.
4. Generates normalizations.
5. Transforms and merges the source data of the temporary store into the published
data warehouse.
6. Backs up the data in the data warehouse.
7. Archives the data that has reached the end of its captured life.
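Functions 1 to 3 above can be sketched with Python's built-in `sqlite3` module. The schema (`store`, `sales`), the object names and the sample data are hypothetical, and a real warehouse manager would typically run comparable SQL through stored procedures or scripts on the warehouse's own DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, store_id INTEGER, amount REAL);
    INSERT INTO store VALUES (1, 'north'), (2, 'south');
    INSERT INTO sales VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 4.0), (13, 9, 1.0);
""")

# Function 1: referential integrity check (sales rows that point to no known store).
orphans = conn.execute("""
    SELECT s.sale_id
    FROM sales s
    LEFT JOIN store st ON st.store_id = s.store_id
    WHERE st.store_id IS NULL
""").fetchall()

# Function 2: an index and a business view over the base data.
conn.execute("CREATE INDEX idx_sales_store ON sales (store_id)")
conn.execute("""
    CREATE VIEW v_sales_by_region AS
    SELECT st.region, s.amount
    FROM sales s JOIN store st ON st.store_id = s.store_id
""")

# Function 3: generate an aggregation (summary) table from the view.
conn.execute("""
    CREATE TABLE agg_sales_by_region AS
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS n_sales
    FROM v_sales_by_region
    GROUP BY region
""")
summary = conn.execute(
    "SELECT region, total_amount, n_sales FROM agg_sales_by_region ORDER BY region"
).fetchall()
print(orphans, summary)
```

The orphaned sale is reported by the integrity check and is excluded from the aggregation because the view joins only to known stores.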
5.10 Query Manager
• The queries must be routed to the appropriate tables by the query manager. It expedites
the query request and answer procedure by routing the queries to the proper tables. The
query manager is also in charge of planning when the user-posted queries will be
executed.
• Query manager is a component or tool that helps manage and optimize the execution of
queries against the data stored in the data warehouse. The query manager performs
several functions to ensure efficient query processing and enhance the performance of
the data warehouse.
• Here are some key functions of a query manager in a data warehouse :
1. Query optimization : The query manager analyzes incoming queries and
optimizes their execution plans to minimize resource consumption and maximize
performance. It may employ techniques like query rewriting, indexing, caching and
statistics analysis to improve query execution speed.
2. Query parsing and validation : The query manager parses and validates incoming
queries to ensure they conform to the data warehouse's syntax and semantics. It
checks for syntactical correctness, resolves table and column references and
verifies that the requested data exists in the data warehouse.
3. Query routing and load balancing : In distributed data warehouse environments,
the query manager may route incoming queries to the appropriate nodes or servers
for execution. It ensures load balancing across the nodes, distributing the query
workload evenly to optimize resource utilization.
4. Query caching : The query manager may cache frequently executed queries and
their results to reduce the overhead of query processing. By caching the results,
subsequent identical or similar queries can be served from the cache, avoiding the
need for costly re-computation.
5. Concurrency control : In a multi-user environment, the query manager ensures
proper concurrency control to handle simultaneous execution of multiple queries. It
manages locks and transaction isolation levels to maintain data integrity and
prevent conflicts among concurrent queries.
6. Query monitoring and performance analysis : The query manager monitors the
performance of executing queries and collects statistics such as query execution
time, resource consumption and I/O operations. It generates performance reports,
identifies bottlenecks and provides insights for query performance tuning.
7. Query scheduling and prioritization : The query manager may schedule and
prioritize queries based on factors such as query complexity, user priority and
resource availability. It ensures fair resource allocation and delivers timely query
responses based on user requirements.
8. Query logging and auditing : The query manager maintains logs of executed
queries, including query text, execution time and other relevant metadata. It
supports auditing, troubleshooting and compliance requirements by providing a
historical record of query activities.
9. Query security and access control : The query manager enforces access control
policies to ensure that users can only execute queries on authorized data. It
validates user credentials, implements role-based access control and enforces data
security rules to protect sensitive information.
10. Integration with business intelligence tools : The query manager may integrate
with business intelligence tools or reporting platforms to enable seamless query
execution and data visualization for end users. It provides the necessary
connectivity and interfaces for data analysis and reporting.
These functions collectively enable the query manager to optimize query performance,
ensure data integrity and deliver timely and accurate results to users querying the data
warehouse.
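Two of the functions listed above, query caching and query scheduling by priority, can be sketched together. This is a toy illustration under stated assumptions: the `QueryManager` class and its methods are hypothetical, the cache matches only textually identical queries, and cache invalidation after a data load is not handled.

```python
import heapq
import itertools
import sqlite3


class QueryManager:
    """Toy query manager: priority scheduling plus a result cache (hypothetical)."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}                   # query text -> cached result rows
        self.queue = []                   # heap of (priority, sequence, sql)
        self.counter = itertools.count()  # tie-breaker keeps submission order
        self.cache_hits = 0

    def submit(self, sql, priority=10):
        """Queue a query; a lower priority number runs earlier."""
        heapq.heappush(self.queue, (priority, next(self.counter), sql))

    def execute(self, sql):
        """Serve identical queries from the cache, otherwise run and cache them."""
        if sql in self.cache:
            self.cache_hits += 1
            return self.cache[sql]
        result = self.conn.execute(sql).fetchall()
        self.cache[sql] = result
        return result

    def run_all(self):
        """Run every queued query in priority order and return (sql, rows) pairs."""
        results = []
        while self.queue:
            _, _, sql = heapq.heappop(self.queue)
            results.append((sql, self.execute(sql)))
        return results


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 5.0), ("south", 4.0), ("north", 7.5)],
)

qm = QueryManager(conn)
qm.submit("SELECT COUNT(*) FROM sales", priority=5)
qm.submit("SELECT SUM(amount) FROM sales", priority=1)   # runs first
qm.submit("SELECT COUNT(*) FROM sales", priority=5)      # served from the cache
results = qm.run_all()
print(results, qm.cache_hits)
```

In practice the cache would have to be cleared or refreshed whenever the load manager changes the underlying tables, otherwise stale results could be returned.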
5.10.1 Query Manager Architecture
[Figure : query manager comprising query redirection (via C tool or RDBMS), stored
procedures (generating views), a query management tool (to monitor indexes and
summaries) and query scheduling, working on metadata and detailed information]
Fig. 5.10.1 Query manager architecture