0% found this document useful (0 votes)
44 views44 pages

Ch6 - Data and Database Administration

This document discusses key topics in data and database administration including: 1. The roles and functions of data administration and database administration including managing data policies, security, performance, and recovery. 2. Evolving approaches to data administration such as merging roles and focusing on development processes. 3. Database security threats and techniques for mitigating risks like access control, encryption, authentication, and backup/recovery protocols.

Uploaded by

hasaniftakhar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
44 views44 pages

Ch6 - Data and Database Administration

This document discusses key topics in data and database administration including: 1. The roles and functions of data administration and database administration including managing data policies, security, performance, and recovery. 2. Evolving approaches to data administration such as merging roles and focusing on development processes. 3. Database security threats and techniques for mitigating risks like access control, encryption, authentication, and backup/recovery protocols.

Uploaded by

hasaniftakhar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 44

Chapter 6

Data and Database


Administration

1
Objectives
• Definition of terms

• Functions and roles of data/database administration

• Role of data dictionaries and information repositories

• Optimistic and pessimistic concurrency control


• Problems and techniques for data security
• Problems and techniques for data recovery

• Database tuning issues

• Importance and measures of data quality

• Importance and measures of data availability 2


Traditional Administration Definitions
• Data Administration: A high-level function that
is responsible for the overall management of
data resources in an organization, including
maintaining corporate-wide definitions and
standards
• Database Administration: A technical function
that is responsible for physical database
design and for dealing with technical issues
such as security enforcement, database
performance, and backup and recovery

3
Traditional Data
Administration Functions
• Data policies, procedures, standards
• Planning
• Data conflict (ownership) resolution
• Internal marketing of DA concepts
• Managing the data repository

4
Traditional Database
Administration Functions
• Selection of hardware and software
• Installing/upgrading DBMS
• Tuning database performance
• Improving query processing performance
• Managing data security, privacy, and integrity

• Data backup and recovery


5
Evolving Approaches
to Data Administration
• Merge data and database administration into one role
• Fast-track development – monitoring development
process (analysis, design, implementation,
maintenance)
• Procedural DBAs – managing quality of triggers and
stored procedures
• eDBA – managing Internet-enabled database
applications
• PDA DBA – data synchronization and personal
database management
• Data warehouse administration 6
Data Warehouse Administration
• New role, coming with the growth in data
warehouses
• Similar to DA/DBA roles
• Emphasis on integration and coordination of
metadata/data across many data sources
• Specific roles:
– Support decision–support applications
– Manage data warehouse growth
– Establish service level agreements regarding data
warehouses and data marts
7
Open Source DBMSs
• An alternative to proprietary packages such as
Oracle, Microsoft SQL Server, or Microsoft
Access
• mySQL is an example of open-source DBMS

• Less expensive than proprietary packages


• Source code available, for modification

8
Data Modeling Responsibilities

9
Database Security
• Database Security: Protection of the
data against accidental or intentional
loss, destruction, or misuse

• Increased difficulty due to Internet access


and client/server technologies

10
Possible locations of data security threats

11
Threats to Data Security
• Accidental losses attributable to:
– Human error
– Software failure
– Hardware failure
• Theft and fraud
• Improper data access:
– Loss of privacy (personal data)
– Loss of confidentiality (corporate data)
• Loss of data integrity
• Loss of availability (through, e.g. sabotage)
12
Data Management Software
Security Features
• Views or subschemas

• Integrity controls

• Authorization rules

• User-defined procedures

• Encryption

• Authentication schemes

• Backup, journalizing, and checkpointing 13


Views and Integrity Controls
• Views
– Subset of the database that is presented to one or
more users
– User can be given access privilege to view without
allowing access privilege to underlying tables

• Integrity Controls
– Protect data from unauthorized use
– Domains – set allowable values
– Assertions – enforce database conditions 14
Authorization Rules
• Controls incorporated in the data
management system
• Restrict:
– Access to data
– Actions that people can take on data

• Authorization matrix for:


– Subjects
– Objects
– Actions
– Constraints
15
Authorization matrix

16
Authorization table for subjects

Authorization table for objects

Oracle9i privileges

Some DBMSs also provide


capabilities for user-
defined procedures to
customize the authorization
process 17
Basic two-Key Encryption

Encryption – the coding or


scrambling of data so that
humans cannot read them

18
Authentication Schemes (1)
• Goal – obtain a positive identification of the
user
• Passwords: First line of defense
– Should be at least 8 characters long
– Should combine alphabetic and numeric
data
– Should not be complete words or personal
information
– Should be changed frequently
19
Authentication Schemes (2)
• Strong Authentication
– Passwords are flawed:
• Users share them with each other

• They get written down, could be copied


• Automatic logon scripts remove need to
explicitly type them in
• Unencrypted passwords travel the Internet

20
Authentication Schemes (3)
• Possible solutions:
– Two factor – e.g. smart card plus PIN

– Three factor – e.g. smart card, biometric,


PIN
– Biometric devices – use of fingerprints,
retinal scans, etc. for positive ID
– Third-party authentication – using secret
keys, digital certificates 21
Security Policies and Procedures
• Personnel controls
– Hiring practices, employee monitoring, security
training
• Physical access controls
– Equipment locking, check-out procedures, screen
placement
• Maintenance controls
– Maintenance agreements, access to source code,
quality and availability standards
• Data privacy controls
– Adherence to privacy legislation, access rules
22
Database Recovery
 Mechanism for restoring a database quickly
and accurately after loss or damage
 Recovery facilities:

− Backup Facilities

− Journalizing Facilities

− Checkpoint Facility

− Recovery Manager

23
Backup Facilities
• Automatic dump facility that produces
backup copy of the entire database
• Periodic backup (e.g. nightly, weekly)
• Cold backup – database is shut down
during backup
• Hot backup – selected portion is shut down
and backed up at a given time
• Backups stored in secure, off-site location
24
Journalizing Facilities
• Audit trail of transactions and database
updates
• Transaction log – record of essential data for
each transaction processed against the
database
• Database change log – images of updated
data
– Before-image – copy before modification
– After-image – copy after modification

Produces an audit trail 25


Database audit trail

From the backup and logs,


databases can be restored in
case of damage or loss 26
Checkpoint Facilities
• DBMS periodically refuses to accept new
transactions
 system is in a quiet state
• Database and transaction logs are
synchronized

This allows recovery manager to resume


processing from short period, instead of
repeating entire day

27
Recovery and Restart Procedures
• Switch - Mirrored databases
• Restore/Rerun - Reprocess transactions
against the backup
• Transaction Integrity - Commit or abort all
transaction changes
• Backward Recovery (Rollback) - Apply before
images
• Forward Recovery (Roll Forward) - Apply after
images (preferable to restore/rerun)
28
Basic recovery techniques Rollback

29
Rollforward

30
Database Failure Responses
• Aborted transactions
– Preferred recovery: rollback
– Alternative: Rollforward to state just prior to abort
• Incorrect data
– Preferred recovery: rollback
– Alternative 1: rerun transactions not including inaccurate data
updates
– Alternative 2: compensating transactions
• System failure (database intact)
– Preferred recovery: switch to duplicate database
– Alternative 1: rollback
– Alternative 2: restart from checkpoint
• Database destruction
– Preferred recovery: switch to duplicate database
– Alternative 1: rollforward
– Alternative 2: reprocess transactions 31
Concurrency Control
• Problem – in a multiuser environment,
simultaneous access to data can result in
interference and data loss

• Solution – Concurrency Control


– The process of managing simultaneous operations
against a database so that data integrity is
maintained and the operations do not interfere with
each other in a multi-user environment

32
Lost Update

Simultaneous access causes updates to cancel each other


33
A similar problem is the inconsistent read problem
Concurrency Control Techniques
• Serializability
– Finish one transaction before starting another

• Locking Mechanisms
– The most common way of achieving serialization
– Data that is retrieved for the purpose of updating is
locked for the updater
– No other user can perform update until unlocked

34
Updates with locking for concurrency control

35
This prevents the lost update problem
Locking Mechanisms
• Locking level:
– Database – used during database updates
– Table – used for bulk updates
– Block or page – very commonly used
– Record – only requested row; fairly commonly
used
– Field – requires significant overhead; impractical
• Types of locks:
– Shared lock - Read but no update permitted.
Used when just reading to prevent another user
from placing an exclusive lock on the record
– Exclusive lock - No access permitted. Used when
preparing to update 36
Deadlock
An impasse that results when two or more transactions have
locked common resources, and each waits for the other to unlock
their resources
A deadlock situation

UserA and UserB


will wait forever
for each other to
release their
locked resources! 37
Managing Deadlock
• Deadlock prevention:
– Lock all records required at the beginning of a
transaction
– Two-phase locking protocol
• Growing phase
• Shrinking phase
– May be difficult to determine all needed
resources in advance
• Deadlock Resolution:
– Allow deadlocks to occur
– Mechanisms for detecting and breaking them
• Resource usage matrix
38
Versioning
• Optimistic approach to concurrency
control
• Instead of locking
• Assumption is that simultaneous updates
will be infrequent
• Each transaction can attempt an update
as it wishes
• The system will reject an update when it
senses a conflict
• Use of rollback and commit for this
39
The use of versioning

40
Better performance than locking
Managing Data Quality
• Data Steward - Liaisons between IT and business
units
• Five Data Quality Issues:
 Security policy and disaster recovery

 Personnel controls

 Physical access controls

 Maintenance controls (hardware and software)

 Data protection and privacy 41


Data Dictionaries and Repositories
• Data dictionary
– Documents data elements of a database
• System catalog
– System-created database that describes all database objects
• Information Repository
– Stores metadata describing data and data processing
resources
• Information Repository Dictionary System (IRDS)
– Software tool managing/controlling access to information
repository
42
Three components
of the repository system architecture
A schema of the
repository information

Software that manages the


repository objects

Where repository
objects are stored
43
Database Performance Tuning
• DBMS Installation
– Setting installation parameters
• Memory Usage
– Set cache levels
– Choose background processes
• Input/Output Contention
– Use striping
– Distribution of heavily accessed files
• CPU Usage
– Monitor CPU load
• Application tuning
– Modification of SQL code in applications

44

You might also like