Advanced Database Management System

Sunil Paudel
[email protected]

Transaction Management
Transaction Concept
 A transaction is a unit of program execution that
accesses and possibly updates various data items.
 A transaction is any one execution of a user program
in a DBMS
 A transaction is a series of reads and writes of
database objects
 Transaction states
 Active transaction
 Partially committed transaction
 Committed transaction
 Failed transaction
 Aborted transaction
Example of Fund Transfer
 Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)

 Two main issues to deal with:


 Failures of various kinds, such as hardware failures and system
crashes
 Concurrent execution of multiple transactions
Example of Fund Transfer (Cont.)
 Atomicity requirement
 If the transaction fails after step 3 and before step 6,
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
 money will be “lost”, leading to an inconsistent database state. The failure
could be due to software or hardware
 The system should ensure that updates of a partially executed
transaction are not reflected in the database
 Durability requirement
 Once the user has been notified that the transaction has completed
(i.e., the transfer of the $50 has taken place), the updates to the
database by the transaction must persist even if there are software or
hardware failures.
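The atomicity requirement can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table name `account`, the starting balances, and the `fail_midway` flag are assumptions made for illustration. The database is in-memory, so only atomicity is shown here, not durability.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
            if fail_midway:
                # simulated failure after write(A) and before write(B)
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
        return True
    except RuntimeError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 1000), ("B", 2000)])  # assumed starting balances
conn.commit()
```

Calling `transfer(conn, "A", "B", 50, fail_midway=True)` leaves both balances unchanged, because the partial update to A is rolled back.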
Example of Fund Transfer (Cont.)
 Consistency requirement
 In the previous example, the sum of A and B is
unchanged by the execution of the transaction
 In general, consistency requirements include
• Explicitly specified integrity constraints such as primary keys
and foreign keys
• Implicit integrity constraints
– e.g. sum of balances of all accounts, minus sum of loan
amounts must equal value of cash-in-hand

 A transaction must see a consistent database.


 During transaction execution the database may be
temporarily inconsistent.
 When the transaction completes successfully the
database must be consistent
Example of Fund Transfer (Cont.)
 Isolation requirement — if between steps 3 and 6,
another transaction T2 is allowed to access the
partially updated database, it will see an inconsistent
database (the sum A + B will be less than it should
be).
T1                      T2
1. read(A)
2. A := A – 50
3. write(A)
                        read(A), read(B), print(A+B)
4. read(B)
5. B := B + 50
6. write(B)

 Isolation can be ensured by running transactions
serially (one after the other).
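The interleaving above can be replayed in a few lines. This is only a sketch: the starting balances of 1000 and 2000 are assumed, and the database is modelled as a plain dictionary.

```python
def run_interleaved(db):
    """Run T1's six steps, letting T2 read A and B between steps 3 and 4."""
    a = db["A"]                      # 1. read(A)
    a = a - 50                       # 2. A := A - 50
    db["A"] = a                      # 3. write(A)
    seen_by_t2 = db["A"] + db["B"]   # T2: read(A), read(B), print(A+B)
    b = db["B"]                      # 4. read(B)
    b = b + 50                       # 5. B := B + 50
    db["B"] = b                      # 6. write(B)
    return seen_by_t2

db = {"A": 1000, "B": 2000}          # assumed starting balances
seen = run_interleaved(db)
```

T2 sees a sum that is 50 less than the true total, although the database is consistent again once T1 finishes.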
ACID Properties
 To ensure the integrity of the data, the DBMS maintains the
following properties:
 Atomicity
• Either all operations of the transaction are reflected properly
in the database or none
 Consistency
• Any transaction transforms a correct state of db into another
correct state
 Isolation
• In case of multiple transactions executing concurrently, say a
pair T1 and T2, it appears to T1 that either T2 finished before
T1 started or T2 started after T1 finished
 Durability
• After a transaction completes successfully, the changes it made to
the database persist, even if the system crashes
Transaction State
 Active – the initial state; the transaction stays in
this state while it is executing
 Partially committed – after the final statement has
been executed.
 Failed -- after the discovery that normal execution
can no longer proceed.
 Aborted – after the transaction has been rolled
back and the database restored to its state prior to
the start of the transaction. Two options after it has
been aborted:
 restart the transaction
 kill the transaction
 Committed – after successful completion.
Transaction State (Cont.)
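The five states and the moves between them can be sketched as a small state machine. The transition table below follows the state descriptions on the previous slide; the class and method names are illustrative assumptions.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"

# Which states may follow each state.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),      # terminal: restart or kill happens outside
    State.COMMITTED: set(),    # terminal
}

class Transaction:
    def __init__(self):
        self.state = State.ACTIVE   # the initial state

    def move_to(self, new_state):
        """Change state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state
```

Restarting an aborted transaction is modelled here as creating a new `Transaction` object rather than as a transition back to active.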
Database Components
Transaction manager
Scheduler
Lock manager
Recovery manager
Buffer manager

Implementation of Atomicity and Durability
 The recovery-management component of a database system
implements the support for atomicity and durability.
 E.g. the shadow-database scheme:
 all updates are made on a shadow copy of the database
• db_pointer is made to point to the updated shadow copy after
– the transaction reaches partial commit and
– all updated pages have been flushed to disk.
Implementation of Atomicity and Durability (Cont.)
 db_pointer always points to the current
consistent copy of the database.
 In case transaction fails, old consistent copy pointed
to by db_pointer can be used, and the shadow copy
can be deleted.
 The shadow-database scheme:
 Assumes that only one transaction is active at a
time.
 Assumes disks do not fail
 Does not handle concurrent transactions
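A rough sketch of the shadow-database idea, assuming the whole database is one small JSON file. The file path plays the role of db_pointer, and `os.replace` stands in for the pointer switch. The file name `bank.json`, the balances, and the `fail_before_commit` flag are hypothetical.

```python
import json
import os
import tempfile

def read_db(path):
    with open(path) as f:
        return json.load(f)

def run_transaction(path, update, fail_before_commit=False):
    """Apply `update` to a shadow copy, then switch the pointer."""
    shadow = read_db(path)                 # shadow copy of the database
    update(shadow)                         # all updates go to the shadow copy
    folder = os.path.dirname(os.path.abspath(path))
    fd, shadow_path = tempfile.mkstemp(dir=folder)
    with os.fdopen(fd, "w") as f:
        json.dump(shadow, f)
        f.flush()
        os.fsync(f.fileno())               # flush the updated copy to disk
    if fail_before_commit:
        os.remove(shadow_path)             # failure: delete the shadow copy
        return False
    os.replace(shadow_path, path)          # the "db_pointer" switch
    return True

workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "bank.json")   # hypothetical file name
with open(db_path, "w") as f:
    json.dump({"A": 1000, "B": 2000}, f)       # assumed starting balances

def transfer_50(db):
    db["A"] -= 50
    db["B"] += 50
```

As in the scheme itself, this handles only one transaction at a time and assumes the disk does not fail.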
Concurrent Executions
 Multiple transactions are allowed to run
concurrently in the system.
 Advantages are:
 increased processor and disk utilization, leading to
better transaction throughput
• E.g. one transaction can be using the CPU while another is
reading from or writing to the disk
 reduced average response time for transactions: short
transactions need not wait behind long ones.
 Concurrency control schemes – mechanisms to
achieve isolation
 that is, to control the interaction among the concurrent
transactions in order to prevent them from destroying the
consistency of the database
Schedules
 A schedule is a sequence of instructions that specifies the chronological
order in which instructions of concurrent transactions are
executed
 a schedule for a set of transactions must consist of all instructions
of those transactions
 must preserve the order in which the instructions appear in each
individual transaction.
 A transaction that successfully completes its execution will
have a commit instruction as the last statement
 by default, a transaction is assumed to execute a commit instruction as
its last step
 A transaction that fails to successfully complete its
execution will have an abort instruction as the last
statement
Schedule 1
 Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to
B.
 A serial schedule in which T1 is followed by T2:
Schedule 2
• A serial schedule where T2 is followed by T1
Schedule 3
 Let T1 and T2 be the transactions defined previously. The following
schedule is not a serial schedule, but it is equivalent to Schedule 1.

In Schedules 1, 2 and 3, the sum A + B is preserved.


Schedule 4
 The following concurrent schedule does not preserve
the value of (A + B ).
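The effect of different schedules on A + B can be checked with a small simulation. This is a sketch under assumptions: starting balances of 1000 and 2000, T2's 10% computed with integer division, and an illustrative bad interleaving that is not necessarily the one drawn as Schedule 4. Each `yield` marks the end of one instruction, so a schedule is just a list of transaction names.

```python
def t1(db):
    """T1: transfer 50 from A to B."""
    a = db["A"]          # read(A)
    yield
    a -= 50              # A := A - 50
    yield
    db["A"] = a          # write(A)
    yield
    b = db["B"]          # read(B)
    yield
    b += 50              # B := B + 50
    yield
    db["B"] = b          # write(B)
    yield

def t2(db):
    """T2: transfer 10% of A's balance from A to B."""
    a = db["A"]          # read(A)
    yield
    temp = a // 10       # temp := 10% of A
    yield
    a -= temp            # A := A - temp
    yield
    db["A"] = a          # write(A)
    yield
    b = db["B"]          # read(B)
    yield
    b += temp            # B := B + temp
    yield
    db["B"] = b          # write(B)
    yield

def run(schedule):
    """Execute one instruction of the named transaction per schedule entry."""
    db = {"A": 1000, "B": 2000}          # assumed starting balances
    txns = {"T1": t1(db), "T2": t2(db)}
    for name in schedule:
        next(txns[name])
    return db

serial_t1_t2 = run(["T1"] * 6 + ["T2"] * 7)
serial_t2_t1 = run(["T2"] * 7 + ["T1"] * 6)
bad = run(["T1"] * 2 + ["T2"] * 5 + ["T1"] * 4 + ["T2"] * 2)
```

Both serial orders keep A + B at 3000 while ending with different individual balances. The bad interleaving ends with A + B = 3050, because each transaction overwrites a value the other has already changed.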
Serializability
 Basic Assumption – Each transaction preserves
database consistency.
 Thus serial execution of a set of transactions
preserves database consistency.
 A (possibly concurrent) schedule is serializable if
it is equivalent to a serial schedule
 Different forms of schedule equivalence give rise
to the notions of:
1. conflict serializability
2. view serializability
Serializability- Definition
 A serializable schedule over a set S of committed
transactions is a schedule whose effect on any consistent
database instance is guaranteed to be identical to that of
some complete serial schedule over S. That is, the database
instance that results from executing the given schedule is
identical to the database instance that results from executing
the transactions in some serial order.
There are some important points to note in this definition:
 Executing the transactions serially in different orders may
produce different results, but all are presumed to be acceptable;
the DBMS makes no guarantees about which of them will be the
outcome of an interleaved execution.
 The above definition of a serializable schedule does not cover
the case of schedules containing aborted transactions. For
simplicity, we begin by discussing interleaved execution of a set
of complete, committed transactions.
Simplified view of transactions
 We ignore operations other than read and
write instructions
 We assume that transactions may perform
arbitrary computations on data in local buffers
in between reads and writes.
 Our simplified schedules consist of only read
and write instructions.

Conflicting Instructions
 Instructions li and lj of transactions Ti and Tj
respectively, conflict if and only if there exists some
item Q accessed by both li and lj, and at least one of
these instructions wrote Q.
1. li = read(Q), lj = read(Q). They do not conflict.
2. li = read(Q), lj = write(Q). They conflict.
3. li = write(Q), lj = read(Q). They conflict.
4. li = write(Q), lj = write(Q). They conflict.
 Intuitively, a conflict between li and lj forces a
(logical) temporal order between them.
 If li and lj are consecutive in a schedule and they do not
conflict, their results would remain the same even if they
had been interchanged in the schedule.
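The conflict rule fits in one function. In this sketch an instruction is assumed to be a tuple (transaction, action, item), with action either "read" or "write".

```python
def conflicts(op1, op2):
    """True if two instructions of different transactions conflict."""
    txn1, action1, item1 = op1
    txn2, action2, item2 = op2
    # Same item, different transactions, and at least one write.
    return (txn1 != txn2 and item1 == item2
            and "write" in (action1, action2))
```

For example, `conflicts(("Ti", "read", "Q"), ("Tj", "read", "Q"))` is False, while any pair on Q that includes a write is True.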
Conflict Serializability
 If a schedule S can be transformed into a
schedule S´ by a series of swaps of non-
conflicting instructions, we say that S and S´
are conflict equivalent.
 We say that a schedule S is conflict
serializable if it is conflict equivalent to a
serial schedule
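One standard way to test conflict serializability, not covered on these slides, is the precedence graph: draw an edge Ti -> Tj whenever an instruction of Ti comes before a conflicting instruction of Tj, then check that the graph has no cycle. A minimal sketch follows; the schedule encoding (transaction, "r" or "w", item) and both example schedules are assumptions.

```python
def conflict_serializable(schedule):
    """Return True if the schedule's precedence graph is acyclic."""
    edges = {}
    for i, (ti, ai, qi) in enumerate(schedule):
        edges.setdefault(ti, set())
        for tj, aj, qj in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and qi == qj and "w" in (ai, aj):
                edges[ti].add(tj)
    # Repeatedly remove transactions that have no incoming edge.
    remaining = set(edges)
    while remaining:
        sources = {t for t in remaining
                   if not any(t in edges[u] for u in remaining)}
        if not sources:
            return False        # every remaining node is on a cycle path
        remaining -= sources
    return True

# Hypothetical interleaving: T1 finishes with each item before T2 touches it.
interleaved = [("T1", "r", "A"), ("T1", "w", "A"),
               ("T2", "r", "A"), ("T2", "w", "A"),
               ("T1", "r", "B"), ("T1", "w", "B"),
               ("T2", "r", "B"), ("T2", "w", "B")]
# Hypothetical schedule with edges in both directions between T1 and T2.
cyclic = [("T1", "r", "A"), ("T2", "w", "A"), ("T1", "w", "A")]
```

The first schedule has only the edge T1 -> T2, so it is conflict equivalent to the serial order T1, T2. The second has edges in both directions and is not conflict serializable.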
Conflict Serializability (Cont.)
 Schedule 3 can be transformed into Schedule 6, a serial
schedule where T2 follows T1, by a series of swaps of
non-conflicting instructions.
 Therefore Schedule 3 is conflict serializable.

[Figure: Schedule 3 and Schedule 6]
View Serializability
 Let S and S´ be two schedules with the same set of
transactions. S and S´ are view equivalent if the
following three conditions are met, for each data item Q,
1. If in schedule S, transaction Ti reads the initial value of Q, then
in schedule S´ also transaction Ti must read the initial value of
Q.
2. If in schedule S transaction Ti executes read(Q), and that value
was produced by transaction Tj (if any), then in schedule S´
also transaction Ti must read the value of Q that was produced
by the same write(Q) operation of transaction Tj.
3. The transaction (if any) that performs the final write(Q)
operation in schedule S must also perform the final write(Q)
operation in schedule S´.
View Serializability (Cont.)
 A schedule S is view serializable if it is view
equivalent to a serial schedule.
 Every conflict serializable schedule is also
view serializable.
 Below is a schedule which is view-serializable
but not conflict serializable.
What serial schedule is above equivalent to?
 Every view serializable schedule that is not conflict
serializable has blind writes.
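The three view-equivalence conditions can be checked mechanically. The sketch below assumes schedules are lists of (transaction, "r" or "w", item) and that both schedules contain the same transactions with their instructions in the same internal order. The example schedule with blind writes is hypothetical.

```python
def view_profile(schedule):
    """Collect who reads from whom, and who writes each item last."""
    position = {}      # number of instructions seen so far per transaction
    last_write = {}    # item -> (transaction, position) of the latest write
    reads_from = set()
    for txn, action, item in schedule:
        pos = position.get(txn, 0)
        position[txn] = pos + 1
        if action == "r":
            # Source None means the read sees the initial value (condition 1);
            # otherwise it names the exact write it reads from (condition 2).
            reads_from.add((txn, pos, item, last_write.get(item)))
        else:
            last_write[item] = (txn, pos)
    # After the loop, last_write holds the final writes (condition 3).
    return reads_from, last_write

def view_equivalent(s1, s2):
    return sorted(s1) == sorted(s2) and view_profile(s1) == view_profile(s2)

# T2 and T3 write Q without reading it first (blind writes).
s = [("T1", "r", "Q"), ("T2", "w", "Q"), ("T1", "w", "Q"), ("T3", "w", "Q")]
serial_123 = [("T1", "r", "Q"), ("T1", "w", "Q"),
              ("T2", "w", "Q"), ("T3", "w", "Q")]
serial_213 = [("T2", "w", "Q"), ("T1", "r", "Q"),
              ("T1", "w", "Q"), ("T3", "w", "Q")]
```

Here `s` is view equivalent to the serial order T1, T2, T3 but not to T2, T1, T3. It is not conflict serializable, since T1's read precedes T2's write and T2's write precedes T1's write.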
Other Notions of Serializability
 The schedule below produces same outcome as the
serial schedule < T1, T5 >, yet is not conflict equivalent
or view equivalent to it.
 Determining such equivalence requires analysis of
operations other than read and write.
Recovery

Recovery
 Recovery means to restore the database to a
correct state after some failure has rendered the
current state incorrect or suspect
 Recovery is based on redundancy
 To recover a database, the source for the
recovery must be information that has been
stored redundantly somewhere else
Failure Classification
 Transaction failure :
 Logical errors: transaction cannot complete due to some
internal error condition
 System errors: the database system must terminate an active
transaction due to an error condition (e.g., deadlock)
 System crash: a power failure or other hardware or
software failure causes the system to crash.
 Fail-stop assumption: non-volatile storage contents are
assumed to not be corrupted by system crash
• Database systems have numerous integrity checks to prevent
corruption of disk data
 Disk failure: a head crash or similar disk failure destroys
all or part of disk storage
 Destruction is assumed to be detectable: disk drives use
checksums to detect failures
Recovery Algorithms
 Recovery algorithms are techniques to ensure
database consistency and transaction atomicity
and durability despite failures
 Focus of this chapter
 Recovery algorithms have two parts
1. Actions taken during normal transaction processing
to ensure enough information exists to recover from
failures
2. Actions taken after a failure to recover the database
contents to a state that ensures atomicity,
consistency and durability
Storage Structure
 Volatile storage:
 does not survive system crashes
 examples: main memory, cache memory
 Nonvolatile storage:
 survives system crashes
 examples: disk, tape, flash memory,
non-volatile (battery backed up) RAM
 Stable storage:
 a mythical form of storage that survives all failures
 approximated by maintaining multiple copies on
distinct nonvolatile media
Stable-Storage Implementation
 Maintain multiple copies of each block on separate disks
 copies can be at remote sites to protect against disasters such as
fire or flooding.
 Failure during data transfer can still result in inconsistent
copies: Block transfer can result in
 Successful completion
 Partial failure: destination block has incorrect information
 Total failure: destination block was never updated
 Protecting storage media from failure during data transfer
(one solution):
 Execute output operation as follows (assuming two copies of each
block):
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the same information
onto the second physical block.
3. The output is completed only after the second write successfully
completes.
Stable-Storage Implementation (Cont.)
 Protecting storage media from failure during data transfer
(cont.):
 Copies of a block may differ due to failure during output
operation. To recover from failure:
1. First find inconsistent blocks:
1. Expensive solution: Compare the two copies of every disk block.
2. Better solution:
 Record in-progress disk writes on non-volatile storage (Non-volatile RAM or
special area of disk).
 Use this information during recovery to find blocks that may be inconsistent,
and only compare copies of these.
 Used in hardware RAID systems
2. If either copy of an inconsistent block is detected to have an error
(bad checksum), overwrite it by the other copy. If both have no
error, but are different, overwrite the second block by the first block.
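The two-copy output and recovery rules above can be sketched with two dictionaries standing in for the two disks and a CRC as the checksum. The function names, the `crash_after_first` flag, and the block contents are assumptions; this sketch checks a single block rather than tracking in-progress writes.

```python
import zlib

def make_block(data):
    """Store the data together with its checksum."""
    return {"data": data, "crc": zlib.crc32(data)}

def is_valid(block):
    return block is not None and zlib.crc32(block["data"]) == block["crc"]

def stable_write(disk1, disk2, block_no, data, crash_after_first=False):
    disk1[block_no] = make_block(data)     # 1. write the first copy
    if crash_after_first:
        return False                       # failure before the second write
    disk2[block_no] = make_block(data)     # 2. then write the second copy
    return True                            # 3. output is now complete

def recover_block(disk1, disk2, block_no):
    b1, b2 = disk1.get(block_no), disk2.get(block_no)
    if is_valid(b1) and (not is_valid(b2) or b1["data"] != b2["data"]):
        disk2[block_no] = dict(b1)   # second copy bad or different: first wins
    elif is_valid(b2) and not is_valid(b1):
        disk1[block_no] = dict(b2)   # first copy has a bad checksum

disk1 = {0: make_block(b"old")}
disk2 = {0: make_block(b"old")}
```

After `stable_write(disk1, disk2, 0, b"new", crash_after_first=True)` the copies differ, and `recover_block` brings the second copy up to the first.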
Data Access
 Physical blocks are those blocks residing on the disk.
 Buffer blocks are the blocks residing temporarily in main
memory.
 Block movements between disk and main memory are
initiated through the following two operations:
 input(B) transfers the physical block B to main memory.
 output(B) transfers the buffer block B to the disk, and replaces the
appropriate physical block there.
 Each transaction Ti has its private work-area in which local
copies of all data items accessed and updated by it are
kept.
 Ti's local copy of a data item X is called xi.
Data Access (Cont.)
 Transaction transfers data items between system buffer blocks
and its private work-area using the following operations :
 read(X) assigns the value of data item X to the local variable xi.
 write(X) assigns the value of local variable xi to data item X in the
buffer block.
 both these commands may necessitate the issue of an input(BX)
instruction before the assignment, if the block BX in which X resides is
not already in memory.
 Transactions
 Perform read(X) while accessing X for the first time;
 All subsequent accesses are to the local copy.
 After last access, transaction executes write(X).
 output(BX) need not immediately follow write(X). System can
perform the output operation when it deems fit.
Example of Data Access
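[Figure: main memory holds buffer blocks A and B and the private work areas of T1 (x1, y1) and T2 (x2); input(A) and output(B) move blocks between disk and buffer, while read(X) and write(Y) move data items between the buffer and a work area.]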
Recovery and Atomicity
 Modifying the database without ensuring that the
transaction will commit may leave the database
in an inconsistent state.
 Consider transaction Ti that transfers $50 from
account A to account B; goal is either to perform
all database modifications made by Ti or none at
all.
 Several output operations may be required for Ti
(to output A and B). A failure may occur after one
of these modifications has been made but
before all of them are made.
Recovery and Atomicity (Cont.)
 To ensure atomicity despite failures, we first
output information describing the modifications
to stable storage without modifying the database
itself.
 There are two approaches:
 log-based recovery, and
 shadow-paging
Recovery Log
 A recovery log or journal keeps the before and
after state for each transaction
 An active (online) log is kept for immediate
recovery of recent activity
 An archive log is kept offline for more extensive
recovery requirements
Log-Based Recovery
 A log is kept on stable storage.
 The log is a sequence of log records, and maintains a record of
update activities on the database.
 When transaction Ti starts, it registers itself by writing a
<Ti start> log record.
 Before Ti executes write(X), a log record <Ti, X, V1, V2>
is written, where V1 is the value of X before the write, and
V2 is the value to be written to X.
 The log record notes that Ti has performed a write on data item X; X
had value V1 before the write, and will have value V2 after the
write.
 When Ti finishes its last statement, the log record <Ti
commit> is written.
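A minimal sketch of how such a log can drive recovery after a crash: updates of transactions without a commit record are undone using V1, and updates of committed transactions are redone using V2. The tuple encoding of the log records, the transaction names, and all values are assumptions made for illustration.

```python
def recover_from_log(log, db):
    """Undo uncommitted updates, then redo committed ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Undo pass: scan backwards, restoring old values (V1).
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _, _, item, old_value, _ = rec
            db[item] = old_value
    # Redo pass: scan forwards, reapplying new values (V2).
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _, item, _, new_value = rec
            db[item] = new_value
    return db

log = [
    ("start", "T0"),
    ("update", "T0", "A", 1000, 950),    # <T0, A, 1000, 950>
    ("update", "T0", "B", 2000, 2050),   # <T0, B, 2000, 2050>
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),     # T1 never commits
]
# Assumed disk state at the crash: B's new value was not yet output,
# while T1's uncommitted write to C had already reached the disk.
disk = {"A": 950, "B": 2000, "C": 600}
recovered = recover_from_log(log, disk)
```

After recovery T0's transfer is fully reflected and T1's write to C is gone.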
Shadow Paging
 Shadow paging is an alternative to log-based recovery;
this scheme is useful if transactions execute serially
 Store the shadow page table in nonvolatile storage, such
that state of the database prior to transaction execution
may be recovered.
 Shadow page table is never modified during execution
 To start with, both the page tables are identical. Only
current page table is used for data item accesses during
execution of the transaction.
 Whenever any page is about to be written for the first time
 A copy of this page is made onto an unused page.
 The current page table is then made to point to the copy
 The update is performed on the copy
Transaction Recovery
 ROLLBACK will return the database to the
previous COMMIT point
 In large multiprocessing environments,
transactions can “steal” buffer space from their
predecessors, which can cause early disk
writing
 Similarly, the DBMS can use a “no force” policy,
meaning that a committed transaction's updates need not be written to
disk immediately; the write can be deferred
System Recovery
 The system takes checkpoints automatically
 Upon system restart after a crash, transactions
that finished successfully prior to the crash are
redone, and those that were not complete prior
to the crash are undone
 REDO and UNDO logs
Media Recovery
 Disk failure can corrupt the persistent database
 The database must be restored from backup
 The transaction logs can be used to roll forward
from the backup point, to recover as much of the
recent transaction history as possible
Concurrency Control
Concurrency Problems
 In a multi-processing environment transactions
can interfere with each other
 Three concurrency problems can arise, that any
DBMS must account for and avoid:
1. Lost Updates
2. Uncommitted Dependency
3. Inconsistent Analysis
Concurrency Control

How to prevent harmful interference between transactions T1, T2, …, Tn
acting on a DB (with consistency constraints)?
=> scheduling techniques based on
- locks
- timestamps and validation

Concurrency Problems – Description
 A lost update occurs when a second transaction
reads the state of the database prior to the first
one writing a change, and then stomps on the
first one’s change with its own update
 An uncommitted dependency occurs when a
second transaction relies on a change which has
not yet been committed, which is rolled back
after the second transaction has begun
 An inconsistent analysis occurs when totals are
calculated during interleaved updates
Locking
A transaction locks a portion of the
database to prevent concurrency problems
Exclusive lock – write lock, will lock out all
other transactions
Shared lock – read lock, will lock out
writes, but allow other reads
Lock-Based Protocols
 A lock is a mechanism to control concurrent access to a
data item
 Data items can be locked in two modes :
1. exclusive (X) mode. Data item can be both read as well
as written. X-lock is requested using lock-X instruction.
2. shared (S) mode. Data item can only be read. S-lock is
requested using lock-S instruction.
 Lock requests are made to concurrency-control manager.
Transaction can proceed only after request is granted.
Lock-Based Protocols (Cont.)
 Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
 Locking as above is not sufficient to guarantee
serializability — if A and B get updated in-between the
read of A and B, the displayed sum would be wrong.
 A locking protocol is a set of rules followed by all
transactions while requesting and releasing locks.
Pitfalls of Lock-Based Protocols
 Consider the partial schedule

 Neither T3 nor T4 can make progress: executing lock-S(B)
causes T4 to wait for T3 to release its lock on B,
while executing lock-X(A) causes T3 to wait for T4 to
release its lock on A.
 Such a situation is called a deadlock.
 To handle a deadlock one of T3 or T4 must be rolled back
Deadlock Handling
 Deadlock prevention protocols ensure that the
system will never enter into a deadlock state.
Some prevention strategies :
 Require that each transaction locks all its data items
before it begins execution (predeclaration).
 Impose partial ordering of all data items and require
that a transaction can lock data items only in the
order specified by the partial order.

More Deadlock Prevention Strategies
 Following schemes use transaction timestamps for
the sake of deadlock prevention alone.
 wait-die scheme — non-preemptive
 older transaction may wait for younger one to
release data item. Younger transactions never wait
for older ones; they are rolled back instead.
 a transaction may die several times before
acquiring needed data item
 wound-wait scheme — preemptive
 older transaction wounds (forces rollback of) the
younger transaction instead of waiting for it.
Younger transactions may wait for older ones.
 may be fewer rollbacks than wait-die scheme.
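Both schemes reduce to a comparison of timestamps when a transaction requests an item that another one holds. In this sketch a smaller timestamp means an older transaction; the function names and return strings are illustrative.

```python
def wait_die(requester_ts, holder_ts):
    """Non-preemptive: an older requester waits, a younger one dies."""
    if requester_ts < holder_ts:
        return "wait"
    return "rollback requester"

def wound_wait(requester_ts, holder_ts):
    """Preemptive: an older requester wounds the holder, a younger one waits."""
    if requester_ts < holder_ts:
        return "rollback holder"
    return "wait"
```

In both schemes it is always the younger transaction that is rolled back, and a rolled-back transaction is usually restarted with its original timestamp so that it eventually becomes the oldest.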
Pitfalls of Lock-Based Protocols (Cont.)
 Starvation is also possible if concurrency
control manager is badly designed. For
example:
 A transaction may be waiting for an X-lock on an
item, while a sequence of other transactions request
and are granted an S-lock on the same item.
 The same transaction is repeatedly rolled back due
to deadlocks.
 Concurrency control manager can be designed
to prevent starvation.
The Two-Phase Locking Protocol
 This is a protocol which ensures conflict-serializable
schedules.
 Phase 1: Growing Phase
 transaction may obtain locks
 transaction may not release locks
 Phase 2: Shrinking Phase
 transaction may release locks
 transaction may not obtain locks
 The protocol assures serializability. It can be proved that
the transactions can be serialized in the order of their lock
points (i.e. the point where a transaction acquired its final
lock).
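Whether a single transaction obeys the two-phase rule can be checked by scanning its instructions: once any unlock has been seen, no further lock may be requested. A minimal sketch, assuming instructions are written as strings in the notation used earlier (lock-S, lock-X, unlock):

```python
def is_two_phase(instructions):
    """True if no lock is requested after the first unlock."""
    shrinking = False
    for op in instructions:
        if op.startswith("unlock"):
            shrinking = True          # the shrinking phase has begun
        elif op.startswith("lock") and shrinking:
            return False              # lock requested in the shrinking phase
    return True

# The transaction T2 from the locking example: it unlocks A before locking B.
t2_early_unlock = ["lock-S(A)", "read(A)", "unlock(A)",
                   "lock-S(B)", "read(B)", "unlock(B)", "display(A+B)"]
# A two-phase rewrite of the same transaction (assumed for illustration).
t2_two_phase = ["lock-S(A)", "read(A)", "lock-S(B)", "read(B)",
                "display(A+B)", "unlock(A)", "unlock(B)"]
```

The first version is not two-phase, which is why another transaction can update A and B between its two reads. The second version holds both locks until after the sum is displayed.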
Implementation of Locking
 A lock manager can be implemented as a separate
process to which transactions send lock and unlock
requests
 The lock manager replies to a lock request by sending a
lock grant message (or a message asking the
transaction to roll back, in case of a deadlock)
 The requesting transaction waits until its request is
answered
 The lock manager maintains a data-structure called a
lock table to record granted locks and pending requests
 The lock table is usually implemented as an in-memory
hash table indexed on the name of the data item being
locked
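A lock table of this kind can be sketched with two dictionaries keyed by data item name, one for granted locks and one for pending requests. This is an illustrative design, not a description of any particular DBMS: requests are served in arrival order, which also avoids starving an X-lock request behind a stream of S-lock requests, and deadlock handling is left out.

```python
from collections import defaultdict, deque

class LockManager:
    def __init__(self):
        self.granted = defaultdict(dict)    # item -> {transaction: mode}
        self.waiting = defaultdict(deque)   # item -> queue of (txn, mode)

    def _compatible(self, item, txn, mode):
        others = [m for t, m in self.granted[item].items() if t != txn]
        # S is compatible with S; X is compatible with nothing.
        return not others or (mode == "S" and all(m == "S" for m in others))

    def lock(self, txn, item, mode):
        """Grant the lock and return True, or queue the request."""
        if self._compatible(item, txn, mode) and not self.waiting[item]:
            self.granted[item][txn] = mode
            return True
        self.waiting[item].append((txn, mode))
        return False

    def unlock(self, txn, item):
        """Release the lock and return the transactions granted as a result."""
        self.granted[item].pop(txn, None)
        woken = []
        while self.waiting[item]:
            next_txn, mode = self.waiting[item][0]
            if not self._compatible(item, next_txn, mode):
                break
            self.waiting[item].popleft()
            self.granted[item][next_txn] = mode
            woken.append(next_txn)
        return woken
```

With T1 and T2 holding S-locks on A, an X request from T3 is queued, and a later S request from T4 queues behind it rather than jumping ahead.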
Remaining Topics
1. Physical database administration
2. Database security and authorization
3. Case Study (Oracle and MS SQL)
4. Parallel and Distributed Databases
(including Internet Database)
5. Object Oriented Database
6. Data Mining
