13 Recovery
• Whatever the cause of failure, there are two main effects that must be
considered:
– The loss of main memory, including database buffers
– The loss of the disk copy of the database.
Use of REDO/UNDO
• Consider a number of concurrently executing transactions, T1…T6.
[Figure: timeline of transactions T1…T6 between the point where the DBMS starts and the point where the failure occurs.]
Use of UNDO/REDO
• Clearly, T1 and T6 had not committed at the time of the crash.
• Therefore, at restart, the recovery manager must undo all changes made by T1
and T6.
• However, it is not clear to what extent the changes made by the committed
transactions have been propagated to the database on disk.
• There is no way to know whether or not the volatile database buffers were
written (flushed) to disk before the crash occurred.
• In the absence of any other information, the recovery manager is forced to redo
(roll forward) the committed transactions T2…T5. Why?
– To ensure durability!
– However, see more, later, on this (use of checkpointing)!
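The restart rule above (undo what had not committed, redo what had) can be sketched in a few lines. This is a minimal illustration, not a real recovery manager: the log is assumed to be a list of (transaction id, record type) pairs, and the ordering of the example records is invented to match the T1…T6 scenario.

```python
from typing import Iterable, List, Set, Tuple


def classify_at_restart(log: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Split transactions into (redo, undo) lists when no checkpoint exists.

    Assumption: each log entry is (txn_id, record_type) and only the
    "start" and "commit" record types matter for the classification.
    """
    started: List[str] = []
    committed: Set[str] = set()
    for txn_id, record_type in log:
        if record_type == "start":
            started.append(txn_id)
        elif record_type == "commit":
            committed.add(txn_id)
    # Committed work must be redone to guarantee durability;
    # everything still active at the crash must be undone.
    redo = [t for t in started if t in committed]
    undo = [t for t in started if t not in committed]
    return redo, undo


# Hypothetical log for the T1…T6 example (record order is an assumption).
example_log = [
    ("T1", "start"), ("T2", "start"), ("T3", "start"), ("T2", "commit"),
    ("T4", "start"), ("T3", "commit"), ("T5", "start"), ("T4", "commit"),
    ("T6", "start"), ("T5", "commit"),
]
redo, undo = classify_at_restart(example_log)
print("redo:", redo)
print("undo:", undo)
```

Without a checkpoint the whole log has to be scanned, which is the cost that checkpointing is meant to reduce.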
– Logging facilities, which keep track of the current state of transactions and
database updates.
– A checkpoint facility, which enables updates to the database which are in
progress to be made permanent.
– A recovery manager, which allows the system to restore the database to a
consistent state following a failure.
• The backup mechanism should allow the database and the log file to be
archived at regular intervals without having to stop the system.
• The backup archive copies can be used to recover from severe failures
in which the storage media are damaged.
• The log contains information about all updates to the database.
• The log primarily contains transaction records.
• A transaction record consists of:
– A transaction identifier.
– A record type (transaction start, insert, update, delete, abort, commit).
– The identity of the affected data item (for insert, delete, and update).
– Before-image of the data item (its value before an update or delete).
– After-image of the data item (its value after an update or insert).
– Log management information.
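The fields of a transaction record listed above map naturally onto a small record type. The sketch below is one possible layout, not a prescribed format: the class and field names are hypothetical, and "log management information" is assumed here to be a pointer to the same transaction's previous record.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class RecordType(Enum):
    START = "start"
    INSERT = "insert"
    UPDATE = "update"
    DELETE = "delete"
    ABORT = "abort"
    COMMIT = "commit"


@dataclass(frozen=True)
class LogRecord:
    txn_id: str                         # transaction identifier
    record_type: RecordType             # kind of record
    item_id: Optional[str] = None       # affected data item (insert/update/delete)
    before_image: Optional[Any] = None  # value before an update or delete
    after_image: Optional[Any] = None   # value after an update or insert
    prev_record: Optional[int] = None   # assumed form of "log management information"


# An update carries both images; start and commit records carry neither.
start_rec = LogRecord("T1", RecordType.START)
update_rec = LogRecord("T1", RecordType.UPDATE, item_id="item-42",
                       before_image=100, after_image=150, prev_record=0)
commit_rec = LogRecord("T1", RecordType.COMMIT, prev_record=1)
print(update_rec)
```

Keeping both images in an update record is what allows the same log to serve for undo (before-image) and redo (after-image).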
Checkpointing-II
• Checkpoints are scheduled at pre-determined intervals and involve the
following operations:
– Writing all log records in main memory out to disk.
– […] transactions that are active at the time of the checkpoint (see previous
example)
• When a crash occurs, the recovery manager examines the log file for the
last checkpoint record.
– All transactions that have committed since (i.e. after) the last checkpoint are
redone.
– Any transactions active at the time of the crash are undone.
• Since checkpoints are relatively inexpensive, it is often possible to take
3 or 4 per hour.
– This way, no more than 15-20 minutes of work is lost.
Use of REDO/UNDO with Checkpointing
• Consider, again, the example from before:
[Figure: timeline of transactions T1…T6 showing where the DBMS starts, the checkpoint, and where the failure occurs.]
• When failure occurs, we can assume that T2 and T3 are permanently recorded.
• However, T1 and T6 must be undone (since they were active at the time of the
crash), and T4 and T5 must be redone (since they committed after the
checkpoint, i.e. their updates may not have reached the disk).
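The checkpoint-based restart described above can be sketched as a scan that begins at the last checkpoint record instead of the start of the log. This is an illustration under stated assumptions: log entries are (kind, payload) pairs, and the checkpoint record is assumed to carry the list of transactions active when it was taken. The record order in the example is invented to reproduce the T1…T6 outcome.

```python
from typing import List, Sequence, Tuple, Union

# Assumed entry shapes: ("start", txn), ("commit", txn), ("checkpoint", [active txns])
LogEntry = Tuple[str, Union[str, List[str]]]


def recover_with_checkpoint(log: Sequence[LogEntry]) -> Tuple[List[str], List[str]]:
    """Return (redo, undo) lists, scanning only from the last checkpoint."""
    last_cp = None
    for i, (kind, _) in enumerate(log):
        if kind == "checkpoint":
            last_cp = i
    if last_cp is None:
        active: List[str] = []  # no checkpoint: fall back to a full scan
        scan_from = 0
    else:
        active = list(log[last_cp][1])
        scan_from = last_cp + 1

    redo: List[str] = []
    for kind, txn in log[scan_from:]:
        if kind == "start":
            active.append(txn)
        elif kind == "commit" and txn in active:
            active.remove(txn)
            redo.append(txn)   # committed after the checkpoint
    return redo, active        # whatever is still active must be undone


# Hypothetical log: T2 and T3 commit before the checkpoint and need no work.
example_log = [
    ("start", "T1"), ("start", "T2"), ("commit", "T2"), ("start", "T3"),
    ("start", "T4"), ("commit", "T3"), ("checkpoint", ["T1", "T4"]),
    ("start", "T5"), ("commit", "T4"), ("start", "T6"), ("commit", "T5"),
]
redo, undo = recover_with_checkpoint(example_log)
print("redo:", redo)
print("undo:", undo)
```

Compared with a restart that has no checkpoint, T2 and T3 drop out of the redo list entirely, which is the saving the checkpoint buys.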
– Immediate update
Immediate Update Log Usage
• If a transaction aborts, the log can be used to undo it using the saved before-images.
– Since a transaction may change an item several times, the changes made by an aborted transaction
are undone in reverse order.
– This guarantees that the database is restored to the state before the transaction started.
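A short sketch shows why the reverse order matters when one item is changed more than once. The database is modelled as a plain dictionary and each log entry as a (transaction id, item, before-image, after-image) tuple; both are simplifying assumptions, and only update records are handled.

```python
from typing import Any, Dict, List, Tuple

# Assumed record shape: (txn_id, item_id, before_image, after_image)
UpdateRecord = Tuple[str, str, Any, Any]


def undo_transaction(db: Dict[str, Any], log: List[UpdateRecord], txn_id: str) -> None:
    """Restore before-images for txn_id, newest log record first."""
    for rec_txn, item, before, _after in reversed(log):
        if rec_txn == txn_id:
            db[item] = before


# T1 changes x twice (10 -> 20 -> 30); T2's change to y must be left alone.
log: List[UpdateRecord] = [
    ("T1", "x", 10, 20),
    ("T2", "y", 5, 7),
    ("T1", "x", 20, 30),
]
db = {"x": 30, "y": 7}

undo_transaction(db, log, "T1")
print(db)  # x is back to its value before T1 started
```

Applying the same before-images in forward order would leave x at 20, the intermediate value, rather than the value it held before T1 started.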