Unit 4

Fourth unit of ddbms


Chapter 20: Advanced Transaction Processing

• Remote Backup Systems
• Transaction-Processing Monitors
• High-Performance Transaction Systems
• Long-Duration Transactions
• Real-Time Transaction Systems
• Weak Levels of Consistency
• Transactional Workflows

Database Systems Concepts 20.1 Silberschatz, Korth and Sudarshan © 1997


Remote Backup Systems

• Remote backup systems provide high availability by allowing
transaction processing to continue even if the primary site is
destroyed.
• Transaction processing is faster than in a replicated distributed
system.

[Figure: the primary site sends log records over the network to the backup site.]


Remote Backup Systems (Cont.)

• The backup site must detect when the primary site has failed.
– To distinguish primary site failure from link failure, maintain
several communication links between the primary and the
remote backup.
• To take over control, the backup site first performs recovery using
its copy of the database and all the log records it has received
from the primary. Thus, completed transactions are redone
and incomplete transactions are rolled back.
• When the backup site takes over processing, it becomes the new
primary.
• To reduce delay in takeover, the backup site periodically processes
the redo log records (in effect, performing recovery from the
previous database state), performs a checkpoint, and can then
delete earlier parts of the log.


Remote Backup Systems (Cont.)

• To transfer control back to the old primary when it recovers, the
old primary must receive redo logs from the old backup and apply
all updates locally.
• Hot-spare configuration permits very fast takeover:
– Backup continually processes redo log records as they
arrive, applying the updates locally.
– When failure of the primary is detected, the backup rolls
back incomplete transactions, and is ready to process new
transactions.
Remote Backup Systems (Cont.)

• Ensure durability of updates by delaying transaction commit
until the update is logged at the backup; avoid this delay by
permitting lower degrees of durability.
• One-safe: commits as soon as the transaction’s commit log record
is written at the primary. Updates may not arrive at the backup
before it has to take over.
• Two-very-safe: commits when the transaction’s commit log record
is written at both primary and backup. Reduces availability, since
transactions cannot commit if either site fails.
• Two-safe: proceeds as in two-very-safe if both primary and
backup are active. If only the primary is active, the transaction
commits as soon as its commit log record is written at the
primary. Better availability than two-very-safe; avoids the problem
of lost transactions in one-safe.
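
The three durability levels can be read as a commit rule. The following is a minimal Python sketch of that rule, not something taken from the slides: the names `Durability` and `can_commit` and the three boolean flags are hypothetical, and a real system would derive them from log acknowledgements and failure detection.

```python
from enum import Enum


class Durability(Enum):
    ONE_SAFE = "one-safe"
    TWO_VERY_SAFE = "two-very-safe"
    TWO_SAFE = "two-safe"


def can_commit(level, logged_at_primary, logged_at_backup, backup_active):
    """Decide whether a transaction may commit (illustrative rule only).

    logged_at_primary / logged_at_backup: the commit log record has been
    written at that site. backup_active: the backup is currently reachable.
    """
    if not logged_at_primary:
        # No level allows commit before the primary has logged the record.
        return False
    if level is Durability.ONE_SAFE:
        # Commit immediately; the backup may lag behind.
        return True
    if level is Durability.TWO_VERY_SAFE:
        # Both sites must have the record, so a failed backup blocks commit.
        return backup_active and logged_at_backup
    # TWO_SAFE: behave like two-very-safe while the backup is active,
    # otherwise fall back to committing at the primary alone.
    if backup_active:
        return logged_at_backup
    return True
```

The only difference between two-safe and two-very-safe in this sketch is the final branch: with the backup down, two-safe still commits while two-very-safe does not.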

Transaction Processing Monitors

• TP monitors were initially developed as multithreaded servers to
support large numbers of terminals from a single process.
• Provide infrastructure for building and administering complex
transaction processing systems with a large number of clients
and multiple servers.
• Provide services such as:
– Presentation facilities to simplify writing user interface
applications
– Persistent queuing of client requests and server responses
– Routing of client messages to servers
– Coordination of two-phase commit when transactions
access multiple servers.
• Some commercial TP monitors: CICS from IBM, Pathway from
Tandem, Top End from NCR, and Encina from Transarc


TP Monitor Architectures

[Figure: four TP monitor architectures connecting remote clients to servers and files: (a) process-per-client model, (b) single-process model, (c) many-server, single-router model, (d) many-server, many-router model. Other labels in the figure: monitor, router, routers, servers.]


TP Monitor Architectures (Cont.)

• Process-per-client model – instead of an individual login session
per terminal, a server process communicates with the terminal,
handles authentication, and executes actions.
– Memory requirements are high
– Multitasking – high CPU overhead for context switching
between processes
• Single-process model – all remote terminals connect to a
single server process.
– Used in client-server environments
– Server process is multi-threaded; low cost for thread
switching
– No protection between applications
– Not suited for parallel or distributed databases


TP Monitor Architectures (Cont.)

• Many-server, single-router model – multiple application
server processes access a common database; clients
communicate with the application through a single
communication process that routes requests.
– Independent server processes for multiple applications
– Multithreaded server process
– Run on parallel or distributed database
• Many-server, many-router model – multiple processes
communicate with clients.
– Client communication processes interact with router
processes that route their requests to the appropriate
server.
– Controller process starts up and supervises other
processes.


Detailed Structure of a TP Monitor

[Figure: components of a TP monitor: input queue, authorization, lock manager, recovery manager, log manager, application servers, database and resource managers, output queue, network.]


Detailed Structure of a TP Monitor (Cont.)

• Queue manager handles incoming messages
• Some queue managers provide persistent or durable queueing
of messages – even if the system crashes, contents of the queue
are not lost.
• Many TP monitors provide locking, logging and recovery
services, to enable application servers to implement ACID
properties by themselves
• Durable queueing of outgoing messages is important
– application server writes message to durable queue as part
of a transaction
– once the transaction commits, the TP monitor guarantees
message is eventually delivered, regardless of crashes.
– ACID properties are thus provided even for messages sent
outside the database


Application Coordination Using TP Monitors

• A TP monitor treats each subsystem as a resource manager
that provides transactional access to some set of resources.
• The interface between the TP monitor and the resource
manager is defined by a set of transaction primitives.
• The resource manager interface is defined by the X/Open
Distributed Transaction Processing standard.
• TP monitor systems provide a transactional RPC interface to
their services; the RPC (Remote Procedure Call) mechanism
provides calls to enclose a series of RPC calls within a
transaction.
• Updates performed by an RPC are carried out within the scope
of the transaction, and can be rolled back if there is any failure.


High-Performance Transaction Systems

• High-performance hardware and parallelism help improve the
rate of transaction processing, but are insufficient to obtain
high performance:
– Disk I/O is a bottleneck: I/O time (10 milliseconds) has not
decreased at a rate comparable to the increase in
processor speeds.
– Parallel transactions may attempt to read or write the same
data item, resulting in data conflicts that reduce effective
parallelism.
• We can reduce the degree to which a database system is disk
bound by increasing the size of the database buffer.


Main-Memory Databases

• Commercial 64-bit systems can support main memories of
tens of gigabytes.
• Memory resident data allows faster processing of transactions.
• Disk-related limitations:
– Logging is a bottleneck when transaction rate is high.
Use group-commit to reduce number of output operations.
(Will study two slides ahead.)
– If the update rate for modified buffer blocks is high, the disk
data-transfer rate could become a bottleneck.
– If the system crashes, all of main memory is lost.


Main-Memory Database Optimizations

• To reduce space overheads, main-memory databases can use
structures with pointers crossing multiple pages. In disk
databases, the I/O cost to traverse multiple pages would be
excessively high.
• No need to pin buffer pages in memory before data are
accessed, since buffer pages will never be replaced.
• Design query-processing techniques to minimize space
overhead: avoid exceeding main memory limits during query
evaluation.
• Improve implementation of operations such as locking and
latching, so they do not become bottlenecks.
• Optimize recovery algorithms, since pages rarely need to be
written out to make space for other pages.


Group Commit

• Idea: instead of performing output of log records to stable
storage as soon as a transaction is ready to commit, wait until
– log buffer block is full (with log records of further
transactions), or
– a transaction has been waiting sufficiently long after being
ready to commit
before performing output of log buffer block.
• Results in fewer output operations per committed transaction,
and correspondingly a higher throughput.
• Commits are delayed until a sufficiently large group of
transactions are ready to commit, or a transaction has been waiting
long enough – leads to slightly increased response time.
• The delay is acceptable in high-performance transaction
systems since it does not take much time for a large enough
group of transactions to be ready to commit.
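
The two flush conditions can be sketched in a few lines of Python. This is an illustrative model only, assuming a hypothetical `GroupCommitLog` class in which the block capacity is counted in transactions and time is passed in explicitly; a real log manager would size the block in bytes and use a timer.

```python
class GroupCommitLog:
    """Toy model of group commit: one output operation per group."""

    def __init__(self, block_capacity, max_wait):
        self.block_capacity = block_capacity  # transactions per log block (assumed unit)
        self.max_wait = max_wait              # longest a ready transaction may wait
        self.pending = []                     # (txn_id, time it became ready to commit)
        self.flushes = []                     # each entry: txn ids written in one output

    def ready_to_commit(self, txn_id, now):
        """Record that txn_id is ready to commit at time `now`."""
        self.pending.append((txn_id, now))
        self._maybe_flush(now)

    def tick(self, now):
        """Called periodically so that a waiting transaction is not stranded."""
        self._maybe_flush(now)

    def _maybe_flush(self, now):
        if not self.pending:
            return
        block_full = len(self.pending) >= self.block_capacity
        waited_long = now - self.pending[0][1] >= self.max_wait
        if block_full or waited_long:
            # One output operation commits the whole group.
            self.flushes.append([txn for txn, _ in self.pending])
            self.pending = []


log = GroupCommitLog(block_capacity=3, max_wait=10)
log.ready_to_commit("T1", now=0)
log.ready_to_commit("T2", now=1)
log.ready_to_commit("T3", now=2)   # block full: T1, T2, T3 go out together
log.ready_to_commit("T4", now=3)
log.tick(now=5)                    # T4 has waited only 2 time units
log.tick(now=13)                   # T4 has waited 10: flushed on its own
print(log.flushes)
```

Four transactions commit with two output operations here, which is the throughput gain the slide describes; T4 pays for it with extra waiting time.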

Long-Duration Transactions

Traditional concurrency control techniques do not work well when
user interaction is required:

• Long duration: Design edit sessions are very long
• Exposure of uncommitted data: E.g., partial update to a design
• Subtasks: support partial rollbacks
• Recoverability: on crash, state should be restored even for
yet-to-be committed data, so user work is not lost
• Performance: fast response time is essential so user time is
not wasted


Long-Duration Transactions (Cont.)

• Represented as a nested transaction with atomic database
operations (read/write) at the lowest level.
• If a transaction fails, only active short-duration transactions
abort.
• Active long-duration transactions resume once any
short-duration transactions have recovered.
• The efficient management of long-duration interactive
transactions is more complex because of the long-duration
waits, and the possibility of aborts.
• Need alternatives to waits and aborts; alternative techniques
must ensure correctness without requiring serializability.


Concurrency Control

• Correctness without serializability:
– Correctness depends on the specific consistency
constraints for the database.
– Correctness depends on the properties of operations
performed by each transaction.
• Use database consistency constraints to split the database
into subdatabases on which concurrency can be managed
separately.
• Treat some operations besides read and write as fundamental
low-level operations and extend concurrency control to deal
with them.


Concurrency Control (Cont.)

A non–conflict-serializable schedule that preserves the sum of
A + B (Figure 20.4).

T1              T2
read(A)
A := A − 50
write(A)
                read(B)
                B := B − 10
                write(B)
read(B)
B := B + 50
write(B)
                read(A)
                A := A + 10
                write(A)
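
The schedule can be replayed step by step to check the claim about A + B. The sketch below is an illustration rather than part of the slides; the function name `run_schedule` and the starting values are assumptions. The schedule is not conflict serializable because T1 writes A before T2 reads it (T1 before T2), while T2 writes B before T1 reads it (T2 before T1), which puts a cycle in the precedence graph.

```python
def run_schedule(a, b):
    """Replay the interleaved schedule of T1 and T2 on initial values of A and B."""
    db = {"A": a, "B": b}

    # T1: read(A); A := A - 50; write(A)
    t1_a = db["A"]
    db["A"] = t1_a - 50

    # T2: read(B); B := B - 10; write(B)
    t2_b = db["B"]
    db["B"] = t2_b - 10

    # T1: read(B); B := B + 50; write(B)
    t1_b = db["B"]
    db["B"] = t1_b + 50

    # T2: read(A); A := A + 10; write(A)
    t2_a = db["A"]
    db["A"] = t2_a + 10

    return db


before = {"A": 100, "B": 200}   # arbitrary starting balances
after = run_schedule(before["A"], before["B"])
print(after, after["A"] + after["B"])
```

With A = 100 and B = 200 the run ends with A = 60 and B = 240, so the sum stays at 300 even though no serial order of T1 and T2 has the same conflicts.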

Nested and Multilevel Transactions

• A nested or multilevel transaction T is represented by a set
T = {t1, t2, ..., tn} of subtransactions and a partial order P on T.
• A subtransaction ti in T may abort without forcing T to abort.
Instead, T may either restart ti or simply choose not to run ti.
• If ti commits, this action does not make ti permanent (unlike
the situation in Chapter 15). Instead, ti commits to T, and may
still abort (or require compensation) if T aborts.
• An execution of T must not violate the partial order P, i.e., if an
edge ti → tj appears in the precedence graph, then tj → ti
must not be in the transitive closure of P.


Nested and Multilevel Transactions (Cont.)

• Subtransactions can themselves be nested/multilevel
transactions. Lowest level of nesting: standard read and write
operations.
• Nesting can create higher-level operations that may enhance
concurrency.
• Types of nested/multilevel transactions:
– Multilevel transaction: subtransaction of T is permitted to
release locks on completion.
– Saga: multilevel long-duration transaction.
– Nested transaction: locks held by a subtransaction ti of T
are automatically assigned to T on completion of ti.


Example of Nesting

• Rewrite transaction T1 using subtransactions T1,1 and T1,2 that
perform increment or decrement operations:
– T1 consists of
∗ T1,1, which subtracts 50 from A
∗ T1,2, which adds 50 to B
• Rewrite transaction T2 using subtransactions T2,1 and T2,2 that
perform increment or decrement operations:
– T2 consists of
∗ T2,1, which subtracts 10 from B
∗ T2,2, which adds 10 to A
• No ordering is specified on subtransactions; any execution
generates a correct result.
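
The claim that any execution order gives a correct result follows from increments and decrements commuting. A small Python check, written as a sketch with assumed starting values and a hypothetical helper `final_states`, runs the four subtransactions in every possible order:

```python
from itertools import permutations

# Each subtransaction is modeled as (data item, amount added); names follow the slide.
SUBTRANSACTIONS = {
    "T1,1": ("A", -50),
    "T1,2": ("B", +50),
    "T2,1": ("B", -10),
    "T2,2": ("A", +10),
}


def final_states(a, b):
    """Return the set of final (A, B) pairs over all orderings of the subtransactions."""
    results = set()
    for order in permutations(SUBTRANSACTIONS):
        db = {"A": a, "B": b}
        for name in order:
            item, delta = SUBTRANSACTIONS[name]
            db[item] += delta  # increment/decrement as a single atomic operation
        results.add((db["A"], db["B"]))
    return results


print(final_states(100, 200))
```

All 24 orderings end in the same state, which is why treating increment and decrement as the basic operations allows more concurrency than plain read and write.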

Compensating Transactions

• Alternative to undo operation; compensating transactions deal
with the problem of cascading rollbacks.
• Instead of undoing all changes made by the failed transaction,
action is taken to “compensate” for the failure.
• Consider a long-duration transaction Ti representing a travel
reservation, with subtransactions Ti,1, which makes airline
reservations, Ti,2, which reserves rental cars, and Ti,3, which
reserves a hotel room.
– Hotel cancels the reservation.
– Instead of undoing all of Ti, the failure of Ti,3 is
compensated for by deleting the old hotel reservation and
making a new one.
• Requires use of semantics of the failed transaction.


Implementation Issues

• For long-duration transactions to survive system crashes, we
must log not only changes to the database, but also changes
to internal system data pertaining to these transactions.
• Logging of updates is made more complex by physically large
data items (CAD design, document text); undesirable to store
both old and new values.
• Two approaches to reducing the overhead of ensuring the
recoverability of large data items:
– Operation logging. Only the operation performed on the
data item and the data-item name are stored in the log.
– Logging and shadow paging. Use logging for small
data items; use shadow paging for large data items. Only
modified pages need to be stored in duplicate.


Real-Time Transaction Systems

• In systems with real-time constraints, correctness of execution
involves both database consistency and the satisfaction of
deadlines.
– Hard – The task has zero value if it is completed after the
deadline.
– Soft – The task has diminishing value if it is completed after
the deadline.
• The wide variance of execution times for read and write
operations on disks complicates the transaction management
problem for time-constrained systems; main-memory
databases are thus often used.
• Design of a real-time system involves ensuring that enough
processing power exists to meet deadlines without requiring
excessive hardware resources.


Weak Levels of Consistency

• Use alternative notions of consistency that do not ensure
serializability, to improve performance.
• Degree-two consistency avoids cascading aborts without
necessarily ensuring serializability.
– Unlike two-phase locking, S-locks may be released at any
time, and locks may be acquired at any time.
– X-locks cannot be released until the transaction either
commits or aborts.
Example Schedule with Degree-Two Consistency

• Nonserializable schedule with degree-two consistency (Figure
20.5) where T3 reads the value of Q before and after that value
is written by T4.

T3              T4
lock-S(Q)
read(Q)
unlock(Q)
                lock-X(Q)
                read(Q)
                write(Q)
                unlock(Q)
lock-S(Q)
read(Q)
unlock(Q)


Cursor Stability

• Form of degree-two consistency designed for programs written
in general-purpose, record-oriented languages (e.g., Pascal,
C, Cobol, PL/I, Fortran).
• Rather than locking the entire relation, cursor stability ensures
that
– The tuple that is currently being processed by the iteration
is locked in shared mode.
– Any modified tuples are locked in exclusive mode until the
transaction commits.
• Used on heavily accessed relations as a means of increasing
concurrency and improving system performance.
• Use is limited to specialized situations with simple consistency
constraints.


Transactional Workflows

• Workflows are activities that involve the coordinated execution
of multiple tasks performed by different processing entities.
• With the growth of networks, and the existence of multiple
autonomous database systems, workflows provide a
convenient way of carrying out tasks that involve multiple
systems.
• Example of a workflow: delivery of an email message, which
goes through several mailer systems to reach its destination.
– Each mailer performs a task: forwarding of the mail to the
next mailer
– If a mailer cannot deliver mail, failure must be handled
semantically (delivery failure message)
• Workflows usually involve humans: e.g., loan processing or
purchase order processing


Loan Processing Workflow

[Figure: loan processing workflow with the labels customer, loan application, loan officer, verification, superior officer, accept, reject, and loan disbursement.]

• In the past, workflows were handled by creating and forwarding
paper forms
• Computerized workflows aim to automate many of the tasks.
But humans still play a role, e.g., in approving loans


Transactional Workflows (Cont.)

• Must address the following issues to computerize a workflow.
– Specification of workflows – detailing the tasks that must be
carried out and defining the execution requirements.
– Execution of workflows – execute transactions specified in
the workflow while also providing traditional database
safeguards related to the correctness of computations, data
integrity, and durability.
∗ E.g.: Loan application should not get lost even if system
fails
• Extend transaction concepts to the context of workflows.
• State of a workflow – consists of the collection of states of its
constituent tasks, and the states (i.e., values) of all variables in
the execution plan.


Workflow Specification

• Static specification of task coordination:
– Tasks and dependencies among them are defined before
the execution of the workflow starts.
– Can establish preconditions for execution of each task;
tasks are executed only when their preconditions are
satisfied.
– Define preconditions through dependencies:
∗ Execution states of other tasks.
“task ti cannot start until task tj has ended”
∗ Output values of other tasks.
“task ti can start if task tj returns a value greater than 25”
∗ External variables, that are modified by external events.
“task ti must be started within 24 hours of the completion
of task tj”
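
The three kinds of dependency can each be written as a predicate over the workflow state, and a task becomes eligible when all of its predicates hold. The Python sketch below is one possible encoding, not a notation from the slides: the state layout (`status`, `outputs`, `ended_at`, `now`, with time in hours) and the helper names are all assumptions made for illustration.

```python
def ended(task):
    """Execution-state dependency: `task` must have ended."""
    return lambda state: state["status"].get(task) == "ended"


def output_greater_than(task, threshold):
    """Output-value dependency: `task` must have returned a value above threshold."""
    return lambda state: state["outputs"].get(task, float("-inf")) > threshold


def started_within(task, hours):
    """External-variable dependency: the clock must be within `hours` of task's end."""
    def check(state):
        end = state["ended_at"].get(task)
        return end is not None and state["now"] - end <= hours
    return check


def can_start(preconditions, state):
    """A task may start only when every one of its preconditions is satisfied."""
    return all(check(state) for check in preconditions)


# Preconditions for task ti, mirroring the three quoted examples.
ti_preconditions = [
    ended("tj"),
    output_greater_than("tj", 25),
    started_within("tj", 24),
]

state = {
    "status": {"tj": "ended"},
    "outputs": {"tj": 30},
    "ended_at": {"tj": 10},   # hour at which tj completed
    "now": 20,                # current hour, set by an external clock
}
print(can_start(ti_preconditions, state))
```

Because the dependencies are fixed before execution begins, a scheduler only has to re-evaluate `can_start` when a task changes state or an external variable such as the clock changes.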

Workflow Specification (Cont.)

• Dynamic task coordination
E.g., an electronic mail routing system in which the next task to be
scheduled for a given mail message depends on the
destination address and on which intermediate routers are
functioning.
Failure-Atomicity Requirements of a Workflow

• Usual ACID transactional requirements are too
strong/unimplementable for workflow applications
• However, workflows must satisfy some limited transactional
properties that guarantee a process is not left in an
inconsistent state.
• Acceptable termination states – every execution of a workflow
will terminate in a state that satisfies the failure-atomicity
requirements defined by the designer.
– Committed – objectives of a workflow have been achieved.
– Aborted – valid termination state in which a workflow has
failed to achieve its objectives.
• A workflow must reach an acceptable termination state even in
the presence of system failures.


Execution of Workflows

Workflow management systems include:

• Scheduler – program that processes workflows by submitting
various tasks for execution, monitoring various events, and
evaluating conditions related to intertask dependencies
• Task agents – control the execution of a task by a processing
entity
• Mechanism to query the state of the workflow system.
Workflow Management System Architectures

• Centralized – a single scheduler schedules the tasks for all
concurrently executing workflows.
– used in workflow systems where the data is stored in a
central database
– easier to keep track of the state of a workflow
• Partially distributed – has one (instance of a) scheduler for
each workflow.
• Fully distributed – has no scheduler, but the task agents
coordinate their execution by communicating with each other to
satisfy task dependencies and other workflow execution
requirements.
– used in simplest workflow execution systems
– based on electronic mail


Workflow Scheduler

• Before executing a workflow, the scheduler should determine if
termination in an acceptable state can be guaranteed; if not,
the workflow should not be executed.
• Consider a workflow consisting of two tasks S1 and S2. Let the
failure-atomicity requirement be that either both or neither of
the subtransactions should be committed.
– Suppose systems executing S1 and S2 do not provide
prepared-to-commit states and S1 or S2 do not have
compensating transactions.
– It is then possible to reach a state where one
subtransaction is committed and the other aborted. Both
cannot then be brought to the same state.
– Workflow specification is unsafe, and should be rejected.
• Determination of safety by the scheduler is not possible in
general, and is usually left to the designer of the workflow.


Recovery of a Workflow

• Ensure that if a failure occurs in any of the
workflow-processing components, the workflow eventually
reaches an acceptable termination state.
• Failure-recovery routines need to restore the state information
of the scheduler at the time of failure, including the information
about the execution states of each task.
Log status information on stable storage.
• Handoff of tasks between agents should occur exactly once in
spite of failure.
Problem: Repeating handoff on recovery may lead to duplicate
execution of task; not repeating handoff may lead to task not
being executed.
Solution: Persistent messaging systems


Recovery of a Workflow (Cont.)

• Persistent messages: messages are stored in a permanent
message queue and are therefore not lost in case of failure.
– Before an agent commits, it writes to the persistent
message queue whatever messages need to be sent out.
– The persistent message system must make sure the
messages get delivered eventually if and only if the
transaction commits.
The message system needs to resend a message when the
site recovers, if the message is not known to have reached
its destination.
– Messages must be logged in stable storage at the receiving
end to detect multiple receipts of a message.
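
Resending on the sender side combined with duplicate detection on the receiver side is what yields exactly-once handoff. The sketch below models both halves in memory; the classes `PersistentQueue` and `Receiver`, the message identifiers, and the `lose_ack` flag that simulates a lost acknowledgement are all hypothetical, and the Python sets and dicts merely stand in for logs on stable storage.

```python
class Receiver:
    """Receiving end: logs message ids so repeated deliveries are ignored."""

    def __init__(self):
        self.seen_ids = set()   # stands in for a log on stable storage
        self.processed = []     # payloads actually acted upon

    def receive(self, msg_id, payload):
        if msg_id in self.seen_ids:
            return "duplicate"  # already handled: do not execute the task again
        self.seen_ids.add(msg_id)
        self.processed.append(payload)
        return "accepted"


class PersistentQueue:
    """Sending end: keeps each message until it is known to have arrived."""

    def __init__(self):
        self.outbox = {}        # msg_id -> payload, not yet acknowledged

    def enqueue(self, msg_id, payload, committed):
        # Messages of transactions that did not commit are never sent.
        if committed:
            self.outbox[msg_id] = payload

    def deliver_pending(self, receiver, lose_ack=False):
        for msg_id, payload in list(self.outbox.items()):
            receiver.receive(msg_id, payload)
            if not lose_ack:
                # Acknowledged: safe to drop from the queue.
                del self.outbox[msg_id]


queue, agent = PersistentQueue(), Receiver()
queue.enqueue("m1", "handoff task 7", committed=True)
queue.enqueue("m2", "handoff task 8", committed=False)  # sender aborted
queue.deliver_pending(agent, lose_ack=True)   # delivered, but the ack is lost
queue.deliver_pending(agent)                  # resent after recovery
print(agent.processed)
```

The second delivery of `m1` is recognized as a duplicate, so the task is handed off once even though the message was sent twice.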
