Dbms Unit 5 Part2

Chapter 21 discusses concurrency control techniques to ensure the isolation property of concurrently executing transactions, focusing on protocols that guarantee serializability. It covers two-phase locking protocols, timestamp-based protocols, and multiversion concurrency control, along with issues related to data item granularity and the use of indexes. The chapter emphasizes the importance of locking mechanisms, including binary and shared/exclusive locks, and outlines their operational rules and the need for proper management to avoid problems like deadlock and starvation.

Uploaded by

hannahvincy
Copyright
© All Rights Reserved
Chapter 21

Concurrency Control Techniques

In this chapter, we discuss a number of concurrency control techniques that are
used to ensure the noninterference or isolation property of concurrently executing
transactions. Most of these techniques ensure serializability of schedules (which
we defined in Section 20.5) using concurrency control protocols (sets of rules)
that guarantee serializability. One important set of protocols, known as two-phase
locking protocols,
employs the technique of locking data items to prevent multiple transactions from
accessing the items concurrently; a number of locking protocols are described in
Sections 21.1 and 21.3.2. Locking protocols are used in some commercial DBMSs,
but they are considered to have high overhead. Another set of concurrency control
protocols uses timestamps. A timestamp is a unique identifier for each transaction,
generated by the system. Timestamp values are generated in the same order as the
transaction start times. Concurrency control protocols that use timestamp ordering
to ensure serializability are introduced in Section 21.2. In Section 21.3, we discuss
multiversion concurrency control protocols that use multiple versions of a data
item. One multiversion protocol extends timestamp order to multiversion time-
stamp ordering (Section 21.3.1), and another extends timestamp order to two-
phase locking (Section 21.3.2). In Section 21.4, we present a protocol based on the
concept of validation or certification of a transaction after it executes its opera-
tions; these are sometimes called optimistic protocols, and they also assume that
multiple versions of a data item can exist. Also in Section 21.4, we discuss a protocol that
is based on the concept of snapshot isolation, which can utilize techniques similar
to those proposed in validation-based and multiversion methods; these protocols
are used in a number of commercial DBMSs and in certain cases are considered to
have lower overhead than locking-based protocols.


Another factor that affects concurrency control is the granularity of the data
items—that is, what portion of the database a data item represents. An item can be
as small as a single attribute (field) value or as large as a disk block, or even a whole
file or the entire database. We discuss granularity of items and a multiple granular-
ity concurrency control protocol, which is an extension of two-phase locking, in
Section 21.5. In Section 21.6, we describe concurrency control issues that arise
when indexes are used to process transactions, and in Section 21.7 we discuss some
additional concurrency control concepts. Section 21.8 summarizes the chapter.
It is sufficient to read Sections 21.1, 21.5, 21.6, and 21.7, and possibly 21.3.2, if your
main interest is an introduction to the concurrency control techniques that are
based on locking.

21.1 Two-Phase Locking Techniques for Concurrency Control

Some of the main techniques used to control concurrent execution of transactions
are based on the concept of locking data items. A lock is a variable associated with
a data item that describes the status of the item with respect to possible operations
that can be applied to it. Generally, there is one lock for each data item in the data-
base. Locks are used as a means of synchronizing the access by concurrent transac-
tions to the database items. In Section 21.1.1, we discuss the nature and types of
locks. Then, in Section 21.1.2, we present protocols that use locking to guarantee
serializability of transaction schedules. Finally, in Section 21.1.3, we describe two
problems associated with the use of locks—deadlock and starvation—and show
how these problems are handled in concurrency control protocols.

21.1.1 Types of Locks and System Lock Tables


Several types of locks are used in concurrency control. To introduce locking con-
cepts gradually, first we discuss binary locks, which are simple but are also too
restrictive for database concurrency control purposes and so are not used much.
Then we discuss shared/exclusive locks—also known as read/write locks—which
provide more general locking capabilities and are used in database locking schemes.
In Section 21.3.2, we describe an additional type of lock called a certify lock, and we
show how it can be used to improve performance of locking protocols.

Binary Locks. A binary lock can have two states or values: locked and unlocked
(or 1 and 0, for simplicity). A distinct lock is associated with each database item X.
If the value of the lock on X is 1, item X cannot be accessed by a database operation
that requests the item. If the value of the lock on X is 0, the item can be accessed
when requested, and the lock value is changed to 1. We refer to the current value
(or state) of the lock associated with item X as LOCK(X).
Two operations, lock_item and unlock_item, are used with binary locking. A
transaction requests access to an item X by first issuing a lock_item(X) operation. If
LOCK(X) = 1, the transaction is forced to wait. If LOCK(X) = 0, it is set to 1 (the
transaction locks the item) and the transaction is allowed to access item X. When
the transaction is through using the item, it issues an unlock_item(X) operation,
which sets LOCK(X) back to 0 (unlocks the item) so that X may be accessed by
other transactions. Hence, a binary lock enforces mutual exclusion on the data
item. A description of the lock_item(X) and unlock_item(X) operations is shown in
Figure 21.1.

lock_item(X):
B:  if LOCK(X) = 0                      (* item is unlocked *)
        then LOCK(X) ← 1                (* lock the item *)
    else
        begin
        wait (until LOCK(X) = 0
            and the lock manager wakes up the transaction);
        go to B
        end;
unlock_item(X):
    LOCK(X) ← 0;                        (* unlock the item *)
    if any transactions are waiting
        then wakeup one of the waiting transactions;

Figure 21.1
Lock and unlock operations for binary locks.

Notice that the lock_item and unlock_item operations must be implemented as indi-
visible units (known as critical sections in operating systems); that is, no interleav-
ing should be allowed once a lock or unlock operation is started until the operation
terminates or the transaction waits. In Figure 21.1, the wait command within the
lock_item(X) operation is usually implemented by putting the transaction in a wait-
ing queue for item X until X is unlocked and the transaction can be granted access
to it. Other transactions that also want to access X are placed in the same queue.
Hence, the wait command is considered to be outside the lock_item operation.
It is simple to implement a binary lock; all that is needed is a binary-valued variable,
LOCK, associated with each data item X in the database. In its simplest form, each
lock can be a record with three fields: <Data_item_name, LOCK, Locking_transaction>
plus a queue for transactions that are waiting to access the item. The system needs
to maintain only these records for the items that are currently locked in a lock table,
which could be organized as a hash file on the item name. Items not in the lock
table are considered to be unlocked. The DBMS has a lock manager subsystem to
keep track of and control access to locks.
If the simple binary locking scheme described here is used, every transaction must
obey the following rules:
1. A transaction T must issue the operation lock_item(X) before any
read_item(X) or write_item(X) operations are performed in T.
2. A transaction T must issue the operation unlock_item(X) after all read_item(X)
and write_item(X) operations are completed in T.
3. A transaction T will not issue a lock_item(X) operation if it already holds the
lock on item X.¹
4. A transaction T will not issue an unlock_item(X) operation unless it already
holds the lock on item X.
These rules can be enforced by the lock manager module of the DBMS. Between the
lock_item(X) and unlock_item(X) operations in transaction T, T is said to hold the
lock on item X. At most one transaction can hold the lock on a particular item.
Thus no two transactions can access the same item concurrently.
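
The binary locking scheme can be illustrated with a small Python sketch. It is not the figure's algorithm verbatim: the class name BinaryLockTable and its fields are hypothetical, waiting is modeled by returning False and queueing the transaction rather than by blocking a thread, and unlock_item hands the lock directly to the woken transaction instead of having it retry from label B. As in the text, only items that are currently locked have an entry in the table.

```python
from collections import deque


class BinaryLockTable:
    """Hypothetical lock table: records exist only for locked items."""

    def __init__(self):
        self.holder = {}    # item name -> id of the locking transaction
        self.waiting = {}   # item name -> FIFO queue of waiting transaction ids

    def lock_item(self, txn, item):
        """Return True if the lock is granted, False if txn must wait."""
        if item not in self.holder:          # LOCK(X) = 0
            self.holder[item] = txn          # LOCK(X) <- 1
            return True
        if self.holder[item] == txn:
            raise ValueError("rule 3: transaction already holds the lock")
        self.waiting.setdefault(item, deque()).append(txn)
        return False

    def unlock_item(self, txn, item):
        """Release the lock; return the transaction that is woken up, if any."""
        if self.holder.get(item) != txn:
            raise ValueError("rule 4: transaction does not hold the lock")
        del self.holder[item]                # LOCK(X) <- 0
        queue = self.waiting.get(item)
        if not queue:
            self.waiting.pop(item, None)
            return None
        woken = queue.popleft()
        if not queue:
            del self.waiting[item]
        self.holder[item] = woken            # assumption: direct hand-over
        return woken


table = BinaryLockTable()
print(table.lock_item("T1", "X"))    # True
print(table.lock_item("T2", "X"))    # False
print(table.unlock_item("T1", "X"))  # T2
```

In a real lock manager the two operations would run as critical sections; this single-threaded sketch sidesteps that requirement.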

Shared/Exclusive (or Read/Write) Locks. The preceding binary locking scheme is too
restrictive for database items because at most one transaction can
hold a lock on a given item. We should allow several transactions to access the
same item X if they all access X for reading purposes only. This is because read
operations on the same item by different transactions are not conflicting (see Section
20.4.1). However, if a transaction is to write an item X, it must have exclusive
access to X. For this purpose, a different type of lock, called a multiple-mode
lock, is used. In this scheme—called shared/exclusive or read/write locks—there
are three locking operations: read_lock(X), write_lock(X), and unlock(X). A lock
associated with an item X, LOCK(X), now has three possible states: read-locked,
write-locked, or unlocked. A read-locked item is also called share-locked because
other transactions are allowed to read the item, whereas a write-locked item is
called exclusive-locked because a single transaction exclusively holds the lock on
the item.
One method for implementing the preceding operations on a read/write lock is
to keep track of the number of transactions that hold a shared (read) lock on an
item in the lock table, as well as a list of transaction ids that hold a shared lock.
Each record in the lock table will have four fields: <Data_item_name, LOCK,
No_of_reads, Locking_transaction(s)>. The system needs to maintain lock records
only for locked items in the lock table. The value (state) of LOCK is either read-
locked or write-locked, suitably coded (if we assume no records are kept in
the lock table for unlocked items). If LOCK(X) = write-locked, the value of
Locking_transaction(s) is a single transaction that holds the exclusive (write) lock
on X. If LOCK(X) = read-locked, the value of Locking_transaction(s) is a list of one
or more transactions that hold the shared (read) lock on X. The three operations
read_lock(X), write_lock(X), and unlock(X) are described in Figure 21.2.² As before,
each of the three locking operations should be considered indivisible; no inter-
leaving should be allowed once one of the operations is started until either the
operation terminates by granting the lock or the transaction is placed in a wait-
ing queue for the item.

1 This rule may be removed if we modify the lock_item(X) operation in Figure 21.1 so that if the item is
currently locked by the requesting transaction, the lock is granted.
2 These algorithms do not allow upgrading or downgrading of locks, as described later in this section. The
reader can extend the algorithms to allow these additional operations.

read_lock(X):
B:  if LOCK(X) = “unlocked”
        then begin LOCK(X) ← “read-locked”;
            no_of_reads(X) ← 1
            end
    else if LOCK(X) = “read-locked”
        then no_of_reads(X) ← no_of_reads(X) + 1
    else begin
        wait (until LOCK(X) = “unlocked”
            and the lock manager wakes up the transaction);
        go to B
        end;
write_lock(X):
B:  if LOCK(X) = “unlocked”
        then LOCK(X) ← “write-locked”
    else begin
        wait (until LOCK(X) = “unlocked”
            and the lock manager wakes up the transaction);
        go to B
        end;
unlock(X):
    if LOCK(X) = “write-locked”
        then begin LOCK(X) ← “unlocked”;
            wakeup one of the waiting transactions, if any
            end
    else if LOCK(X) = “read-locked”
        then begin
            no_of_reads(X) ← no_of_reads(X) − 1;
            if no_of_reads(X) = 0
                then begin LOCK(X) ← “unlocked”;
                    wakeup one of the waiting transactions, if any
                    end
            end;

Figure 21.2
Locking and unlocking operations for two-mode (read/write, or shared/exclusive) locks.

When we use the shared/exclusive locking scheme, the system must enforce the
following rules:
1. A transaction T must issue the operation read_lock(X) or write_lock(X) before
any read_item(X) operation is performed in T.
2. A transaction T must issue the operation write_lock(X) before any write_item(X)
operation is performed in T.
3. A transaction T must issue the operation unlock(X) after all read_item(X) and
write_item(X) operations are completed in T.³
4. A transaction T will not issue a read_lock(X) operation if it already holds a
read (shared) lock or a write (exclusive) lock on item X. This rule may be
relaxed for downgrading of locks, as we discuss shortly.
5. A transaction T will not issue a write_lock(X) operation if it already holds a
read (shared) lock or write (exclusive) lock on item X. This rule may also be
relaxed for upgrading of locks, as we discuss shortly.
6. A transaction T will not issue an unlock(X) operation unless it already holds
a read (shared) lock or a write (exclusive) lock on item X.

3 This rule may be relaxed to allow a transaction to unlock an item, then lock it again later. However,
two-phase locking does not allow this.
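
The lock-table bookkeeping for shared/exclusive locks can be sketched in Python as follows. This is only an illustration under assumptions: the class RWLockTable and its record layout are hypothetical, no_of_reads is represented implicitly as the size of the holder set, and waiting queues are left out, so a request that would have to wait simply returns False.

```python
class RWLockTable:
    """Hypothetical lock table for read/write locks (no waiting queues)."""

    def __init__(self):
        # item -> {"mode": "read" or "write", "holders": set of txn ids}
        self.records = {}

    def read_lock(self, txn, item):
        rec = self.records.get(item)
        if rec is None:                      # unlocked
            self.records[item] = {"mode": "read", "holders": {txn}}
            return True
        if rec["mode"] == "read":            # share the read lock
            rec["holders"].add(txn)
            return True
        return False                         # write-locked: must wait

    def write_lock(self, txn, item):
        if item not in self.records:         # unlocked
            self.records[item] = {"mode": "write", "holders": {txn}}
            return True
        return False                         # read- or write-locked: must wait

    def unlock(self, txn, item):
        """Release txn's lock; return True if the item becomes unlocked."""
        rec = self.records.get(item)
        if rec is None or txn not in rec["holders"]:
            raise ValueError("rule 6: transaction holds no lock on the item")
        rec["holders"].discard(txn)
        if not rec["holders"]:               # no_of_reads reached 0, or writer left
            del self.records[item]
            return True
        return False


locks = RWLockTable()
print(locks.read_lock("T1", "X"))   # True
print(locks.read_lock("T2", "X"))   # True
print(locks.write_lock("T3", "X"))  # False
print(locks.unlock("T1", "X"))      # False
print(locks.unlock("T2", "X"))      # True
```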

Conversion (Upgrading, Downgrading) of Locks. It is desirable to relax conditions
4 and 5 in the preceding list in order to allow lock conversion; that is, a
transaction that already holds a lock on item X is allowed under certain conditions
to convert the lock from one locked state to another. For example, it is possible for
a transaction T to issue a read_lock(X) and then later to upgrade the lock by issuing
a write_lock(X) operation. If T is the only transaction holding a read lock on X at the
time it issues the write_lock(X) operation, the lock can be upgraded; otherwise, the
transaction must wait. It is also possible for a transaction T to issue a write_lock(X)
and then later to downgrade the lock by issuing a read_lock(X) operation. When
upgrading and downgrading of locks is used, the lock table must include transac-
tion identifiers in the record structure for each lock (in the locking_transaction(s)
field) to store the information on which transactions hold locks on the item. The
descriptions of the read_lock(X) and write_lock(X) operations in Figure 21.2 must be
changed appropriately to allow for lock upgrading and downgrading. We leave this
as an exercise for the reader.
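
One possible shape for the conversion check is sketched below in Python. It is one reading of the conditions stated above, not a definitive solution to the exercise; the function convert_lock and the dictionary layout of a lock record are hypothetical.

```python
def convert_lock(record, txn, target):
    """Try to convert txn's lock on one item to `target` ('read' or 'write').

    `record` is a hypothetical lock-table entry of the form
    {"mode": "read" or "write", "holders": set of transaction ids}.
    Returns True if the conversion is done, False if txn must wait.
    """
    if txn not in record["holders"]:
        raise ValueError("transaction holds no lock on this item")
    if target == record["mode"]:
        return True                      # nothing to convert
    if target == "write":                # upgrade: read-locked -> write-locked
        if record["holders"] == {txn}:   # txn is the only reader
            record["mode"] = "write"
            return True
        return False                     # other readers exist: txn must wait
    if target == "read":                 # downgrade: write-locked -> read-locked
        record["mode"] = "read"
        return True
    raise ValueError("unknown lock mode")


rec = {"mode": "read", "holders": {"T1", "T2"}}
print(convert_lock(rec, "T1", "write"))  # False
rec["holders"].discard("T2")
print(convert_lock(rec, "T1", "write"))  # True
```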
Using binary locks or read/write locks in transactions, as described earlier, does not
guarantee serializability of schedules on its own. Figure 21.3 shows an example
where the preceding locking rules are followed but a nonserializable schedule may
result. This is because in Figure 21.3(a) the items Y in T1 and X in T2 were unlocked
too early. This allows a schedule such as the one shown in Figure 21.3(c) to occur,
which is not a serializable schedule and hence gives incorrect results. To guarantee
serializability, we must follow an additional protocol concerning the positioning of
locking and unlocking operations in every transaction. The best-known protocol,
two-phase locking, is described in the next section.
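
The results quoted for Figure 21.3 can be reproduced with a short Python simulation. This is a sketch under assumptions: the helper run and the tuple encoding of operations are hypothetical, lock and unlock operations are omitted because they do not change any values, and each transaction is given a private workspace for the items it has read.

```python
def run(schedule, db):
    """Execute (txn, op, item) steps and return the final database state."""
    db = dict(db)
    workspace = {}                            # txn -> local copies of items
    for txn, op, item in schedule:
        local = workspace.setdefault(txn, {})
        if op == "read":                      # read_item(item)
            local[item] = db[item]
        elif op == "add":                     # item := X + Y on local copies
            local[item] = local["X"] + local["Y"]
        elif op == "write":                   # write_item(item)
            db[item] = local[item]
    return db


T1 = [("T1", "read", "Y"), ("T1", "read", "X"),
      ("T1", "add", "X"), ("T1", "write", "X")]
T2 = [("T2", "read", "X"), ("T2", "read", "Y"),
      ("T2", "add", "Y"), ("T2", "write", "Y")]
initial = {"X": 20, "Y": 30}

# Schedule S: T1 reads Y, then all of T2 runs, then T1 finishes.
S = T1[:1] + T2 + T1[1:]

print(run(T1 + T2, initial))  # {'X': 50, 'Y': 80}
print(run(T2 + T1, initial))  # {'X': 70, 'Y': 50}
print(run(S, initial))        # {'X': 50, 'Y': 50}
```

The interleaved result matches neither serial result, because T1 computes X from the value of Y it read before T2 updated Y.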

21.1.2 Guaranteeing Serializability by Two-Phase Locking


A transaction is said to follow the two-phase locking protocol if all locking
operations (read_lock, write_lock) precede the first unlock operation in the transaction.⁴
Such a transaction can be divided into two phases: an expanding or growing
(first) phase, during which new locks on items can be acquired but none can be
released; and a shrinking (second) phase, during which existing locks can be
released but no new locks can be acquired. If lock conversion is allowed, then
upgrading of locks (from read-locked to write-locked) must be done during the
expanding phase, and downgrading of locks (from write-locked to read-locked)
must be done in the shrinking phase.

4 This is unrelated to the two-phase commit protocol for recovery in distributed databases (see Chapter 23).

(a)  T1                      T2
     read_lock(Y);           read_lock(X);
     read_item(Y);           read_item(X);
     unlock(Y);              unlock(X);
     write_lock(X);          write_lock(Y);
     read_item(X);           read_item(Y);
     X := X + Y;             Y := X + Y;
     write_item(X);          write_item(Y);
     unlock(X);              unlock(Y);

(b)  Initial values: X=20, Y=30
     Result of serial schedule T1 followed by T2: X=50, Y=80
     Result of serial schedule T2 followed by T1: X=70, Y=50

(c)  Schedule S (time runs downward):
     T1                      T2
     read_lock(Y);
     read_item(Y);
     unlock(Y);
                             read_lock(X);
                             read_item(X);
                             unlock(X);
                             write_lock(Y);
                             read_item(Y);
                             Y := X + Y;
                             write_item(Y);
                             unlock(Y);
     write_lock(X);
     read_item(X);
     X := X + Y;
     write_item(X);
     unlock(X);
     Result of schedule S: X=50, Y=50 (nonserializable)

Figure 21.3
Transactions that do not obey two-phase locking. (a) Two transactions T1 and T2.
(b) Results of possible serial schedules of T1 and T2. (c) A nonserializable
schedule S that uses locks.

Transactions T1 and T2 in Figure 21.3(a) do not follow the two-phase locking pro-
tocol because the write_lock(X) operation follows the unlock(Y) operation in T1, and
similarly the write_lock(Y) operation follows the unlock(X) operation in T2. If we
enforce two-phase locking, the transactions can be rewritten as T1′ and T2′, as
shown in Figure 21.4. Now, the schedule shown in Figure 21.3(c) is not permitted
for T1′ and T2′ (with their modified order of locking and unlocking operations)
under the rules of locking described in Section 21.1.1 because T1′ will issue its
write_lock(X) before it unlocks item Y; consequently, when T2′ issues its read_lock(X),
it is forced to wait until T1′ releases the lock by issuing an unlock (X) in the schedule.
However, this can lead to deadlock (see Section 21.1.3).

     T1′                     T2′
     read_lock(Y);           read_lock(X);
     read_item(Y);           read_item(X);
     write_lock(X);          write_lock(Y);
     unlock(Y);              unlock(X);
     read_item(X);           read_item(Y);
     X := X + Y;             Y := X + Y;
     write_item(X);          write_item(Y);
     unlock(X);              unlock(Y);

Figure 21.4
Transactions T1′ and T2′, which are the same as T1 and T2 in Figure 21.3 but
follow the two-phase locking protocol. Note that they can produce a deadlock.
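
Whether a single transaction obeys the two-phase rule can be checked mechanically: no lock request may appear after the first unlock. The Python sketch below illustrates this for T1 and T1′; the function name follows_2pl and the encoding of a transaction as a list of operation names are assumptions made for the example.

```python
def follows_2pl(ops):
    """Return True if no read_lock/write_lock follows the first unlock."""
    shrinking = False                     # becomes True at the first unlock
    for op in ops:
        if op == "unlock":
            shrinking = True
        elif op in ("read_lock", "write_lock") and shrinking:
            return False                  # a lock requested in the shrinking phase
    return True


# T1 from Figure 21.3(a): unlock(Y) comes before write_lock(X).
T1 = ["read_lock", "read_item", "unlock", "write_lock",
      "read_item", "write_item", "unlock"]
# T1' from Figure 21.4: write_lock(X) is moved ahead of unlock(Y).
T1_prime = ["read_lock", "read_item", "write_lock", "unlock",
            "read_item", "write_item", "unlock"]

print(follows_2pl(T1))        # False
print(follows_2pl(T1_prime))  # True
```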

It can be proved that, if every transaction in a schedule follows the two-phase lock-
ing protocol, the schedule is guaranteed to be serializable, obviating the need to test
for serializability of schedules. The locking protocol, by enforcing two-phase lock-
ing rules, also enforces serializability.
Two-phase locking may limit the amount of concurrency that can occur in a sched-
ule because a transaction T may not be able to release an item X after it is through
using it if T must lock an additional item Y later; or, conversely, T must lock the
additional item Y before it needs it so that it can release X. Hence, X must remain
locked by T until all items that the transaction needs to read or write have been
locked; only then can X be released by T. Meanwhile, another transaction seeking to
access X may be forced to wait, even though T is done with X; conversely, if Y is
locked earlier than it is needed, another transaction seeking to access Y is forced to
wait even though T is not using Y yet. This is the price for guaranteeing serializabil-
ity of all schedules without having to check the schedules themselves.
Although the two-phase locking protocol guarantees serializability (that is, every
schedule that is permitted is serializable), it does not permit all possible serializable
schedules (that is, some serializable schedules will be prohibited by the protocol).

Basic, Conservative, Strict, and Rigorous Two-Phase Locking. There are a number of
variations of two-phase locking (2PL). The technique just described is
known as basic 2PL. A variation known as conservative 2PL (or static 2PL)
requires a transaction to lock all the items it accesses before the transaction begins
execution, by predeclaring its read-set and write-set. Recall from Section 20.1.2 that
the read-set of a transaction is the set of all items that the transaction reads, and the
write-set is the set of all items that it writes. If any of the predeclared items needed
cannot be locked, the transaction does not lock any item; instead, it waits until all
the items are available for locking. Conservative 2PL is a deadlock-free protocol, as
we will see in Section 21.1.3 when we discuss the deadlock problem. However, it is
difficult to use in practice because of the need to predeclare the read-set and write-
set, which is not possible in some situations.
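
The all-or-nothing acquisition step of conservative 2PL might look like the Python sketch below. It is only an illustration: the function try_lock_all and the representation of the lock table as a dictionary from item to (mode, holders) are hypothetical, and a False result stands for "lock nothing and wait".

```python
def try_lock_all(locks, txn, read_set, write_set):
    """Lock every predeclared item for txn, or none of them.

    `locks` maps item -> (mode, set of holders); unlocked items are absent.
    Returns True if all locks were granted, False if txn must wait.
    """
    write_set = set(write_set)
    read_only = set(read_set) - write_set
    # First pass: check that every request is compatible with current locks.
    for item in write_set:
        if item in locks:                       # any existing lock conflicts
            return False
    for item in read_only:
        if item in locks and locks[item][0] == "write":
            return False
    # Second pass: nothing conflicts, so acquire everything.
    for item in write_set:
        locks[item] = ("write", {txn})
    for item in read_only:
        _, holders = locks.get(item, ("read", set()))
        locks[item] = ("read", holders | {txn})
    return True


locks = {}
print(try_lock_all(locks, "T1", {"Y"}, {"X"}))  # True
print(try_lock_all(locks, "T2", {"X"}, {"Y"}))  # False
```

Because the check completes before any lock is taken, a transaction never holds some of its items while waiting for others.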
In practice, the most popular variation of 2PL is strict 2PL, which guarantees strict
schedules (see Section 20.4). In this variation, a transaction T does not release any
of its exclusive (write) locks until after it commits or aborts. Hence, no other trans-
action can read or write an item that is written by T unless T has committed, lead-
ing to a strict schedule for recoverability. Strict 2PL is not deadlock-free. A more
restrictive variation of strict 2PL is rigorous 2PL, which also guarantees strict
schedules. In this variation, a transaction T does not release any of its locks (exclu-
sive or shared) until after it commits or aborts, and so it is easier to implement
than strict 2PL.
Notice the difference between strict and rigorous 2PL: the former holds write locks
until it commits or aborts, whereas the latter holds all locks (read and write). Also, the
difference between conservative and rigorous 2PL is that the former must lock all its
items before it starts, so once the transaction starts it is in its shrinking phase; the
latter does not unlock any of its items until after it terminates (by committing or
aborting), so the transaction is in its expanding phase until it ends.
Usually the concurrency control subsystem itself is responsible for generating
the read_lock and write_lock requests. For example, suppose the system is to enforce
the strict 2PL protocol. Then, whenever transaction T issues a read_item(X), the
system calls the read_lock(X) operation on behalf of T. If the state of LOCK(X) is
write_locked by some other transaction T′, the system places T in the waiting queue
for item X; otherwise, it grants the read_lock(X) request and permits the read_item(X)
operation of T to execute. On the other hand, if transaction T issues a write_item(X),
the system calls the write_lock(X) operation on behalf of T. If the state of LOCK(X) is
write_locked or read_locked by some other transaction T′, the system places T in
the waiting queue for item X; if the state of LOCK(X) is read_locked and T itself is
the only transaction holding the read lock on X, the system upgrades the lock to
write_locked and permits the write_item(X) operation by T. Finally, if the state of
LOCK(X) is unlocked, the system grants the write_lock(X) request and permits the
write_item(X) operation to execute. After each action, the system must update its
lock table appropriately.
Locking is generally considered to have a high overhead, because every read or
write operation is preceded by a system locking request. The use of locks can also
cause two additional problems: deadlock and starvation. We discuss these problems
and their solutions in the next section.

21.1.3 Dealing with Deadlock and Starvation


Deadlock occurs when each transaction T in a set of two or more transactions is
waiting for some item that is locked by some other transaction T′ in the set. Hence,
each transaction in the set is in a waiting queue, waiting for one of the other trans-
actions in the set to release the lock on an item. But because the other transaction is
also waiting, it will never release the lock. A simple example is shown in Fig-
ure 21.5(a), where the two transactions T1′ and T2′ are deadlocked in a partial
schedule; T1′ is in the waiting queue for X, which is locked by T2′, whereas T2′ is in
the waiting queue for Y, which is locked by T1′. Meanwhile, neither T1′ nor T2′ nor
any other transaction can access items X and Y.

(a)  Partial schedule (time runs downward):
     T1′                     T2′
     read_lock(Y);
     read_item(Y);
                             read_lock(X);
                             read_item(X);
     write_lock(X);
                             write_lock(Y);

(b)  Wait-for graph:
     T1′ → T2′ (edge labeled X: T1′ waits for X, held by T2′)
     T2′ → T1′ (edge labeled Y: T2′ waits for Y, held by T1′)

Figure 21.5
Illustrating the deadlock problem. (a) A partial schedule of T1′ and T2′ that is
in a state of deadlock. (b) A wait-for graph for the partial schedule in (a).

Deadlock Prevention Protocols. One way to prevent deadlock is to use a deadlock
prevention protocol.⁵ One deadlock prevention protocol, which is used in conservative
two-phase locking, requires that every transaction lock all the items it needs in
advance (which is generally not a practical assumption)—if any of the items cannot be
obtained, none of the items are locked. Rather, the transaction waits and then tries
again to lock all the items it needs. Obviously, this solution further limits concurrency.
A second protocol, which also limits concurrency, involves ordering all the items in the
database and making sure that a transaction that needs several items will lock them
according to that order. This requires that the programmer (or the system) is aware of
the chosen order of the items, which is also not practical in the database context.
A number of other deadlock prevention schemes have been proposed that make a
decision about what to do with a transaction involved in a possible deadlock situation:
Should it be blocked and made to wait or should it be aborted, or should the transac-
tion preempt and abort another transaction? Some of these techniques use the concept
of transaction timestamp TS(T), which is a unique identifier assigned to each transaction.
The timestamps are typically based on the order in which transactions are
started; hence, if transaction T1 starts before transaction T2, then TS(T1) < TS(T2).
Notice that the older transaction (which starts first) has the smaller timestamp value.
Two schemes that prevent deadlock are called wait-die and wound-wait. Suppose that
transaction Ti tries to lock an item X but is not able to because X is locked by some
other transaction Tj with a conflicting lock. The rules followed by these schemes are:
■ Wait-die. If TS(Ti) < TS(Tj), then (Ti older than Tj) Ti is allowed to wait;
otherwise (Ti younger than Tj) abort Ti (Ti dies) and restart it later with the
same timestamp.
■ Wound-wait. If TS(Ti) < TS(Tj), then (Ti older than Tj) abort Tj (Ti wounds
Tj) and restart it later with the same timestamp; otherwise (Ti younger than
Tj) Ti is allowed to wait.

5 These protocols are not generally used in practice, either because of unrealistic assumptions or
because of their possible overhead. Deadlock detection and timeouts (covered in the following sections)
are more practical.
In wait-die, an older transaction is allowed to wait for a younger transaction, whereas
a younger transaction requesting an item held by an older transaction is aborted and
restarted. The wound-wait approach does the opposite: A younger transaction is
allowed to wait for an older one, whereas an older transaction requesting an item held
by a younger transaction preempts the younger transaction by aborting it. Both
schemes end up aborting the younger of the two transactions (the transaction that
started later) that may be involved in a deadlock, assuming that this will waste less
processing. It can be shown that these two techniques are deadlock-free, since in wait-
die, transactions only wait for younger transactions so no cycle is created. Similarly, in
wound-wait, transactions only wait for older transactions so no cycle is created. How-
ever, both techniques may cause some transactions to be aborted and restarted need-
lessly, even though those transactions may never actually cause a deadlock.
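
The two timestamp-based rules reduce to a single comparison each, as the Python sketch below shows. The function names and the returned strings are hypothetical; ts_requester stands for TS(Ti), the transaction requesting the lock, and ts_holder for TS(Tj), the transaction holding it. A smaller timestamp means an older transaction.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits; a younger requester dies."""
    if ts_requester < ts_holder:          # Ti older than Tj
        return "wait"
    return "abort requester"              # Ti younger: restart later, same timestamp


def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds the holder; a younger one waits."""
    if ts_requester < ts_holder:          # Ti older than Tj
        return "abort holder"             # Tj restarts later, same timestamp
    return "wait"


# Ti has timestamp 5 (older), Tj has timestamp 9 (younger).
print(wait_die(5, 9))     # wait
print(wait_die(9, 5))     # abort requester
print(wound_wait(5, 9))   # abort holder
print(wound_wait(9, 5))   # wait
```

In all four cases the transaction that gets aborted, if any, is the younger one.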
Another group of protocols that prevent deadlock do not require timestamps.
These include the no waiting (NW) and cautious waiting (CW) algorithms. In the
no waiting algorithm, if a transaction is unable to obtain a lock, it is immediately
aborted and then restarted after a certain time delay without checking whether a
deadlock will actually occur or not. In this case, no transaction ever waits, so no
deadlock will occur. However, this scheme can cause transactions to abort and
restart needlessly. The cautious waiting algorithm was proposed to try to reduce
the number of needless aborts/restarts. Suppose that transaction Ti tries to lock an
item X but is not able to do so because X is locked by some other transaction Tj with
a conflicting lock. The cautious waiting rule is as follows:
■ Cautious waiting. If Tj is not blocked (not waiting for some other locked
item), then Ti is blocked and allowed to wait; otherwise abort Ti.
It can be shown that cautious waiting is deadlock-free, because no transaction will
ever wait for another blocked transaction. By considering the time b(T) at which
each blocked transaction T was blocked, if the two transactions Ti and Tj above both
become blocked and Ti is waiting for Tj, then b(Ti) < b(Tj), since Ti can only wait for
Tj at a time when Tj is not blocked itself. Hence, the blocking times form a total
ordering on all blocked transactions, so no cycle that causes deadlock can occur.
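
The cautious waiting rule can be sketched in Python as a check against the set of currently blocked transactions. The function name, the returned strings, and the use of a plain set to track blocked transactions are assumptions made for this illustration.

```python
def cautious_waiting(requester, holder, blocked):
    """Apply the cautious waiting rule for a conflicting lock request.

    `blocked` is the set of transactions currently waiting for some item.
    The requester is added to it if it is allowed to wait.
    """
    if holder not in blocked:             # Tj is not blocked
        blocked.add(requester)            # Ti is blocked and waits
        return "wait"
    return "abort requester"              # Tj is itself waiting: abort Ti


blocked = set()
print(cautious_waiting("T1", "T2", blocked))  # wait
print(cautious_waiting("T3", "T1", blocked))  # abort requester
```

In the second call T1 is already blocked, so T3 is aborted rather than being allowed to wait behind it.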

Deadlock Detection. An alternative approach to dealing with deadlock is
deadlock detection, where the system checks if a state of deadlock actually exists.
This solution is attractive if we know there will be little interference among the
transactions—that is, if different transactions will rarely access the same items at
the same time. This can happen if the transactions are short and each transaction
locks only a few items, or if the transaction load is light. On the other hand, if trans-
actions are long and each transaction uses many items, or if the transaction load is
heavy, it may be advantageous to use a deadlock prevention scheme.
A simple way to detect a state of deadlock is for the system to construct and main-
tain a wait-for graph. One node is created in the wait-for graph for each transac-
tion that is currently executing. Whenever a transaction Ti is waiting to lock an
item X that is currently locked by a transaction Tj, a directed edge (Ti → Tj) is cre-
ated in the wait-for graph. When Tj releases the lock(s) on the items that Ti was
waiting for, the directed edge is dropped from the wait-for graph. We have a state of
deadlock if and only if the wait-for graph has a cycle. One problem with this
approach is the matter of determining when the system should check for a dead-
lock. One possibility is to check for a cycle every time an edge is added to the wait-
for graph, but this may cause excessive overhead. Criteria such as the number of
currently executing transactions or the period of time several transactions have
been waiting to lock items may be used instead to decide when to check for a cycle. Figure 21.5(b)
shows the wait-for graph for the (partial) schedule shown in Figure 21.5(a).
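As a rough sketch of the wait-for graph described above, the following keeps one adjacency set per waiting transaction and looks for a cycle with a depth-first search. The class and method names (`WaitForGraph`, `add_wait`, `remove_wait`, `has_cycle`) are hypothetical, and a real system would also decide when to run the check.

```python
from collections import defaultdict


class WaitForGraph:
    def __init__(self):
        self.edges = defaultdict(set)  # waiter -> transactions it is waiting for

    def add_wait(self, waiter, holder):
        """Record the edge waiter -> holder."""
        self.edges[waiter].add(holder)

    def remove_wait(self, waiter, holder):
        """Drop the edge once holder releases the lock waiter needed."""
        self.edges[waiter].discard(holder)

    def has_cycle(self):
        """Return True if the graph contains a cycle (a state of deadlock)."""
        WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
        color = defaultdict(int)

        def visit(node):
            color[node] = GRAY
            for nxt in self.edges.get(node, ()):
                if color[nxt] == GRAY:          # back edge: cycle found
                    return True
                if color[nxt] == WHITE and visit(nxt):
                    return True
            color[node] = BLACK
            return False

        return any(color[n] == WHITE and visit(n) for n in list(self.edges))


g = WaitForGraph()
g.add_wait("T1", "T2")
no_deadlock = g.has_cycle()    # only T1 -> T2: no cycle
g.add_wait("T2", "T1")
deadlock = g.has_cycle()       # T1 -> T2 -> T1: cycle
g.remove_wait("T2", "T1")
resolved = not g.has_cycle()   # edge dropped: cycle gone
```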
If the system is in a state of deadlock, some of the transactions causing the deadlock
must be aborted. Choosing which transactions to abort is known as victim
selection. The algorithm for victim selection should generally avoid selecting trans-
actions that have been running for a long time and that have performed many
updates, and it should try instead to select transactions that have not made many
changes (younger transactions).

Timeouts. Another simple scheme to deal with deadlock is the use of timeouts.
This method is practical because of its low overhead and simplicity. In this method,
if a transaction waits for a period longer than a system-defined timeout period, the
system assumes that the transaction may be deadlocked and aborts it—regardless of
whether a deadlock actually exists.
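The timeout scheme can be illustrated with Python's standard `threading.Lock`, whose `acquire` method accepts a `timeout` argument. The `LockTimeout` exception and the `lock_item` helper are names assumed for this sketch; treating a timeout as an abort is the policy described above, not something the library does by itself.

```python
import threading


class LockTimeout(Exception):
    """Raised when a lock request waits longer than the timeout period."""


def lock_item(lock, timeout_seconds):
    """Wait for the lock up to the timeout, then give up.

    The caller is expected to abort the transaction on LockTimeout,
    whether or not a deadlock actually exists.
    """
    if not lock.acquire(timeout=timeout_seconds):
        raise LockTimeout("waited longer than the timeout period")


item_lock = threading.Lock()
item_lock.acquire()                 # some other transaction holds the item
try:
    lock_item(item_lock, 0.05)
    outcome = "locked"
except LockTimeout:
    outcome = "aborted"             # presumed deadlocked, so abort
```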

Starvation. Another problem that may occur when we use locking is starvation,
which occurs when a transaction cannot proceed for an indefinite period of time
while other transactions in the system continue normally. This may occur if the
waiting scheme for locked items is unfair in that it gives priority to some transac-
tions over others. One solution for starvation is to have a fair waiting scheme, such
as using a first-come-first-served queue; transactions are enabled to lock an item
in the order in which they originally requested the lock. Another scheme allows
some transactions to have priority over others but increases the priority of a trans-
action the longer it waits, until it eventually gets the highest priority and proceeds.
Starvation can also occur because of victim selection if the algorithm selects the
same transaction as victim repeatedly, thus causing it to abort and never finish exe-
cution. The algorithm can use higher priorities for transactions that have been
aborted multiple times to avoid this problem. The wait-die and wound-wait
schemes discussed previously avoid starvation, because they restart a transaction
that has been aborted with its same original timestamp, so the possibility that the
same transaction is aborted repeatedly is slim.
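The priority-raising scheme can be sketched as follows. The linear formula (base priority plus waiting time multiplied by `aging_rate`) and the function name `next_to_lock` are assumptions chosen for illustration; the text only requires that priority grow the longer a transaction waits.

```python
def next_to_lock(waiters, now, aging_rate=1.0):
    """Pick the waiting transaction with the highest effective priority.

    `waiters` is a list of (name, base_priority, request_time) tuples.
    Effective priority grows with the time spent waiting.
    """
    return max(waiters, key=lambda w: w[1] + aging_rate * (now - w[2]))[0]


# Shortly after both requests, the high-priority transaction wins.
early = next_to_lock([("T_low", 1, 0.0), ("T_high", 5, 1.0)], now=2.0)
# After T_low has waited long enough, it overtakes a newer high-priority request.
late = next_to_lock([("T_low", 1, 0.0), ("T_new", 5, 9.0)], now=10.0)
```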

21.2 Concurrency Control Based on Timestamp Ordering
The use of locking, combined with the 2PL protocol, guarantees serializability of
schedules. The serializable schedules produced by 2PL have their equivalent serial
schedules based on the order in which executing transactions lock the items they
acquire. If a transaction needs an item that is already locked, it may be forced to
wait until the item is released. Some transactions may be aborted and restarted
because of the deadlock problem. A different approach to concurrency control
involves using transaction timestamps to order transaction execution for an
equivalent serial schedule. In Section 21.2.1, we discuss timestamps; and in Section 21.2.2,
we discuss how serializability is enforced by ordering conflicting operations in dif-
ferent transactions based on the transaction timestamps.

21.2.1 Timestamps
Recall that a timestamp is a unique identifier created by the DBMS to identify a
transaction. Typically, timestamp values are assigned in the order in which the
transactions are submitted to the system, so a timestamp can be thought of as the
transaction start time. We will refer to the timestamp of transaction T as TS(T).
Concurrency control techniques based on timestamp ordering do not use locks;
hence, deadlocks cannot occur.
Timestamps can be generated in several ways. One possibility is to use a counter that
is incremented each time its value is assigned to a transaction. The transaction time-
stamps are numbered 1, 2, 3, … in this scheme. A computer counter has a finite
maximum value, so the system must periodically reset the counter to zero when no
transactions are executing for some short period of time. Another way to implement
timestamps is to use the current date/time value of the system clock and ensure that
no two timestamp values are generated during the same tick of the clock.
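A counter-based generator might look like the sketch below. The class name `TimestampGenerator` is hypothetical. Python integers do not overflow, so the periodic reset mentioned above is not needed here; a fixed-width counter would still require it.

```python
import itertools
import threading


class TimestampGenerator:
    """Hands out the timestamps 1, 2, 3, ... with no value issued twice."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()   # serialize concurrent requests

    def next_timestamp(self):
        with self._lock:
            return next(self._counter)


gen = TimestampGenerator()
issued = [gen.next_timestamp() for _ in range(3)]
```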

21.2.2 The Timestamp Ordering Algorithm for Concurrency Control
The idea for this scheme is to enforce the equivalent serial order on the transac-
tions based on their timestamps. A schedule in which the transactions participate
is then serializable, and the only equivalent serial schedule permitted has the trans-
actions in order of their timestamp values. This is called timestamp ordering
(TO). Notice how this differs from 2PL, where a schedule is serializable by being
equivalent to some serial schedule allowed by the locking protocols. In timestamp
ordering, however, the schedule is equivalent to the particular serial order corre-
sponding to the order of the transaction timestamps. The algorithm allows inter-
leaving of transaction operations, but it must ensure that for each pair of conflicting
operations in the schedule, the order in which the item is accessed must follow the
timestamp order. To do this, the algorithm associates with each database item X
two timestamp (TS) values:
1. read_TS(X). The read timestamp of item X is the largest timestamp
among all the timestamps of transactions that have successfully read item
X—that is, read_TS(X) = TS(T), where T is the youngest transaction that
has read X successfully.
2. write_TS(X). The write timestamp of item X is the largest of all the time-
stamps of transactions that have successfully written item X—that is,
write_TS(X) = TS(T), where T is the youngest transaction that has written
X successfully. Based on the algorithm, T will also be the last transaction
to write item X, as we shall see.

Basic Timestamp Ordering (TO). Whenever some transaction T tries to issue a
read_item(X) or a write_item(X) operation, the basic TO algorithm compares the
timestamp of T with read_TS(X) and write_TS(X) to ensure that the timestamp order
of transaction execution is not violated. If this order is violated, then transaction T
is aborted and resubmitted to the system as a new transaction with a new time-
stamp. If T is aborted and rolled back, any transaction T1 that may have used a value
written by T must also be rolled back. Similarly, any transaction T2 that may have
used a value written by T1 must also be rolled back, and so on. This effect is known
as cascading rollback and is one of the problems associated with basic TO, since
the schedules produced are not guaranteed to be recoverable. An additional proto-
col must be enforced to ensure that the schedules are recoverable, cascadeless, or
strict. We first describe the basic TO algorithm here. The concurrency control algo-
rithm must check whether conflicting operations violate the timestamp ordering in
the following two cases:
1. Whenever a transaction T issues a write_item(X) operation, the following
check is performed:
a. If read_TS(X) > TS(T) or if write_TS(X) > TS(T), then abort and roll back T
and reject the operation. This should be done because some younger trans-
action with a timestamp greater than TS(T)—and hence after T in the
timestamp ordering—has already read or written the value of item X
before T had a chance to write X, thus violating the timestamp ordering.
b. If the condition in part (a) does not occur, then execute the write_item(X)
operation of T and set write_TS(X) to TS(T).
2. Whenever a transaction T issues a read_item(X) operation, the following
check is performed:
a. If write_TS(X) > TS(T), then abort and roll back T and reject the operation.
This should be done because some younger transaction with timestamp
greater than TS(T)—and hence after T in the timestamp ordering—has
already written the value of item X before T had a chance to read X.
b. If write_TS(X) ≤ TS(T), then execute the read_item(X) operation of T and
set read_TS(X) to the larger of TS(T) and the current read_TS(X).
Whenever the basic TO algorithm detects two conflicting operations that occur in
the incorrect order, it rejects the later of the two operations by aborting the transac-
tion that issued it. The schedules produced by basic TO are hence guaranteed to be
conflict serializable. As mentioned earlier, deadlock does not occur with timestamp
ordering. However, cyclic restart (and hence starvation) may occur if a transaction
is continually aborted and restarted.
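The two checks of basic TO can be written down directly. In this sketch the `Item` class, the `Abort` exception, and the initial timestamp of 0 (assumed smaller than every transaction timestamp) are illustrative assumptions; rollback, restart with a new timestamp, and cascading rollback are left out.

```python
class Abort(Exception):
    """Raised when an operation would violate the timestamp order."""


class Item:
    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0    # read_TS(X); 0 means no transaction has read X yet
        self.write_ts = 0   # write_TS(X); 0 means no transaction has written X yet


def write_item(item, ts, value):
    # Case 1(a): a younger transaction has already read or written X.
    if item.read_ts > ts or item.write_ts > ts:
        raise Abort(f"write by TS={ts} rejected")
    # Case 1(b): perform the write and record the writer's timestamp.
    item.value = value
    item.write_ts = ts


def read_item(item, ts):
    # Case 2(a): a younger transaction has already written X.
    if item.write_ts > ts:
        raise Abort(f"read by TS={ts} rejected")
    # Case 2(b): perform the read and keep the larger read timestamp.
    item.read_ts = max(item.read_ts, ts)
    return item.value


x = Item(10)
write_item(x, ts=5, value=20)      # accepted: write_TS(X) becomes 5
seen = read_item(x, ts=7)          # accepted: read_TS(X) becomes 7
try:
    write_item(x, ts=6, value=30)  # rejected: the transaction with TS 7 already read X
    late_write = "accepted"
except Abort:
    late_write = "aborted"
```

In the usage lines, the write by the transaction with timestamp 6 arrives after a read by the transaction with timestamp 7, so it is the later of the two conflicting operations and is the one rejected.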

Strict Timestamp Ordering (TO). A variation of basic TO called strict TO ensures
that the schedules are both strict (for easy recoverability) and (conflict) serializable.
In this variation, a transaction T that issues a read_item(X) or write_item(X) such that
TS(T) > write_TS(X) has its read or write operation delayed until the transaction T′
that wrote the value of X (hence TS(T′) = write_TS(X)) has committed or aborted.
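The extra condition that strict TO adds can be sketched as a gate placed in front of the basic TO checks. The function name `strict_to_gate`, the `writer_finished` flag (true once the last writer T′ has committed or aborted), and the returned strings are assumptions made for this illustration.

```python
def strict_to_gate(ts, write_ts, writer_finished):
    """Decide whether an operation on X by a transaction with timestamp `ts`
    must be delayed under strict TO.

    `write_ts` is write_TS(X); `writer_finished` is True once the transaction
    that wrote X has committed or aborted.
    """
    if ts > write_ts and not writer_finished:
        return "delay"   # wait for the last writer of X to commit or abort
    return "check"       # continue with the basic TO checks


pending = strict_to_gate(ts=8, write_ts=5, writer_finished=False)
settled = strict_to_gate(ts=8, write_ts=5, writer_finished=True)
```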
