Dbms Unit 5 Part2
Concurrency Control
Techniques
Chapter 21 Concurrency Control Techniques
Another factor that affects concurrency control is the granularity of the data
items—that is, what portion of the database a data item represents. An item can be
as small as a single attribute (field) value or as large as a disk block, or even a whole
file or the entire database. We discuss granularity of items and a multiple granular-
ity concurrency control protocol, which is an extension of two-phase locking, in
Section 21.5. In Section 21.6, we describe concurrency control issues that arise
when indexes are used to process transactions, and in Section 21.7 we discuss some
additional concurrency control concepts. Section 21.8 summarizes the chapter.
It is sufficient to read Sections 21.1, 21.5, 21.6, and 21.7, and possibly 21.3.2, if your
main interest is an introduction to the concurrency control techniques that are
based on locking.
Binary Locks. A binary lock can have two states or values: locked and unlocked
(or 1 and 0, for simplicity). A distinct lock is associated with each database item X.
If the value of the lock on X is 1, item X cannot be accessed by a database operation
that requests the item. If the value of the lock on X is 0, the item can be accessed
when requested, and the lock value is changed to 1. We refer to the current value
(or state) of the lock associated with item X as lock(X).
Two operations, lock_item and unlock_item, are used with binary locking. A transaction requests access to an item X by first issuing a lock_item(X) operation. If LOCK(X) = 1, the transaction is forced to wait; if LOCK(X) = 0, the lock is set to 1 (the transaction locks the item) and the transaction is allowed to access X. These operations can be implemented as shown in Figure 21.1.
21.1 Two-Phase Locking Techniques for Concurrency Control
Figure 21.1 Lock and unlock operations for binary locks.

lock_item(X):
B: if LOCK(X) = 0 (* item is unlocked *)
     then LOCK(X) ← 1 (* lock the item *)
     else begin
            wait (until LOCK(X) = 0
                  and the lock manager wakes up the transaction);
            go to B
          end;

unlock_item(X):
LOCK(X) ← 0; (* unlock the item *)
if any transactions are waiting
  then wakeup one of the waiting transactions;
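The lock_item/unlock_item operations of Figure 21.1 map naturally onto a condition variable: waiting transactions block until the lock manager wakes one of them, and the woken transaction re-checks the lock state (the "go to B" loop). A minimal Python sketch, with class and method names of my own choosing rather than from any real DBMS:

```python
import threading

class BinaryLockManager:
    """Per-item binary locks: LOCK(X) is 1 (locked) or 0 (unlocked)."""

    def __init__(self):
        self._locks = {}                    # item -> 0 or 1
        self._cond = threading.Condition()  # guards the lock table, wakes waiters

    def lock_item(self, x):
        with self._cond:
            # The 'go to B' loop: re-test the lock state after every wakeup.
            while self._locks.get(x, 0) == 1:
                self._cond.wait()           # wait until the manager wakes us up
            self._locks[x] = 1              # LOCK(X) <- 1: lock the item

    def unlock_item(self, x):
        with self._cond:
            self._locks[x] = 0              # LOCK(X) <- 0: unlock the item
            self._cond.notify()             # wake up one waiting transaction

mgr = BinaryLockManager()
mgr.lock_item('X')      # granted immediately, since LOCK(X) was 0
mgr.unlock_item('X')    # releases the item and wakes a waiter, if any
```

Note the `while` (not `if`) around `wait()`: another transaction may grab the lock between the wakeup and the re-check, exactly as in the pseudocode's jump back to B.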
Footnote 1: This rule may be removed if we modify the lock_item(X) operation in Figure 21.1 so that if the item is currently locked by the requesting transaction, the lock is granted.

Footnote 2: These algorithms do not allow upgrading or downgrading of locks, as described later in this section. The reader can extend the algorithms to allow these additional operations.
Figure 21.2 Locking and unlocking operations for two-mode (read/write, or shared/exclusive) locks.

read_lock(X):
B: if LOCK(X) = "unlocked"
     then begin LOCK(X) ← "read-locked";
            no_of_reads(X) ← 1
          end
   else if LOCK(X) = "read-locked"
     then no_of_reads(X) ← no_of_reads(X) + 1
   else begin
          wait (until LOCK(X) = "unlocked"
                and the lock manager wakes up the transaction);
          go to B
        end;

write_lock(X):
B: if LOCK(X) = "unlocked"
     then LOCK(X) ← "write-locked"
   else begin
          wait (until LOCK(X) = "unlocked"
                and the lock manager wakes up the transaction);
          go to B
        end;

unlock(X):
if LOCK(X) = "write-locked"
  then begin LOCK(X) ← "unlocked";
         wakeup one of the waiting transactions, if any
       end
else if LOCK(X) = "read-locked"
  then begin
         no_of_reads(X) ← no_of_reads(X) − 1;
         if no_of_reads(X) = 0
           then begin LOCK(X) ← "unlocked";
                  wakeup one of the waiting transactions, if any
                end
       end;
When we use the shared/exclusive locking scheme, the system must enforce the
following rules:
1. A transaction T must issue the operation read_lock(X) or write_lock(X) before
any read_item(X) operation is performed in T.
2. A transaction T must issue the operation write_lock(X) before any write_item(X)
operation is performed in T.
3. A transaction T must issue the operation unlock(X) after all read_item(X) and
write_item(X) operations are completed in T.3
Footnote 3: This rule may be relaxed to allow a transaction to unlock an item, then lock it again later. However, two-phase locking does not allow this.
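The three operations of Figure 21.2 can be sketched as a small lock manager in Python. This is an illustration under the same simplifications as the pseudocode (no upgrading or downgrading, and a single condition variable wakes waiters); the class name and data layout are my own:

```python
import threading

class RWLockManager:
    """Shared/exclusive locks per item, following the logic of Figure 21.2."""

    def __init__(self):
        self._state = {}     # item -> "unlocked" | "read-locked" | "write-locked"
        self._readers = {}   # item -> no_of_reads(X)
        self._cond = threading.Condition()

    def read_lock(self, x):
        with self._cond:
            # A read lock waits only while the item is write-locked.
            while self._state.get(x, "unlocked") == "write-locked":
                self._cond.wait()
            if self._state.get(x, "unlocked") == "unlocked":
                self._state[x] = "read-locked"
                self._readers[x] = 1
            else:                          # already read-locked: share it
                self._readers[x] += 1

    def write_lock(self, x):
        with self._cond:
            # A write lock waits until the item is completely unlocked.
            while self._state.get(x, "unlocked") != "unlocked":
                self._cond.wait()
            self._state[x] = "write-locked"

    def unlock(self, x):
        with self._cond:
            if self._state.get(x) == "write-locked":
                self._state[x] = "unlocked"
                self._cond.notify()
            elif self._state.get(x) == "read-locked":
                self._readers[x] -= 1      # one reader done
                if self._readers[x] == 0:  # last reader releases the item
                    self._state[x] = "unlocked"
                    self._cond.notify()

mgr = RWLockManager()
mgr.read_lock('X'); mgr.read_lock('X')   # two transactions share the read lock
mgr.unlock('X'); mgr.unlock('X')         # no_of_reads drops to 0, X is unlocked
```

As in the pseudocode, `no_of_reads(X)` is what lets multiple read-locks coexist while a single write-lock excludes everyone.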
Figure 21.3 Transactions that do not obey two-phase locking. (a) Two transactions T1 and T2. (b) Results of possible serial schedules of T1 and T2. (c) A nonserializable schedule S that uses locks.

(c) Schedule S (time flows downward; result: X = 50, Y = 50, nonserializable):

T1: read_lock(Y);
T1: read_item(Y);
T1: unlock(Y);
T2: read_lock(X);
T2: read_item(X);
T2: unlock(X);
T2: write_lock(Y);
T2: read_item(Y);
T2: Y := X + Y;
T2: write_item(Y);
T2: unlock(Y);
T1: write_lock(X);
T1: read_item(X);
T1: X := X + Y;
T1: write_item(X);
T1: unlock(X);
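The outcome of schedule S can be reproduced by simulating it directly, with each transaction keeping local copies of the items it has read. The initial values X = 20 and Y = 30 are an assumption chosen so the arithmetic matches the stated result:

```python
# Simulate schedule S from Figure 21.3(c).
# Initial values are assumed (picked to reproduce the figure's X=50, Y=50).
db = {'X': 20, 'Y': 30}

# T1 reads Y, then releases it early (violating two-phase locking)
t1_y = db['Y']            # T1: read_item(Y)  -> 30

# T2 runs to completion in the middle of T1
t2_x = db['X']            # T2: read_item(X)  -> 20
t2_y = db['Y']            # T2: read_item(Y)  -> 30
db['Y'] = t2_x + t2_y     # T2: write_item(Y) -> 50

# T1 resumes, still using its stale local copy of Y
t1_x = db['X']            # T1: read_item(X)  -> 20
db['X'] = t1_x + t1_y     # T1: write_item(X) -> 50

print(db)                 # {'X': 50, 'Y': 50}
```

With these initial values the serial order T1;T2 would give X = 50, Y = 80 and T2;T1 would give X = 70, Y = 50, so the interleaved result matches neither serial schedule, which is exactly what "nonserializable" means here.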
Figure 21.4 Transactions T1′ and T2′, which are the same as T1 and T2 in Figure 21.3 but follow the two-phase locking protocol. Note that they can produce a deadlock.

T1′: read_lock(Y); read_item(Y); write_lock(X); unlock(Y); read_item(X); X := X + Y; write_item(X); unlock(X);
T2′: read_lock(X); read_item(X); write_lock(Y); unlock(X); read_item(Y); Y := X + Y; write_item(Y); unlock(Y);
It can be proved that, if every transaction in a schedule follows the two-phase lock-
ing protocol, the schedule is guaranteed to be serializable, obviating the need to test
for serializability of schedules. The locking protocol, by enforcing two-phase lock-
ing rules, also enforces serializability.
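The two-phase property is easy to check mechanically: in a transaction's operation sequence, no lock request may follow the first unlock. A small checker (the function name and string encoding of operations are my own):

```python
def is_two_phase(ops):
    """ops: a transaction's operations in order, as strings such as
    'read_lock(Y)' or 'unlock(Y)'. Returns True iff no lock request
    follows the first unlock (i.e., expanding phase precedes shrinking)."""
    shrinking = False
    for op in ops:
        if op.startswith('unlock'):
            shrinking = True                 # shrinking phase has begun
        elif op.startswith(('read_lock', 'write_lock')) and shrinking:
            return False                     # a lock after an unlock: not 2PL
    return True

# T1 from Figure 21.3 unlocks Y before locking X: not two-phase
print(is_two_phase(['read_lock(Y)', 'unlock(Y)', 'write_lock(X)', 'unlock(X)']))
# T1' from Figure 21.4 acquires all locks before any unlock: two-phase
print(is_two_phase(['read_lock(Y)', 'write_lock(X)', 'unlock(Y)', 'unlock(X)']))
```

The first call prints False and the second True, mirroring why T1 permits the nonserializable schedule S while T1′ does not.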
Two-phase locking may limit the amount of concurrency that can occur in a sched-
ule because a transaction T may not be able to release an item X after it is through
using it if T must lock an additional item Y later; or, conversely, T must lock the
additional item Y before it needs it so that it can release X. Hence, X must remain
locked by T until all items that the transaction needs to read or write have been
locked; only then can X be released by T. Meanwhile, another transaction seeking to
access X may be forced to wait, even though T is done with X; conversely, if Y is
locked earlier than it is needed, another transaction seeking to access Y is forced to
wait even though T is not using Y yet. This is the price for guaranteeing serializabil-
ity of all schedules without having to check the schedules themselves.
Although the two-phase locking protocol guarantees serializability (that is, every
schedule that is permitted is serializable), it does not permit all possible serializable
schedules (that is, some serializable schedules will be prohibited by the protocol).
A popular variation of basic 2PL is strict 2PL, which guarantees strict schedules: a transaction T does not release any of its exclusive (write) locks until after it commits or aborts. Hence, no other transaction can read or write an item that is written by T unless T has committed, leading to a strict schedule for recoverability. Strict 2PL is not deadlock-free. A more
restrictive variation of strict 2PL is rigorous 2PL, which also guarantees strict
schedules. In this variation, a transaction T does not release any of its locks (exclu-
sive or shared) until after it commits or aborts, and so it is easier to implement
than strict 2PL.
Notice the difference between strict and rigorous 2PL: the former holds write-locks
until it commits, whereas the latter holds all locks (read and write). Also, the differ-
ence between conservative and rigorous 2PL is that the former must lock all its
items before it starts, so once the transaction starts it is in its shrinking phase; the
latter does not unlock any of its items until after it terminates (by committing or
aborting), so the transaction is in its expanding phase until it ends.
Usually the concurrency control subsystem itself is responsible for generating
the read_lock and write_lock requests. For example, suppose the system is to enforce
the strict 2PL protocol. Then, whenever transaction T issues a read_item(X), the
system calls the read_lock(X) operation on behalf of T. If the state of LOCK(X) is
write_locked by some other transaction T′, the system places T in the waiting queue
for item X; otherwise, it grants the read_lock(X) request and permits the read_item(X)
operation of T to execute. On the other hand, if transaction T issues a write_item(X),
the system calls the write_lock(X) operation on behalf of T. If the state of LOCK(X) is
write_locked or read_locked by some other transaction T′, the system places T in
the waiting queue for item X; if the state of LOCK(X) is read_locked and T itself is
the only transaction holding the read lock on X, the system upgrades the lock to
write_locked and permits the write_item(X) operation by T. Finally, if the state of
LOCK(X) is unlocked, the system grants the write_lock(X) request and permits the
write_item(X) operation to execute. After each action, the system must update its
lock table appropriately.
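The request-handling logic just described, including the upgrade of a read lock held only by T itself, can be sketched as a decision function. The lock-table layout below is an assumption for illustration; a real lock table would also store wait queues and perform wakeups:

```python
def handle_write_item(lock_table, item, t):
    """Decide how the system services write_item(item) issued by transaction t.
    lock_table[item] = (state, holders), where state is 'unlocked',
    'read_locked', or 'write_locked', and holders is the set of transactions
    currently holding the lock. Returns 'granted' or 'wait'.
    Sketch only: queuing and waking of blocked transactions are omitted."""
    state, holders = lock_table.get(item, ('unlocked', set()))
    if state == 'unlocked':
        lock_table[item] = ('write_locked', {t})   # grant a fresh write lock
        return 'granted'
    if state == 'read_locked' and holders == {t}:
        lock_table[item] = ('write_locked', {t})   # upgrade t's own read lock
        return 'granted'
    return 'wait'   # item is locked by some other transaction T'

table = {'X': ('read_locked', {'T1'})}
print(handle_write_item(table, 'X', 'T1'))   # granted (T1 is the sole reader)
print(handle_write_item(table, 'X', 'T2'))   # wait (X now write-locked by T1)
```

The same shape of function handles read_item requests, with the weaker condition that only a write lock held by another transaction forces a wait.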
Locking is generally considered to have a high overhead, because every read or
write operation is preceded by a system locking request. The use of locks can also
cause two additional problems: deadlock and starvation. We discuss these problems
and their solutions in the next section.
Figure 21.5 Illustrating the deadlock problem. (a) A partial schedule of T1′ and T2′ that is in a state of deadlock. (b) A wait-for graph for the partial schedule in (a).

(a) Partial schedule (time flows downward):

T1′: read_lock(Y);
T1′: read_item(Y);
T2′: read_lock(X);
T2′: read_item(X);
T1′: write_lock(X);   (T1′ waits: X is read-locked by T2′)
T2′: write_lock(Y);   (T2′ waits: Y is read-locked by T1′)

(b) Wait-for graph: T1′ → T2′ and T2′ → T1′, a cycle, so the two transactions are deadlocked.
Footnote 5: These protocols are not generally used in practice, either because of unrealistic assumptions or because of their possible overhead. Deadlock detection and timeouts (covered in the following sections) are more practical.
One node is created in the wait-for graph for each currently executing transaction, and a directed edge Ti → Tj is created when Ti is waiting for an item held by Tj; when Ti obtains the item it has been waiting for, the directed edge is dropped from the wait-for graph. We have a state of
deadlock if and only if the wait-for graph has a cycle. One problem with this
approach is the matter of determining when the system should check for a dead-
lock. One possibility is to check for a cycle every time an edge is added to the wait-
for graph, but this may cause excessive overhead. Criteria such as the number of
currently executing transactions or the period of time several transactions have
been waiting to lock items may be used instead to check for a cycle. Figure 21.5(b)
shows the wait-for graph for the (partial) schedule shown in Figure 21.5(a).
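Detecting a cycle in the wait-for graph is a standard depth-first search. A sketch, representing the graph as a mapping from each transaction to the set of transactions it is waiting for:

```python
def has_deadlock(wait_for):
    """wait_for: dict mapping each transaction Ti to the set of transactions
    Ti is waiting for. Deadlock exists iff this graph contains a cycle."""
    visiting, done = set(), set()

    def dfs(t):
        visiting.add(t)                  # t is on the current DFS path
        for u in wait_for.get(t, ()):
            if u in visiting:            # back edge to the current path: cycle
                return True
            if u not in done and dfs(u):
                return True
        visiting.discard(t)
        done.add(t)                      # t's subgraph is fully explored
        return False

    return any(t not in done and dfs(t) for t in wait_for)

# Figure 21.5(b): T1' waits for T2' and T2' waits for T1' -- a cycle
print(has_deadlock({"T1'": {"T2'"}, "T2'": {"T1'"}}))   # True
print(has_deadlock({"T1'": {"T2'"}, "T2'": set()}))     # False
```

In practice this check would run on one of the triggers mentioned above, for example whenever an edge is added or after transactions have been waiting beyond some threshold, rather than continuously.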
If the system is in a state of deadlock, some of the transactions causing the deadlock
must be aborted. Choosing which transactions to abort is known as victim
selection. The algorithm for victim selection should generally avoid selecting trans-
actions that have been running for a long time and that have performed many
updates, and it should try instead to select transactions that have not made many
changes (younger transactions).
Timeouts. Another simple scheme to deal with deadlock is the use of timeouts.
This method is practical because of its low overhead and simplicity. In this method,
if a transaction waits for a period longer than a system-defined timeout period, the
system assumes that the transaction may be deadlocked and aborts it—regardless of
whether a deadlock actually exists.
Starvation. Another problem that may occur when we use locking is starvation,
which occurs when a transaction cannot proceed for an indefinite period of time
while other transactions in the system continue normally. This may occur if the
waiting scheme for locked items is unfair in that it gives priority to some transactions over others. One solution for starvation is to have a fair waiting scheme, such as a first-come-first-served queue: transactions are allowed to lock an item in the order in which they originally requested the lock. Another scheme allows
some transactions to have priority over others but increases the priority of a trans-
action the longer it waits, until it eventually gets the highest priority and proceeds.
Starvation can also occur because of victim selection if the algorithm selects the
same transaction as victim repeatedly, thus causing it to abort and never finish exe-
cution. The algorithm can use higher priorities for transactions that have been
aborted multiple times to avoid this problem. The wait-die and wound-wait
schemes discussed previously avoid starvation, because they restart a transaction
that has been aborted with its same original timestamp, so the possibility that the
same transaction is aborted repeatedly is slim.
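The priority-aging idea can be sketched with a wait queue in which every pass raises the priority of the transactions that were passed over; the function and variable names are illustrative:

```python
def pick_next(waiters):
    """waiters: list of [transaction, priority] pairs, in arrival order.
    Grants the lock to the highest-priority waiter and ages the rest,
    so every waiting transaction eventually reaches the top (no starvation).
    Ties are broken by arrival order."""
    winner = max(waiters, key=lambda w: w[1])
    for w in waiters:
        if w is not winner:
            w[1] += 1            # aging: each wait raises priority by one
    waiters.remove(winner)
    return winner[0]

queue = [['T1', 0], ['T2', 5], ['T3', 0]]
print(pick_next(queue))   # 'T2' wins now; T1 and T3 have aged to priority 1
print(pick_next(queue))   # 'T1' (tie with T3 broken by arrival order)
```

However long a transaction's initial priority lags behind, a bounded number of grants is enough for it to become the maximum, which is the property that rules out starvation.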
21.2.1 Timestamps
Recall that a timestamp is a unique identifier created by the DBMS to identify a
transaction. Typically, timestamp values are assigned in the order in which the
transactions are submitted to the system, so a timestamp can be thought of as the
transaction start time. We will refer to the timestamp of transaction T as TS(T).
Concurrency control techniques based on timestamp ordering do not use locks;
hence, deadlocks cannot occur.
Timestamps can be generated in several ways. One possibility is to use a counter that
is incremented each time its value is assigned to a transaction. The transaction time-
stamps are numbered 1, 2, 3, … in this scheme. A computer counter has a finite
maximum value, so the system must periodically reset the counter to zero when no
transactions are executing for some short period of time. Another way to implement
timestamps is to use the current date/time value of the system clock and ensure that
no two timestamp values are generated during the same tick of the clock.
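The counter-based scheme can be sketched in a few lines; the class name is my own, and the lock simply guarantees that no two transactions receive the same value when timestamps are requested concurrently:

```python
import itertools
import threading

class TimestampGenerator:
    """Counter-based transaction timestamps: unique values 1, 2, 3, ...
    assigned in the order transactions are submitted to the system."""

    def __init__(self):
        self._counter = itertools.count(1)   # next timestamp to hand out
        self._lock = threading.Lock()        # one value per request, atomically

    def new_timestamp(self):
        with self._lock:
            return next(self._counter)

gen = TimestampGenerator()
ts1, ts2 = gen.new_timestamp(), gen.new_timestamp()
print(ts1, ts2)   # 1 2 -- the earlier-submitted transaction gets the smaller TS
```

A clock-based generator would instead read the system clock and, on a duplicate reading within the same tick, increment the value by one, preserving the same uniqueness and ordering guarantees.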