Chapter 3: Advanced Database Management
G2CS&IT
Concurrency Control Techniques
Concurrency control techniques are those techniques used to ensure the isolation property of
concurrently executing transactions. Most of the techniques ensure serializability of schedules by
using protocols that guarantees serializability. Some of the main techniques used to control
concurrent execution of transactions are based on the concept of locking data items.
A lock is a variable associated with a data item that describes the status of the item with respect to
possible operations that can be applied to it.
One lock with each data item in the database.
Locks are used to synchronize the access to the data item.
Concurrency Control with Locking Methods
A lock guarantees exclusive use of a data item to a current transaction. In other words, transaction
T2 does not have access to a data item that is currently being used by transaction T1. A
transaction acquires a lock prior to data access; the lock is released (unlocked) when the
transaction is completed so that another transaction can lock the data item for its exclusive use.
This series of locking actions assumes that there is a likelihood of concurrent transactions
attempting to manipulate the same data item at the same time. The use of locks based on the
assumption that conflict between transactions is likely to occur is often referred to as pessimistic
locking.
Therefore, locks are required to prevent another transaction from reading inconsistent
data. Most multiuser DBMSs automatically initiate and enforce locking procedures. All lock
information is managed by a lock manager, which is responsible for assigning and controling the
locks used by the transactions
Lock Granularity: indicates the level of lock use. Locking can take place at the following levels:
database, table, page, row, or even field (attribute).
Database Level
In a database-level lock, the entire database is
locked, thus preventing the use of any tables
in the database by transaction T2 while
transaction Tl is being executed. This level of
locking is unsuitable for multiuser DBMSs.
Data access will be very s-l-o-w. In figure
note that because of the database-level lock,
transactions T1 and T2 cannot access the same
database concurrently even when they use
different tables.
Table Level
In a table-level lock, the entire table is locked,
preventing access to any row by transaction T2
while transaction T1 is using the table. T2 must
wait until T1 unlocks the table. If a transaction
requires access to several tables, each table may
be locked. However, two transactions can
access the same database as long as they
access different tables. Table-level locks,
while less restrictive than database-level locks,
cause traffic jams when many transactions are
waiting to access the same table.. Consequently, table-level locks are not suitable for multiuser
DBMSs. The above figure illustrates the effect of a table-level lock.
Prepared by : [Link].M.T Page 1
Chapter 3: Advanced Database Management
G2CS&IT
Page Level
In a page-level lock, the DBMS will lock an
entire diskpage. A diskpage, or page, is the
equivalent of a diskblock, which can be described
as a directly addressable section of a disk. A table
can span several pages, and a page can contain
several rows of one or more tables. Page-level
locks are currently the most frequently used
multiuser DBMS locking method. An example of
a page-level lock is shown in Figure Note that T1
and T2 access the same table while locking
different diskpages. If T2 requires the use of a row
located on a page that is locked by T1, T2 must
wait until the page is unlocked by T1.
Row Level
A row-level lock is much less restrictive than the
locks discussed earlier. The DBMS allows
concurrent transactions to access different rows of
the same table even when the rows are located
on the same page. Although the row-level
locking approach improves the availability of
data, its management requires high overhead
because a lock exists for each row in a table of the
database involved in a conflicting transaction.
Modern DBMSs automatically changes a lock
from a row-level to a page-level lock when the application session requests multiple locks on the
same page.
The field-level lock allows concurrent transactions to access the same row as long as they require
the use of different fields (attributes) within that row. Although field-level locking clearly yields
the most flexible multiuser data access, it is rarely implemented in a DBMS because it requires an
extremely high level of computer overhead and because the row-level lock is much more useful in
practice.
Types of Locks
Regardless of the level of locking, the DBMS may use different lock types: binary or
shared/exclusive.
Binary Locks
A binary lock has only two states/values: locked (1) or unlocked (0). Every database
operation requires that the affected object be locked. If an object—that is, a database, table,
page, or row—is locked by a transaction, no other transaction can use that object. If an object
is unlocked, any transaction can lock the object for its use. As a rule, a transaction must
unlock the object after its termination.
Two operations, lock_item and unlock_item, are used with binary locking. A transaction
requests access to an item X(either for reading or writing) by first issuing a lock_item(X)
operation. If LOCK(X) =1, the transaction is forced to wait. If LOCK(X) = 0, it is set to 1
(the transaction locks the item) and the transaction is allowed to access item X. When the
transaction completed using the item, it issues an unlock_item(X) operation, which sets
LOCK(X) back to 0 (unlocks the item) so that X may be accessed by other transactions.
Hence, a binary lock enforces mutual exclusion on the data item.
Such operations are automatically managed and scheduled by the DBMS; The DBMS has a
lock manager subsystem to keep track of and control access to locks. The user does not
Prepared by : [Link].M.T Page 2
Chapter 3: Advanced Database Management
G2CS&IT
need to be concerned about locking or unlocking data items. (Every DBMS has a default
locking mechanism. If the end user wants to override the default, the LOCK TABLE and
other SQL commands are available for that purpose.)
Lock table manages the record of the locked items.
Lock and unlock features eliminate the lost update problem because the lock is not released
until the WRITE statement is completed.
Binary locks are now considered too restrictive to yield optimal concurrency conditions
and so are not used much in practice. For example, the DBMS will not allow two
transactions to read the same database object even though both transactions update the
database, and therefore, no concurrency problems can occur.
Shared/Exclusive Locks (Read/Write Locks)
The labels “shared” and “exclusive” indicate the nature of the lock. An exclusive lock exists
when access is reserved specifically for the transaction that locked the object. The exclusive lock
must be used when the potential for conflict exists. A shared lock exists when concurrent
transactions are granted read access on the basis of a common lock. A shared lock produces no
conflict as long as all the concurrent transactions are read-only.
Multiple-mode lock.
There are three locking operations: read_lock(X), write_lock(X), and unlock(X).
A lock associated with an item X, LOCK(X), now has three possible states: read-locked,
write-locked,or unlocked.
A read-locked item is also called share-locked because other transactions are allowed to read
the item, whereas a write-locked item is called exclusive-locked because a single transaction
exclusively holds the lock on the item.
A shared lock is issued when a transaction wants to read data from the database and no
exclusive lock is held on that data item.
An exclusive lock is issued when a transaction wants to update (write) a data item and no
locks are currently held on that data item by any other transaction.
Shared locks allow several read transactions to read the same data item concurrently. For
example, if transaction T1 has a shared lock on data item X and transaction T2 wants to read
data item X, T2 may also obtain a shared lock on data item X.
If transaction T2 updates data item X, an exclusive lock is required by T2 over data item X.
The exclusive lock is granted if and only if no other locks are held on the data item.
Therefore, if a shared or exclusive lock is already held on data item X by transaction T1, an
exclusive lock cannot be granted to transaction T2 and T2 must wait to begin until T1
commits. This condition is known as the mutual exclusive rule: only one transaction at a
time can own an exclusive lock on the same object.
Although the use of shared locks renders data access more efficient, a shared/exclusive lock
schema increases the lock manager’s overhead, for several reasons:
1. The type of lock held must be known before a lock can be granted.
2. Three lock operations exist: READ_LOCK (to check the type of lock), WRITE_LOCK (to
issue the lock), and UNLOCK (to release the lock).
3. The schema has been enhanced to allow a lock upgrade (from shared to exclusive) and a
lock downgrade (from exclusive to shared).
Conversion of Locks
A Transaction is allowed under certain conditions to convert the lock from one state to another.
• Upgrading:
– Convert the lock from shared to exclusive by issuing write_lock(X) after its
read_lock(X).
– The transaction must be the only one has the read lock or it must wait.
• Downgrading:
– Convert the lock from exclusive to shared by issuing read_lock(X) after the
write_lock(X).
Prepared by : [Link].M.T Page 3
Chapter 3: Advanced Database Management
G2CS&IT
Locks prevent data inconsistencies, but they can lead to two major problems:
The resulting transaction schedule might not be serializable.
The schedule might create deadlocks. (A deadlock occurs when two transactions wait
indefinitely for each other to unlock data. )
Additional protocols concerning the positioning of locking and unlocking operations are
followed in every transaction to solve the above mentioned issues.
Two-Phase Locking (2PL) to Ensure Serializability
Two-phase locking defines how transactions acquire and relinquish (release) locks. Two-phase
locking guarantees serializability. If every transaction in a schedule follows the two phase locking
protocol, the schedule is guaranteed to be serializable.
A transaction is said to follow the 2PL protocol if all locking operations precede the first
unlock operation of the transaction.
The transaction is divided into two phases:
• Growing or Expanding phase (first phase) where new locks can be issued and none
can be released. Once all locks have been acquired, the transaction is in its locked
point.
• Shrinking phase (second phase) where existing locks can be released and cannot
obtain any new lock.
• In case of lock conversion:
– Upgrading must be during Growing phase.
– Degrading must be during shrinking phase.
Example:
T1 T2
read_lock(Y); read_lock(X);
read_item(Y); read_item(X);
write_lock(X); write_lock(Y);
unlock(Y); unlock(X);
read_item(X); read_item(Y);
X:=X+Y; Y:=X+Y;
write_item(X); write_item(Y);
unlock(X): unlock(Y);
2PL protocol guarantees the serializability of schedule without the need of testing it.
2PL limits the concurrency.
• A transaction T may not be able to release an item X even after it is through using
it but needs to lock another item later. Another transaction that needs X has to wait
even though T is done with X.
The algorithm is called the Basic 2PL and there are many variations to it.
Conservative 2PL:
• It is called also Static 2PL.
• It requires a transaction to lock all the items it accesses before the transaction
begins execution (by pre-declaring its read and write sets).
• The read-set of a transaction is the set of all items that the transaction reads, and
the write-set is the set of all items that it writes.
• If any of the predeclared items needed cannot be locked, the transaction does not
lock any item; instead, it waits until all the items are available for locking.
• It is difficult in practice because it is not possible in most cases to get the read and
write sets.
• The conservative 2PL is a deadlock-free protocol.
Strict 2PL:
• It is the most popular variation of 2PL in practice.
• It guarantees strict schedule.
Prepared by : [Link].M.T Page 4
Chapter 3: Advanced Database Management
G2CS&IT
•
It requires that a transaction T does not release any of its exclusive locks until it
commits or aborts.
• Strict 2PL is not a deadlock-free protocol.
Rigorous 2PL:
• It is a more restrictive variation of Strict 2PL.
• It guarantees strict schedule also.
• It requires that a transaction T does not release any of its locks (shared or
exclusive) until it commits or aborts.
• It is easier to be implemented than the strict 2PL.
Notes:
• The concurrency control subsystem of the DBMS is responsible for generating the
required locks according to the locking protocol used.
• Although the 2PL protocol guarantees serializability, it does not permit all possible
schedules.
• The use of locks can lead to two additional problems: Deadlock – Starvation.
Deadlocks
A deadlock occurs when two transactions wait indefinitely for each other to unlock data. For
example, a deadlock occurs when two
transactions, T1 and T2, exist in the following
mode:
T1 = access data items X and Y
T2 = access data items Y and X
If T1 has not unlocked data item Y, T2 cannot
begin; if T2 has not unlocked data item X, T1
cannot continue. Consequently, T1 and T2 each
wait for the other to unlock the required data
item. Note that deadlocks are possible only
when one of the transactions wants to obtain an
exclusive lock on a data item; no deadlock
condition can exist among shared locks.
The three basic techniques to control deadlocks are:
I. Deadlock Prevention Protocols.
II. Deadlock Detection Protocols.
III. Timeouts.
I. Deadlock Prevention Protocols:
a. Using the conservative 2PL (not practical).
b. Using the concept of transaction Timestamps.
c. Nowaiting Protocols (NW).
d. Cautious waiting Protocol.
a. Using the Conservative 2PL protocol:
– The protocol had been discussed and it was shown it is a deadlock-free protocol.
– The protocol is not practical because of the need of the predeclared read and writes sets.
b. The concept of transaction Timestamp:
– It is used to decide if the transaction involved in a deadlock situation should wait,
abort or preempt another transaction.
– The timestamp of a transaction T is TS(T) which is a unique identifier assigned to
the transaction T and is based on the order in which the transaction T is started.
Prepared by : [Link].M.T Page 5
Chapter 3: Advanced Database Management
G2CS&IT
– If T1 started before T2, then TS(T1) < TS(T2).[T1 is the older and T2 is the younger]
– There are two schemes to prevent deadlock which are wait-die and wound-wait.
– Wound-Wait:
Suppose that Ti tries to lock X but it is not able to because X is locked by Tj
with a conflicting lock.
The rule is: If TS(Ti) < TS(Tj), then abort Tj (Ti wounds Tj ) and restart it later with
the same timestamp; otherwise Ti is allowed to wait.
– Wait-Die:
Suppose that Ti tries to lock X but it is not able to because X is locked by
Tj with a conflicting lock.
The rule is: If TS(Ti) < TS(Tj), then Ti is allowed to wait; otherwise abort
Ti (Ti dies) and restart it later with the same timestamp.
Notes:
– Both protocols may cause some transactions to be aborted and restarted even they may
never actually cause a deadlock.
c. No Waiting Protocol :
– If a transaction is unable to obtain a lock, it is immediately aborted and then
restarted after a certain time delay without checking if a deadlock will actually
occur or not.
– The protocol can cause transactions to abort and restart needlessly.
d. Cautious Waiting Protocol:
– It has been proposed to reduce the number of needless aborts/restarts.
– Suppose that Ti tries to lock X but it is not able to because X is locked by Tj with a
conflicting lock:
The rule is:
• If Tj is not blocked (not waiting for some other locked item), then Ti is blocked
and allowed to wait; otherwise abort Ti.
II. Deadlock Detection:
– The DBMS periodically tests the database for deadlocks. If a deadlock is found, one of the
transactions (the “victim m”) is aborted (rolled back and restarted) and the other
transaction continues.
– The concept is to not enforce any restrictions on executing the transactions, but check if a
deadlock actually exists.
– It is a more practical set of protocols.
– It is a beneficial if the transactions are short or the load is light (conflicts are not expected
to highly exist).
– Wait-for graph is a simple way to detect the state of deadlock.
It is used to check the existence of a deadlock.
How to build Wait-for graph:
1. One node for each currently executing transaction.
2. If Ti is waiting to lock an item X that is currently locked by Tj, a directed edge
from Ti to Tj is created.
Prepared by : [Link].M.T Page 6
Chapter 3: Advanced Database Management
G2CS&IT
3. If Tj releases the lock of item that Ti was waiting for, the directed edge is dropped
from the graph.
If the graph has a cycle, the state of the deadlock exists.
– The system must abort some of the transactions when a deadlock had been detected.
– Selecting a transaction to be aborted is known as Victim Selection.
– The algorithm of Victim Selection should avoid selecting transactions that have been
running for a long time and that have performed many updates.
III. Timeouts: Another simple scheme to deal with Timestamp: A monotonically
deadlock is the use of timeouts. This method is increasing variable (integer)
practical because of its low overhead and indicating the age of an
simplicity. In this method, if a transaction waits operation or a transaction.
for a period longer than a system-defined timeout
period, the system assumes that the transaction
may be deadlocked and aborts it—regardless of whether a deadlock actually exists or not.
Starvation
I. Occurs when a transaction cannot proceed for an indefinite period of time while other
transactions in the system continue normally. This may occur if the waiting scheme for
locked items is unfair, giving priority to some transactions over others.
Solution:
1. To have a fair waiting scheme, such as using a first-come-first-served queue; Transactions
are enabled to lock an item in the order in which they originally requested the lock.
2. Allowing priorities for transaction but increase the priority of a transaction the longer it
waits, until it eventually gets the highest priority and proceeds.
II. It also may occur if the Victim Selection algorithm used in deadlock detection protocol
selects the same transaction as a victim repeatedly.
Solution:
1. The algorithm can use higher priorities for transactions that have been aborted multiple
times.
Note: Wait-die and Wound-wait protocols avoid starvation because they restart a transaction that
has been aborted with its same original timestamp, so the possibility that the same transaction is
aborted repeatedly is less.
The choice of the best deadlock control method to use depends on the database environment. For
example, if the probability of deadlocks is low, deadlock detection is recommended. If the
probability of deadlocks is high, deadlock prevention is recommended.
Concurrency Control Based on Timestamp Ordering
A different approach that guarantees serializability involves using transaction timestamps to order
transaction execution for an equivalent serial schedule.
Timestamps
Timestamp of transaction T, TS(T) is a unique identifier created by the DBMS to identify a
transaction. Timestamp values are assigned in the order in which the transactions are submitted to
the system, so a timestamp can be thought of as the transaction start time. Concurrency control
techniques based on timestamp ordering do not use locks; hence, deadlocks cannot occur.
Timestamps can be generated in several ways. Two possible ways are listed below:
Prepared by : [Link].M.T Page 7
Chapter 3: Advanced Database Management
G2CS&IT
Using a counter timestamp: A counter that is incremented each time its value is assigned
to a transaction and are numbered 1, 2, 3, ... in this scheme. A computer counter has a
finite maximum value, so the system must periodically reset the counter to zero.
Using date timestamps: The current date/time value of the system clock and ensure that
no two timestamp values are generated during the same tick of the clock.
The Timestamp Ordering Algorithm
The idea for this scheme is to order the transactions based on their timestamps. A schedule in
which the transactions participate is then serializable, and the only equivalent serial schedule
permitted has the transactions in order of their timestamp values. This is called timestamp
ordering (TO).
The algorithm must ensure that, for each item accessed by conflicting operations in the
schedule, the order in which the item is accessed does not violate the timestamp order. To do
this, the algorithm associates with each database item X two timestamp (TS) values:
RTS(X): The read timestamp of item X, which is the largest timestamp among all the
timestamps of transactions that have successfully read item X—that is, RTS(X) = TS(T),
where T is the youngest transaction that has read X successfully.
o The Timestamp on which object X was last read by some transaction Ti.
o i.e., RTS(X):=TS(Ti)
WTS(X): The write timestamp of item X, which is the largest of all the timestamps of
transactions that have successfully written item X—that is, WTS(X) = TS(T), where T is
the youngest transaction that has written X successfully.
o The Timestamp on which object X was last written by some transaction Tj.
o WTS(X):=TS(Tj).
Basic Timestamp Ordering (TO): Utilizes Timestamps to guarantee serializability of concurrent
transactions. Whenever a transaction T attempts to perform some action (read or write) on data
item X on timestamp TS(T). Basic Timestamp ordering decides whether T has to be aborted or
whether T can continue execution.
A data item X in the database has a RTS(X) and WTS(X) (recorded when the object was
last accessed for the given action).
If Pa(x) and Qb(x) are conflicting operations, of Ta and Tb for item x, then Pa(x) is
processed before Qb(x) if and only if TS (Ta) < TS (Tb).
Whenever some transaction T tries to issue a read_item(X) or a write_item(X) operation,
the basic TO algorithm compares the timestamp of T with RTS(X) and WTS(X) to ensure
that the timestamp order of transaction execution is not violated.
The concurrency control algorithm must check whether conflicting operations violate the
timestamp ordering in the following two cases:
Case 1 (Read): Transaction T issues a read(X) operation
o If TS(T) < WTS(X), then read(X) is rejected (as the TO rule is violated).
T has to abort and be rejected.
o If WTS(X) <= TS(T), then execute read(X) of T and update RTS(X) to the largest
of TS(T).
Case 2 (Write): Transaction T issues a write(X) operation
o If TS(T) < RTS(X) or if TS(T) < WTS(X), then write is rejected.
o If RTS(X) <= TS(T) or WTS(X) <= TS(T), then execute write(X) of T and
update WTS(X) to TS(T)
If this order is violated, then transaction T is aborted and resubmitted to the system as a
new transaction with a new timestamp. If T is aborted and rolled back, any transaction T1
that may have used a value written by T must also be rolled back. Similarly, any
transaction T2 that may have used a value written by T1 must also be rolled back, and so
on. This effect is known as cascading rollback and is one of the problems associated with
basic TO, since the schedules produced are not guaranteed to be recoverable. An
Prepared by : [Link].M.T Page 8
Chapter 3: Advanced Database Management
G2CS&IT
additional protocol must be enforced to ensure that the schedules are recoverable,
cascadeless, or strict.
Advantages of Basic TO Algorithm
Schedules are serializable (like 2PL protocols)
No waiting for transaction, thus, no deadlocks!
Disadvantages of Basic TO Algorithm
Schedule may not be recoverable (read uncomit. data)
o Solution: Utilize Strict TO Algorithm.
Starvation is possible (if the same transaction is continually aborted and restarted)
o Solution: Assign new timestamp for aborted transaction
Basic TO Algorithm Example
Consider the following scenario:
Two transactions T1 and T2
Initially RTS=0 and WTS=0 for data items X, Y
Timestamps are as follows: TS(T1)=10 and TS(T2)=20
T1(10) T2(20)
1. A1 = Read(X) 1. A1 = Read(X)
2. A1 = A1 – k 2. A1 = A1 * 1.01
3. Write(X, A1) 3. Write(X, A1)
4. A2 = Read(Y) 4. A2 = Read(Y)
5. A2 = A2 + k 5. A2 = A2 * 1.01
6. Write(Y, A2) 6. Write(Y, A2)
Is the schedule that follows serializable or not? Justify your answer using Basic TO
Algorithm.
T1(10) T2(20)
RTS(X) : 10 1. A1 = Read(X)
WTS(X) : 10 2. A1 = A1 – k
RTS(Y) : 0 3. Write(X, A1)
WTS(Y) : 0 1. A1 = Read(X) RTS(X) : 20
2. A1 = A1* 1.01 WTS(X) : 20
3. Write(X, A1) RTS(Y) : 0
WTS(Y) : 0
RTS(X) : 20
WTS(X) : 20
RTS(Y) : 10 4. A2 = Read(Y)
WTS(Y) : 10 5. A2 = A2 + k
6. Write(Y, A2)
4. A2 = Read(Y) RTS(X) : 20
5. A2 = A2 * 1.01 WTS(X) : 20
6. Write(Y, A2) RTS(Y) : 20
WTS(Y) : 20
The schedule given above is serializable
Prepared by : [Link].M.T Page 9
Chapter 3: Advanced Database Management
G2CS&IT
Is the schedule that follows serializable or not? Justify your answer using Basic TO
Algorithm.
T1(10) T2(20)
RTS(X) : 10 1. A1 = Read(X)
WTS(X) : 10 2. A1 = A1 – k
RTS(Y) : 0 3. Write(X, A1)
WTS(Y) : 0 1. A1 = Read(X)
2. A1 = A1* 1.01
3. Write(X, A1)
4. A2 = Read(Y) RTS(X) : 20
5. A2 = A2 * 1.01 WTS(X) : 20
6. Write(Y, A2) RTS(Y) : 20
4. A2 = Read(Y) WTS(Y) : 20
5. A2 = A2 + k
6. Write(Y, A2)
The schedule is NOT serializable. Reject T1 (TO rule violated) (Restart with new TS)
Note that there is no notion of RR-conflict. If TS(T) < RTS(X), then execute read(X) of T
and update RTS(X).
Strict Timestamp Ordering
The Basic T.O. algorithm guarantees serializability but not Recoverability. The Strict Timestamp
Ordering algorithm introduces recoverability.
Strict Schedule: A transaction can neither read nor write an uncommitted data item X.
Strict Timestamp Ordering: Extend the accept cases of the Basic Timestamp Ordering
algorithm by adding the requirement that a commit occurs before T proceeds with its operation.
i.e.,
Case 1: Transaction T issues a read(X) operation:
If WTS(X) < TS(T), then delay T until the transaction T’ that wrote or read X has terminated
(committed or aborted).
Case 2: Transaction T issues a write(X) operation:
If RTS(X) <= TS(T) or WTS(X) <= TS(T), then delay T until the transaction T’ that wrote or
read X has terminated (committed or aborted).
Prepared by : [Link].M.T Page 10
Chapter 3: Advanced Database Management
G2CS&IT
Multiversion Concurrency Control (MVCC)
Each user connected to the database sees a snapshot of the database at a particular instant in time.
Any changes made by a writer will not be seen by other users of the database until the changes
have been completed (or, in database terms: until the transaction has been committed.)
When an MVCC database needs to update an item of data, it will not overwrite the old data with
new data, but instead mark the old data as obsolete and add the newer version elsewhere. Thus
there are multiple versions stored, but only one is the latest. This allows readers to access the data
that was there when they began reading, even if it was modified or deleted part way through by
someone else. This approach maintains a number of versions of a data item and allocates the right
version to a read operation of a transaction. Thus unlike other mechanisms a read operation in this
mechanism is never rejected.
– When any transaction writes an item it writes a new version and the old version of the
item is retained.
– When any transaction reads an item, appropriate version has to be provided maintaining
serializability.
Disadvantage:
Significantly more storage (RAM and Disk) is required to maintain multiple versions.
To check unlimited growth of versions, a garbage collection is run periodically.
Multiversion technique based on timestamp ordering
Assume X1, X2, …, Xn are the version of a data item X created by a write operation of
transactions.
– Note: New version of Xi is created only by a write operation.
– With each Xi a RTS (read timestamp) and a WTS (write timestamp) are associated.
– RTS(Xi): The read timestamp of Xi is the largest of all the timestamps of transactions that
have successfully read version Xi .
– WTS(Xi): The write timestamp of Xi is the largest of all the timestamps of transactions
that have successfully written the value of version Xi.
– Basic Idea: Works much like Basic TO with the difference that instead of WTS(X) and
RTS(X) we now utilize the highest WTS(Xi) and highest WTS(Xi) respectively.
o Whenever a transaction T is allowed to execute a write_item(X) operation, a new
version Xk+1 of item X is created, with
WTS(Xk+1) = TS(T).
RTS(Xk+1) = TS(T).
o When a transaction T is allowed to read the value of version Xk, the value of
RTS(Xk) = larger of the current RTS(Xk) and TS(T).
To ensure serializability, the following rules are used
If transaction T issues read(X), find the version i of X that has the highest WTS(Xi) of all
versions of X that is also less than or equal to TS(T), then accept the read and update the
RTS(X) respectively.
If transaction T issues write(X) and version i of X has the highest WTS(Xi) of all versions of
X that is also less than or equal to TS(T), and TS(T) < RTS(Xi), then abort and roll-back T;
otherwise create a new version Xi and read_TS(X) = write_TS(Xj) = TS(T).
Prepared by : [Link].M.T Page 11
Chapter 3: Advanced Database Management
G2CS&IT
Optimistic Concurrency Control (Validation/Certification Techniques)
Locking and Timestamp Ordering are pessimistic ways to handle concurrency. In both the
techniques a certain degree of checking is done before a database operation can be executed. For
example, in locking, a check is done to determine whether the item being accessed is locked. In
timestamp ordering, the transaction timestamp is checked against the read and write timestamps
of the item. Such checking represents overhead during transaction execution, with the effect of
slowing down the transactions. In optimistic concurrency control techniques no checking is
done while the transaction is executing. When most transactions don’t conflict with the other
transactions then Optimistic Concurrency Control is much more efficient.
Basic Idea:
– In Optimistic Concurrency Control, a schedule is checked against serializability only at
the time of commit. (e.g., using timestamp orders or some other mechanism). Transactions
are aborted in case of non-serializable schedules.
There are three phases for this concurrency control protocol:
1. Read & Execution phase: A transaction can read values of committed data items from
the database. However, updates are applied only to local copies (versions) of the data
items kept in the transaction workspace. Transaction execution proceeds with a minimum
of overhead until the validation phase is reached.
2. Validation phase: Checking is performed to ensure that serializability will not be violated
if the transaction updates are applied to the database. A set of checking condition are
followed. Certain information needed by the validation phase must be kept by the system
such as transaction timestamps, write_sets and read_sets of the transactions, and start and
end times of each phases. (write_set of a transaction is the set of items it writes, and the
read_set is the set of items it reads)
– The validation phase for transaction Ti checks that, for each transaction Tj that is
either committed or is in its validation phase, one of the following conditions
holds:
1. Transaction Ti starts its read phase only after Transaction Tj completes its write
phase.
2. Transaction Ti starts its write phase after Tj completes its write phase, and the
read_set of Ti has no items in common with the write_set of Tj
3. Both the read_set and write_set of Ti have no items in common with the
write_set of Tj and T j completes its read phase before Ti completes its read
phase.
– When validating transaction Ti, the first condition is checked first for each
transaction Tj.
– Only if condition 1 is false, condition 2 is checked.
– Only if condition 2 is false, condition 3 is checked.
– If any one of these three conditions holds, there is no interference and Ti is
validated successfully. If none of these three conditions holds, the validation of
transaction Ti fails and it is aborted and restarted later because interference may
have occurred.
3. Write phase: If the validation phase is successful, the transaction updates are applied to
the database; otherwise, the updates are discarded and the transaction is restarted.
Note:
– If there is little interference among transactions, most will be validated successfully.
– If there is much interference, many transactions that execute to completion will have their
results discarded and must be restarted later.
– The techniques are called optimistic because they assume that little interference will occur
and hence that there is no need to do checking during transaction execution.
Prepared by : [Link].M.T Page 12
Chapter 3: Advanced Database Management
G2CS&IT
Summary
Concurrency control coordinates the simultaneous execution of transactions. The concurrent
execution of transactions can result in three main problems: lost updates, uncommitted data,
and inconsistent retrievals.
The scheduler is responsible for establishing the order in which the concurrent transaction
operations are executed. The transaction execution order is critical and ensures database
integrity in multiuser database systems. Locking, time stamping, and optimistic methods are
used by the scheduler to ensure the serializability of transactions.
A lock guarantees unique access to a data item by a transaction. The lock prevents one
transaction from using the data item while another transaction is using it. There are several
levels of locks: database, table, page, row, and field.
Two types of locks can be used in database systems: binary locks and shared/exclusive locks.
A binary lock can have only two states: locked (1) or unlocked (0). A shared lock is used when
a transaction wants to read data from a database and no other transaction is updating the same
data. Several shared or “read” locks can exist for a particular item. An exclusive lock is issued
when a transaction wants to update (write to) the database and no other locks (shared or
exclusive) are held on the data.
Serializability of schedules is guaranteed through the use of two-phase locking. The two-phase
locking schema has a growing phase, in which the transaction acquires all of the locks that it
needs without unlocking any data, and a shrinking phase, in which the transaction releases all
of the locks without acquiring new locks.
When two or more transactions wait indefinitely for each other to release a lock, they are in a
deadlock, also called a deadly embrace. There are three deadlock control techniques:
prevention, detection, and avoidance.
Concurrency control with time stamping methods assigns a unique time stamp to each
transaction and schedules the execution of conflicting transactions in time stamp order. Two
schemes are used to decide which transaction is rolled back and which continues executing: the
wait/die scheme and the wound/wait scheme.
Concurrency control with optimistic methods assumes that the majority of database
transactions do not conflict and that transactions are executed concurrently, using private,
temporary copies of the data. At commit time, the private copies are updated to the database.
MVCC approach maintains a number of versions of a data item and allocates the right version
to a read operation of a transaction. Thus unlike other mechanisms a read operation in this
mechanism is never rejected.
Prepared by : [Link].M.T Page 13