Concurrency :
In a Database Management System (DBMS),
Concurrency control is a mechanism in DBMS that allows
simultaneous execution of transactions while
maintaining ACID properties - Atomicity, Consistency,
Isolation and Durability. It maintains the integrity,
accuracy and reliability of data when multiple users or
processes perform read/write operations concurrently. It
helps manage issues like:
Conflicting operations on shared data
Inconsistent database states
Lost or incorrect updates
Prevent conflicts between simultaneous transactions.
Maintain data consistency and accuracy in multi-user
environments.
Avoid problems such as dirty reads, lost updates and
inconsistent reads.
Concurrency executions:
Concurrent Execution in DBMS
● In a multi-user system, multiple users can access and use the
same database at one time,
which is known as the concurrent execution of the database. It
means that the same
database is executed simultaneously on a multi-user system by
different users.
● While working on the database transactions, there occurs the
requirement of using the
database by multiple users for performing different operations,
and in that case,
concurrent execution of the database is performed.
● The thing is that the simultaneous execution that is performed
should be done in an
interleaved manner, and no operation should affect the other
executing operations, thus
maintaining the consistency of the database. Thus, on making the
concurrent execution of
the transaction operations, there occur several challenging
problems that need to be
solved.
Serializability:
Serializability in DBMS
What is Serializability?
Serializability of schedules ensures that a non-serial schedule is
equivalent to a serial schedule. It helps in maintaining the
transactions to execute simultaneously without interleaving one
another. In simple words, serializability is a way to check if the
execution of two or more transactions are maintaining the
database consistency or not.
Types of Serializability
Serializability of any non-serial schedule can be verified using two types
mainly:
1. Conflict Serializability
2. View Serializability.
What is Conflict Serializability in DBMS?
Conflict Serializability checks if a non-serial schedule is conflict serializable
or not. A non-serial schedule is conflict serializable if it can convert into a
serial schedule by swapping its non conflicting operations.
Conflicting Operations
A pair of operations are said to be conflicting operations if they follow the
set of conditions given below:
● Each operation is a part of different transactions.
● Both operations get performed on the same data item.
● One of the performed must be a write operation.
Let's consider two different transactions, Ti and Tj .Then considering the
above conditions, the following table follows:
Transactio Transactio IsConflictin
ni nj g
Readi (X) Readj (X) Non-
conflicting
Readi (X) Writej (X) Conflicting
Writei (X) Readj (X) Conflicting
Writei (X) Writej (X) Conflicting
Example
Let's see an example based on the following schedule.
Transactio Transactio
n1 n2
R1(A) W2(B)
W1(A)
R2(A)
R1(B) W2(A)
In the above schedule, we can notice that:
● W1(A) and R2(A) are part of different transactions.
● Both apply to the same data item, i.e., A.
● W1(A) is a write operation.
W1(A) and R2(A) are conflicting operations as they satisfy all the above
conditions.
Similarly, W1(A) and W2(A) are conflicting operations as they are part of
different transactions working on the same data item, and one of them is
the write operation.
W1(A) and W2(B) are non-conflicting operations as they work on different
data items and thus do not satisfy all the given conditions.
R1(A) and R2(A) are non-conflicting operations as none of them is a write
operation and thus does not satisfy the third condition.
W1(A) and R1(A) are non-conflicting as they belong to the same
transactions and thus do not satisfy the first condition.
Conflict Equivalent
If a schedule gets converted into another schedule by swapping the non-
conflicting operations, they are said to be conflict equivalent schedules.
Checking Whether a Schedule is Conflict Serializable Or Not
A non-serial schedule gets checked for conflict serializability using the
following steps:
1. Figure out all the conflicting operations and enlist them.
2. Create a precedence graph. For every transaction in the schedule, draw
a node in the precedence graph.
3. If Xi(A) and Yj(A) represent a conflict pair, then draw an edge from Ti to
Tj for each conflict pair. The precedence graph ensures that Ti gets
executed before Tj. 4. The next step involves checking for any cycle
formed in the graph. A schedule is a conflict serializable if there is no
cycle present.
Example:
We have a schedule "S" having three transactions t1, t2, and t3 working
simultaneously. Let's form a precedence graph.
t1 t t
2 3
R(
x)
R(y
)
R(y
)
W(
y)
W(
x)
W(
x)
R(x
)
W(
x)
As there is no loop in the graph, it is a conflict serializable schedule as well
as a serial schedule. Since it is a serial schedule, we can detect the order
of transactions as well.
The order of the Transactions: t1 will execute first as there is no incoming
edge on T1. t3 will execute second as it depends on T1 only. t2 will
execute at last as it depends on both T1 and T3.
So, order of its equivalent serial schedule is: t1 --> t3 --> t2
NOTE:
If a schedule is conflicting serializable, then it is surely a consistent
schedule. On the other hand, a non-conflicting serializable schedule may
or may not be serial. To further check its serial behavior, we use the
concept of View Serializability.
View Serializability in DBMS
What is View Serializability?
A system handles multiple transactions running concurrently, there is a
possibility that a few of them leaves the system databases in an
inconsistent state and thus, the system fails. To be aware of such
schedules and save the system from failures, we check if schedules are
serializable or not. If the given schedule is serializable implies it will never
leave the database in an inconsistent state.
Serial schedules are the schedules where no two transactions are
executed concurrently. Thus, serial schedules never leave the database in
an inconsistent state. If the given schedule is view serializable, then we
are sure that it is a consistent schedule or serial schedule.
View Serializability is a process used to check whether the given schedule
is view serializable or not. To do so we check if the given schedule is View
Equivalent to its serial schedule.
What is view equivalency?
The two conditions needed by schedules (S1 and S2) to be view
equivalent are:
1. Initial read must be on the same piece of data.
Example: If transaction t1 is reading "A" from database in schedule S1,
then in schedule S2, t1 must read A.
2. Final write must be on the same piece of data.
Example: If a transaction t1 updated A at last in S1, then in S2, t1 should
perform final write as well.
3. The mid sequence should also be in the same order.
Example: If t1 is reading A which is updated by t2 in S1, then in S2, t1
should read A which should be updated by t2.
This process of checking view equivalency of a schedule is called View
Serializability.
Example
Let us understand View-Serializability with an example of a schedule
S~1~.
T1 T2
R(X)
X=X+
10
W(A)
R(X)
X=X+
10
W(X)
R(Y)
Y=Y+
20
W(Y)
R(Y)
Y=Y*1
.1
W(Y)
Checking conflict serializability Precedence Graph of S1:
● R1(X), W2(X) (T1 → T2)
● R2(X), W1(X) (T2 → T1)
● W1(X), W2(X) (T1 → T2)
● R1(Y), W2(Y) (T1 → T2)
● R2(Y), W1(Y) (T2 → T1)
● W1(Y), W2(Y) (T1 → T2)
The conflict precedence graph is as follows:
The above graph contains cycle/loop thus, it is not conflict serializable but
by only considering the precedence graph we can't conclude if the given
schedule is consistent or not.
In the above example if we swap a few transaction operations, and format
the table as follows:
T1 T2
R(X)
X=X+
10
W(A)
R(Y)
Y=Y+
20
W(Y)
R(X)
X=X+
10
W(X)
R(Y)
Y=Y*1
.1
W(Y)
Now the precedence graph is as follows:
The precedence graph doesn't contain any cycle or loop which implies the
given schedule is conflict serializable and the result will be the same as
that of the first table.
Note: If a schedule is conflict serializable, then it will also be view-
serializable, consistent as well as equivalent to a serial schedule. The
view serializable schedule may or may not be conflict serializable but it is
consistent as well.
But if the given schedule does not conflict serializable, then we need to
check for View Serializability. If it is view serializable then it is consistent
else it is inconsistent. Thus, to address the limitations of conflict
serializability, the concept of view-serializability was introduced.
* Concurrency Control Protocols
To avoid concurrency control problems and to maintain consistency and
serializability during the execution of concurrent transactions some rules
are made. These rules are known as Concurrency Control Protocols.
There are 3 types of Concurrency Control Protocols
1. Lock-Based Protocols
2. Time-based Protocols
Lock-Based Protocol
Why Do We Need Locks?
Locks are essential in a database system to ensure:
1. Consistency: Without locks, multiple transactions could modify the
same data item simultaneously, resulting in an inconsistent state.
2. Isolation: Locks ensure that the operations of one transaction are
isolated from other transactions, i.e., they are invisible to other
transactions until the transaction is committed.
3. Concurrency: While ensuring consistency and isolation, locks also
allow multiple transactions to be processed simultaneously by the
system, optimizing system throughput and overall performance.
4. Avoiding Conflicts: Locks help in avoiding data conflicts that might
arise due to simultaneous read and write operations by different
transactions on the same data item.
5. Preventing Dirty Reads: With the help of locks, a transaction is
prevented from reading data that hasn't yet been committed by another
transaction.
In this type of protocol, any transaction cannot read or write data until it
acquires an appropriate lock on it. There are two types of lock:
Shared lock:
● It is also known as a Read-only lock. In a shared lock, the data item can
only be read by the transaction.
● It can be shared between the transactions because when the
transaction holds a lock, then it can't update the data on the data item.
Exclusive lock:
● In the exclusive lock, the data item can be both read as well as written
by the transaction. ● This lock is exclusive, and in this lock, multiple
transactions do not modify the same data simultaneously.
Lock Compatibility Matrix
A vital point to remember when using Lock-based protocols in Database
Management System is that a Shared Lock can be held by any amount of
transactions. On the other hand, an Exclusive Lock can only be held by
one transaction in DBMS, this is because a shared lock only reads data
but does not perform any other activities, whereas an exclusive lock
performs read as well as writing activities.
The figure given below demonstrates that when two transactions are
involved, and both of these transactions seek to read a specific data item,
the transaction is authorized, and no conflict occurs; but, in a situation
when one transaction intends to write the data item and another
transaction attempts to read or write simultaneously, the interaction is
rejected.
The two methods outlined below can be used to convert between the
locks:
1. Conversion from a read lock to a write lock is an upgrade.
2. Conversion from a write lock to a read lock is a downgrade.
There are four types of lock protocols available:
1. Simplistic lock protocol
It is the simplest way of locking the data while transaction. Simplistic lock-
based protocols allow all the transactions to get the lock on the data
before insert or delete or update on it. It will unlock the data item after
completing the transaction.
2. Pre-claiming Lock Protocol
● Pre-claiming Lock Protocols evaluate the transaction to list all the data
items on which they need locks.
● Before initiating an execution of the transaction, it requests DBMS for all
the locks on all those data items.
● If all the locks are granted then this protocol allows the transaction to
begin. When the transaction is completed then it releases all the locks.
● If all the locks are not granted then this protocol allows the transaction
to roll back and waits until all the locks are granted.
3. Two-phase locking (2PL)
○ The two-phase locking protocol divides the execution phase of the
transaction into three parts.
○ In the first part, when the execution of the transaction starts, it seeks
permission for the lock it requires.
○ In the second part, the transaction acquires all the locks. The third
phase is started as soon as the transaction releases its first lock.
○ In the third phase, the transaction cannot demand any new locks. It
only releases the acquired locks.
There are two phases of 2PL:
Growing phase: In the growing phase, a new lock on the data item may
be acquired by the transaction, but none can be released.
Shrinking phase: In the shrinking phase, existing locks held by the
transaction may be released, but no new locks can be acquired.
In the below example, if lock conversion is allowed then the following
phase can happen:
1. Upgrading of lock (from S(a) to X (a)) is allowed in the growing phase.
2. Downgrading of lock (from X(a) to S(a)) must be done in a shrinking
phase.
Two-phase locking helps to reduce the amount of concurrency in a
schedule Example:
The following way shows how unlocking and locking work with 2-
PL. Transaction T1:
○ Growing phase: from step 1-3
○ Shrinking phase: from step 5-7
○ Lock point: at 3
Transaction T2:
○ Growing phase: from step 2-6
○ Shrinking phase: from step 8-9
○ Lock point: at 6
Advantages of Two-Phase Locking (2PL)
● Ensures Serializability: 2PL guarantees conflict-serializability,
ensuring the consistency of the database.
● Concurrency: By allowing multiple transactions to acquire locks and
release them, 2PL increases the concurrency level, leading to better
system throughput and overall performance.
● Avoids Cascading Rollbacks: Since a transaction cannot read a value
modified by another uncommitted transaction, cascading rollbacks are
avoided, making recovery simpler.
Disadvantages of Two-Phase Locking (2PL)
● Deadlocks: The main disadvantage of 2PL is that it can lead to
deadlocks, where two or more transactions wait indefinitely for a resource
locked by the other.
● Reduced Concurrency (in certain cases): Locking can block
transactions, which can reduce concurrency. For example, if one
transaction holds a lock for a long time, other transactions needing that
lock will be blocked.
● Overhead: Maintaining locks, especially in systems with a large
number of items and transactions, requires overhead. There's a time cost
associated with acquiring and releasing locks, and memory overhead for
maintaining the lock table.
● Starvation: It's possible for some transactions to get repeatedly
delayed if other transactions are continually requesting and acquiring
locks.
Starvation
Starvation is the situation when a transaction needs to wait for an
indefinite period to acquire a lock.
Following are the reasons for Starvation:
● When waiting scheme for locked items is not properly managed
● In the case of resource leak
● The same transaction is selected as a victim repeatedly
Deadlock
Deadlock refers to a specific situation where two or more processes are
waiting for each other to release a resource or more than two processes
are waiting for the resource in a circular chain.
Categories of Two-Phase Locking in DBMS
1. Strict Two-Phase Locking
2. Rigorous Two-Phase Locking
3. Conservative (or Static) Two-Phase Locking:
Strict Two-phase locking (Strict-2PL)
○ The first phase of Strict-2PL is similar to 2PL. In the first phase, after
acquiring all the locks, the transaction continues to execute normally.
○ The only difference between 2PL and strict 2PL is that Strict-2PL does
not release a lock after using it.
○ Strict-2PL waits until the whole transaction to commit, and then it
releases all the locks at a time.
○ Strict-2PL protocol does not have a shrinking phase of lock release.
It does not have cascading abort as 2PL does.
Rigorous Two-Phase Locking
● A transaction can release a lock after using it, but it cannot commit until
all locks have been acquired.
● Like strict 2PL, rigorous 2PL is deadlock-free and ensures serializability.
Conservative or Static Two-Phase Locking
● A transaction must request all the locks it will ever need before it begins
execution. If any of the requested locks are unavailable, the transaction is
delayed until they are all available.
● This approach can avoid deadlocks since transactions only start when
all their required locks are available.
* [Link]-based Protocols
Timestamp based Protocol in DBMS is an algorithm which uses the System
Time or Logical Counter as a timestamp to serialize the execution of
concurrent transactions. The Timestamp-based protocol ensures that
every conflicting read and write operations are executed in a timestamp
order.
The older transaction is always given priority in this method. It uses
system time to determine the time stamp of the transaction. This is the
most commonly used concurrency protocol. Lock-based protocols help us
to manage the order between the conflicting transactions when they will
execute. Timestamp-based protocols manage conflicts as soon as an
operation is created.
Example:
Suppose there are three transactions T1, T2, and T3.
T1 has entered the system at time 0010
T2 has entered the system at 0020
T3 has entered the system at 0030
Priority will be given to transaction T1, then transaction T2 and lastly
Transaction T3. Basic timestamp ordering protocol can be checked by the
following conditions:
1. When a transaction T performs a Write _item(X) operation check the
following conditions :
● If Read timestamp of data item X is greater than the Timestamp of the
Transaction T i.e R_TS(X) > TS(T) or if Write Timestamp of the data item X
is greater than the timestamp of the transaction i.e W_TS(X) > TS(T),
then abort and rollback T and reject the operation. else,
● Execute the Write_ item(X) operation of T and set W_TS(X) to TS(T).
2. When a transaction T performs a Read_ item(X) operation check the
following conditions :
● If Write Timestamp of data item X is greater than the timestamp of the
transaction T i.e W_TS(X) > TS(T), then abort and reject T and reject the
operation, else
● If Write Timestamp of data item X is less than or equal to the timestamp
of the transaction T i.e W_TS(X) <= TS(T), then execute the R_item(X)
operation of T and set R_TS(X) to the larger of TS(T) and current R_TS(X).
Whenever there is a conflicting operation that violates the timestamp
ordering protocol then the later operation is rejected and the transaction
is aborted. Schedules created by the Basic timestamp ordering protocol
are conflict serializable and deadlock-free.
One of the drawbacks of the Basic timestamp ordering protocol is that
cascading Rollback is possible in the timestamp ordering protocol. For
Example, if transactions T1 and T2 use a value written by T1. If T1 aborts
and rolls back before committing the transaction then T2 must also be
aborted and rolled back. So cascading problems may occur in the Basic
timestamp ordering protocol.
Strict Timestamp Ordering
strict timestamp ordering is a variation of basic timestamp ordering. Strict
timestamp ordering ensures that the transaction is both strict and
conflicts serializable. In Strict timestamp ordering a transaction T that
issues a Read_ item(X) or Write_ item(X) such that TS(T) > W_TS(X) has its
read or write operation delayed until the Transaction T‘that wrote the
values of X has committed or aborted.
Thomas write Rule
Thomas Write Rule provides the guarantee of serializability order for the
protocol. It improves the Basic Timestamp Ordering Algorithm.
The basic Thomas write rules are as follows:
● If TS(T) < R_TS(X) then transaction T is aborted and rolled back, and
operation is rejected.
● If TS(T) < W_TS(X) then don't execute the W_ item(X) operation of the
transaction and continue processing.
● If neither condition 1 nor condition 2 occurs, then allowed to execute
the WRITE operation by transaction Ti and set W_TS(X) to TS(T).
Advantages
● Timestamp-based protocols ensure serializability as the transaction is
ordered on their creation timestamp.
● No deadlock occurs when timestamp ordering protocol is used as no
transaction waits. ● No older transaction waits for a longer period of time
so the protocol is free from deadlock.
● Timestamp based protocol ensures that there are no conflicting items in
the transaction execution.
Disadvantages
● Timestamp-based protocols may not be cascade free or recoverable.
● In timestamp based protocol there is a possibility of starvation of long
transactions if a sequence of conflicting short transactions causes
repeated restarting of the long transaction.