Explain tranction states in DBMS with block diagram.
Transaction states represent the different phases a transaction goes through during its lifecycle
in a database system. These states help track the progress and outcome of a transaction to
ensure data consistency and integrity.
Different Types of Transaction States in DBMS
These are the different types of Transaction States:
1. Active State
2. Partially Committed State
3. Committed State
4. Failed State
5. Aborted State
6. Terminated State
Transaction States
1. Active State
It is the first stage of any transaction when it has begun to execute. The execution of the
transaction takes place in this state.
Operations such as insertion, deletion, or updation are performed during this state.
During this state, the data records are under manipulation and they are not saved to the
database, rather they remain somewhere in a buffer in the main memory.
2. Partially Committed
The transaction has finished its final operation, but the changes are still not saved to the
database.
After completing all read and write operations, the modifications are initially stored in
main memory or a local buffer. If the changes are made permanent in the database then the
state will change to "committed state" and in case of failure it will go to the "failed state".
3. Committed
This state of transaction is achieved when all the transaction-related operations have been
executed successfully along with the Commit operation, i.e. data is saved into the database
after the required manipulations in this state. This marks the successful completion of a
transaction.
4. Failed State
If any of the transaction-related operations cause an error during the active or partially
committed state, further execution of the transaction is stopped and it is brought into a failed
state. Here, the database recovery system makes sure that the database is in a consistent state.
5. Aborted State
If a transaction reaches the failed state due to a failed check, the database recovery system will
attempt to restore it to a consistent state. If recovery is not possible, the transaction is either
rolled back or cancelled to ensure the database remains consistent.
In the aborted state, the DBMS recovery system performs one of two actions:
Kill the transaction: The system terminates the transaction to prevent it from affecting
other operations.
Restart the transaction: After making necessary adjustments, the system reverts the
transaction to an active state and attempts to continue its execution.
6. Terminated State
It refers to the final state of a transaction, indicating that it has completed its execution. Once a
transaction reaches this state, it has either been successfully committed or aborted. In this state,
no further actions are required from the transaction, as the database is now stable.
What is Concurrency Control ? Explain.
Concurrency control concept comes under the Transaction in database management
system (DBMS).
It is a procedure in DBMS which helps us for the management of two simultaneous
processes to execute without conflicts between each other, these conflicts occur in multi
user systems.
Concurrency can simply be said to be executing multiple transactions at a time. It is
required to increase time efficiency.
If many transactions try to access the same data, then inconsistency arises. Concurrency
control required to maintain consistency data.
For example, if we take ATM machines and do not use concurrency, multiple persons
cannot draw money at a time in different places. This is where we need concurrency.
Advantages
The advantages of concurrency control are as follows −
Waiting time will be decreased.
Response time will decrease.
Resource utilization will increase.
System performance & Efficiency is increased.
Control concurrency
The simultaneous execution of transactions over shared databases can create
several data integrity and consistency problems.
For example, if too many people are logging in the ATM machines, serial updates
and synchronization in the bank servers should happen whenever the transaction is
done, if not it gives wrong information and wrong data in the database.
Concurrency control techniques
The concurrency control techniques are as follows −
Locking
Lock guaranties exclusive use of data items to a current transaction. It first
accesses the data items by acquiring a lock, after completion of the transaction it
releases the lock.
Types of Locks
The types of locks are as follows −
1. Shared Lock [Transaction can read only the data item values]
It is also known as a Read-only lock. In a shared lock, the data item can only
read by the transaction.
It can be shared between the transactions because when the transaction holds a
lock, then it can't update the data on the data item.
2. Exclusive Lock [Used for both read and write data item values]
In the exclusive lock, the data item can be both reads as well as written by
the transaction.
This lock is exclusive, and in this lock, multiple transactions do not modify
the same data simultaneously
Concurrency Control Protocols
Concurrency control protocols define rules to ensure correct and consistent execution of
transactions. The main protocols are:
Concurrency Control Protocols
1. Lock-Based Concurrency Control:
Uses locks to restrict access to data items during a transaction. Common types include
shared locks (read) and exclusive locks (write).
Ensures serializability and prevents conflicts.
Example: Two-Phase Locking (2PL) guarantees that once a transaction releases a lock, it
cannot obtain any new locks.
2. Timestamp-Based Concurrency Control:
Each transaction is assigned a timestamp.
The DBMS uses these timestamps to order transactions and prevent conflicts based on their
start time.
Explain dead lock with its characteristics.
A deadlock occurs in a multi-user database environment when two or more transactions block
each other indefinitely by each holding a resource the other needs. This results in a cycle of
dependencies (circular wait) where no transaction can proceed.
For Example: Consider the image below
Deadlock in DBMS
In the above image, we can see that:
T1 locks Resource "Student" and needs Resource "Grade"
T2 locks Resource "Grade" and needs Resource "Student"
T1 waits for T2, T2 waits for T1, hence resulting in a deadlock
Necessary Conditions of Deadlock
For a deadlock to occur, all four of these conditions must be true:
Mutual Exclusion: Only one transaction can hold a particular resource at a time.
Hold and Wait: The Transactions holding resources may request additional resources held
by others.
No Preemption: The Resources cannot be forcibly taken from the transaction holding them.
Circular Wait: A cycle of transactions exists where each transaction is waiting for the
resource held by the next transaction in the cycle.
Why Deadlocks Are a Problem?
Transactions are stuck indefinitely.
System throughput decreases as transactions remain blocked.
Resources are held unnecessarily, preventing other operations.
Can lead to performance bottlenecks or even system-wide standstill if not handled.
Real-Life Example
Transaction T1:
Locks rows in Students
Wants to update rows in Grades
Transaction T2:
Locks rows in Grades
Wants to update rows in Students
Both wait on each other and this results in deadlock, and all database activity comes to a
standstill.
How to Handle Deadlocks
There are some approaches and by ensuring them, we can handle deadlocks. They are discussed
below:
1. Deadlock Avoidance
Plan transactions in a way that prevents deadlock from occurring.
Methods:
Access resources in the same order. For e.g., always access Students first, then Grades
Use row-level locking and READ COMMITTED isolation level. It reduces chances, but
doesn’t eliminate [Link]
2. Deadlock Detection
If a transaction waits too long, the DBMS checks if it’s part of a deadlock.
Method: Wait-For Graph
Nodes: Transactions
Edges: Waiting relationships
If there’s a cycle, a deadlock exists. It's mostly suitable for small to medium databases
3. Deadlock Prevention
For a large database, the deadlock prevention method is suitable. A deadlock can be prevented
if the resources are allocated in such a way that a deadlock never occurs. The DBMS analyzes
the operations whether they can create a deadlock situation or not, If they do, that transaction is
never allowed to be executed.
Deadlock prevention mechanism proposes two schemes:
1. Wait-Die Scheme (Non-preemptive)
Older transactions are allowed to wait.
Younger transactions are killed (aborted and restarted) if they request a resource held by an
older one
For example:
Consider two transaction- T1 = 10 and T2 = 20
If T1 (older) wants a resource held by T2 → T1 waits
If T2 (younger) wants a resource held by T1 → T2 dies and restarts
Prevents deadlock by not allowing a younger transaction to wait and form a wait cycle.
2. Wound-Wait Scheme (Preemptive)
Older transactions are aggressive (preemptive) and can force younger ones to abort.
Younger transactions must wait if they want a resource held by an older one.
For example:
Consider two transaction- T1 = 10 and T2 = 20
If T1 (older) wants a resource held by T2 → T2 is killed, T1 proceeds.
If T2 (younger) wants a resource held by T1 → T2 waits
Prevents deadlock by not allowing younger transactions to block older ones.
Explain :
(i) Serializability
Serializability is a crucial concept in database management systems (DBMS) used to ensure that
the concurrent execution of multiple transactions results in the same outcome as some sequential
execution of those same transactions [1]. This property is essential for maintaining database
consistency and integrity in multi-user environments.
Key Concepts
Concurrency: When multiple transactions run at the same time, the DBMS interleaves their
operations (reads and writes) [1, 2].
Isolation: One of the ACID properties (Atomicity, Consistency, Isolation, Durability), isolation
aims to hide the effects of an incomplete transaction from others [1]. Serializability is the gold
standard for achieving full isolation [1, 2].
Serial Schedule: A schedule where transactions run one after another, with no interleaving of
operations. A serial schedule is always correct because each transaction sees the database in a
consistent state left by the previous one [1].
Definition of Serializability
A concurrent schedule is serializable if its final outcome (the resulting state of the database) is
equivalent to the outcome of at least one possible serial schedule of the same transactions [1, 2].
Types of Serializability
The DBMS uses different methods to determine and enforce serializability:
1. Conflict Serializability: This is the most common and robust form. A schedule is conflict
serializable if it can be transformed into a serial schedule by swapping only non-conflicting
operations (e.g., two reads of the same data item, or operations on different data items) [1, 2].
Conflicts occur when two operations from different transactions access the same data item, and
at least one of them is a write operation [1].
2. View Serializability: A more relaxed definition that ensures each transaction reads the same
values in the concurrent schedule as it would in a serial schedule. This is harder to check in
practice than conflict serializability [2].
(ii) Buffer Management
Buffer management is a critical component of a Database Management System (DBMS) that
handles the transfer of data blocks between the computer's main memory (RAM) and the
secondary storage (disk). Its primary goal is to minimize the slow disk I/O operations and
improve overall database performance.
Key Concepts
Buffer Pool: A designated area in main memory reserved by the DBMS to hold copies of disk
blocks that are currently in use or anticipated to be needed soon.
Buffer Manager: A software system responsible for fetching data from disk into the buffer pool
when requested by the query processor, and writing updated data back to the disk when
necessary.
How It Works
1. Request: When the DBMS needs a specific data block, it asks the buffer manager.
2. Check Buffer Pool: The manager first checks if the block is already in the buffer pool.
1. Hit: If found (a buffer hit), the data is provided instantly from fast RAM.
2. Miss: If not found (a buffer miss), the manager must fetch it from the disk.
3. Replacement (If pool is full): If the buffer pool is full, the manager selects a block to replace
(evict) using a replacement policy (see below).
4. Fetching and Pinning: The required block is loaded from disk into the freed buffer frame. The
block is marked as "pinned," meaning it cannot be removed while being actively used.
5. Writing Back: If an evicted block was modified while in memory (it is "dirty"), the manager
must write the updated version back to the disk before overwriting the buffer frame.
Objectives of Buffer Management
Minimize Disk Access: The primary objective is to maximize the buffer hit rate to avoid slow
I/O.
Manage Limited Memory: It efficiently utilizes the limited RAM to store the most frequently
or strategically needed data blocks.
Common Replacement Policies
The choice of policy determines which block to evict when memory is full:
LRU (Least Recently Used): Evicts the block that has not been accessed for the longest time.
This is a common and generally effective policy.
FIFO (First-In, First-Out): Evicts the block that has been in the buffer pool the longest,
regardless of how often it's used.
Clock Policy (or Second Chance): A more efficient approximation of LRU that uses a reference
bit to give frequently used pages a second chance before eviction.
What are ACID properties ? Explain
Transactions are fundamental operations that allow us to modify and retrieve data. However, to
ensure the integrity of a database, it is important that these transactions are executed in a way
that maintains consistency, correctness, and reliability even in case of failures / errors. This is
where the ACID properties come into play.
ACID stands for Atomicity, Consistency, Isolation, and Durability.
Properties Of ACID:
There are Four Properties of ACID
1. Atomicity
Atomicity means a transaction is all-or-nothing either all its operations succeed, or none are
applied. If any part fails, the entire transaction is rolled back to keep the database consistent.
Commit: If the transaction is successful, the changes are permanently applied.
Abort/Rollback: If the transaction fails, any changes made during the transaction are
discarded.
Example: Consider the following transaction T consisting of T1 and T2 : Transfer of $100
from account X to account Y .
Atomicity
If the transaction fails after completion of T1 but before completion of T2, the database would
be left in an inconsistent state. With Atomicity, if any part of the transaction fails, the entire
process is rolled back to its original state, and no partial changes are made.
2. Consistency
Consistency in transactions means that the database must remain in a valid state before and
after a transaction.
A valid state follows all defined rules, constraints, and relationships (like primary keys,
foreign keys, etc.).
If a transaction violates any of these rules, it is rolled back to prevent corrupt or invalid
data.
If a transaction deducts money from one account but doesn't add it to another (in a transfer),
it violates consistency.
Example: Suppose the sum of all balances in a bank system should always be constant. Before
a transfer, the total balance is $700. After the transaction, the total balance should remain $700.
If the transaction fails in the middle (like updating one account but not the other), the system
should maintain its consistency by rolling back the transaction.
Total before T occurs = 500 + 200 = 700 .
Total after T occurs = 400 + 300 = 700 .
Consistency
3. Isolation
Isolation ensures that transactions run independently without affecting each other. Changes
made by one transaction are not visible to others until they are committed.
It ensures that the result of concurrent transactions is the same as if they were run one after
another, preventing issues like:
Dirty reads: reading uncommitted data
Non-repeatable reads: data changes between two reads
Phantom reads: new rows appear during a transaction
Example: Consider two transactions T and T''.
X = 500, Y = 500
Explanation:
1. Transaction T:
T wants to transfer $50 from X to Y.
T reads Y (value: 500), deducts $50 from X (new X = 450), and adds $50 to Y (new Y =
550).
2. Transaction T'':
T'' starts and reads X (500) and Y (500).
It calculates the sum: 500 + 500 = 1000.
Meanwhile, values of X and Y change to 450 and 550 respectively.
So, the correct sum should be 450 + 550 = 1000.
Isolation ensures that T'' does not read outdated values while another transaction (T) is still
in progress.
Transactions should be independent, and T'' should access the final values only after T
commits.
This avoids inconsistent results, like the incorrect sum calculated by T''.
4. Durability:
Durability ensures that once a transaction is committed, its changes are permanently saved,
even if the system fails. The data is stored in non-volatile memory, so the database can recover
to its last committed state without losing data.
Example: After successfully transferring money from Account A to Account B, the changes
are stored on disk. Even if there is a crash immediately after the commit, the transfer details
will still be intact when the system recovers, ensuring durability.