UNIT 04
Transaction Processing Concept
Transaction System in DBMS
A Transaction System in a Database Management System (DBMS) is a mechanism that
manages and coordinates multiple operations performed on a database, ensuring that
all operations are executed safely, reliably, and correctly — especially in multi-user
environments.
A transaction is the smallest logical unit of work in a DBMS.
Example: Transferring ₹1000 from Account A to Account B is one transaction.
A Transaction System makes sure that such operations follow strict rules to protect the
database from errors, failures, and inconsistencies.
Key Components of a Transaction System
1. Transaction Manager (TM)
• Controls the execution of transactions.
• Ensures transactions follow the ACID properties.
• Coordinates with:
o Concurrency Controller
o Recovery Manager
o Scheduler
2. Scheduler
• Determines the order of transaction execution.
• Prevents conflicts between parallel transactions.
• Produces serializable schedules.
3. Concurrency Control Manager
• Handles multi-user access.
• Prevents problems like:
o Dirty Read
o Unrepeatable Read
o Phantom Read
• Uses techniques like:
o Locking (2PL)
o Timestamp ordering
o Validation
4. Recovery Manager
• Restores database to correct state after:
o System crashes
o Transaction failures
• Uses:
o Logs
o Checkpoints
o Rollback/Roll forward
5. Logging System
• Every action of the transaction is recorded.
• Used for recovery during failures.
• Common format: Write-Ahead Logging (WAL)
Characteristics of a Transaction System
A proper transaction system must ensure:
1. Atomicity
• Transaction is all-or-nothing.
• If one step fails → whole transaction is rolled back.
2. Consistency
• Ensures the database moves from one valid state to another valid state.
3. Isolation
• Each transaction executes as if it is the only one.
• Prevents interference between transactions.
4. Durability
• Once a transaction commits, changes are permanent.
• Even if system crashes → data remains safe.
These four properties together are called ACID Properties.
Functions of a Transaction System
Ensures Correct Execution
Checks that every operation is valid and follows DB constraints.
Manages Concurrent Users
Allows thousands of users to work together safely without conflict.
Prevents Data Loss
Uses logs and recovery techniques to avoid losing important data.
Maintains Database Integrity
Protects data from corruption due to:
• Crashes
• Errors
• Deadlocks
• Incomplete operations
Why Transaction Systems Are Important?
A transaction system is essential for any database that requires:
• Banking systems
• Railway reservations
• Online shopping
• Payment gateways
• Stock management
• Cloud databases
Example of a Transaction
Example Transaction: Transfer ₹1000 from A to B
Steps:
1. Read A’s balance
2. Subtract 1000
3. Update A
4. Read B’s balance
5. Add 1000
6. Update B
7. Commit
If any step fails → Rollback occurs.
ACID Properties in DBMS
ACID stands for:
A – Atomicity
C – Consistency
I – Isolation
D – Durability
1. Atomicity (All or Nothing)
Definition
Atomicity ensures that a transaction is treated as a single indivisible unit.
Either all operations of the transaction are completed, or none are.
If any part fails → the entire transaction is rolled back.
Example
Transaction: Transfer ₹500 from A to B
Steps:
1. Read A’s balance
2. Deduct 500
3. Update A
4. Add 500 to B
5. Update B
If the database system crashes after Step 3 but before Step 4 →
Atomicity rolls back the changes to A’s account.
No partial transaction is allowed.
Result: Money does not get deducted unless added to B.
2. Consistency
Definition
Consistency ensures that the database always moves from one valid state to another
valid state after a transaction.
A transaction must follow:
• All constraints
• Data rules
• Triggers
• Integrity constraints (PK, FK, Check)
Example
Rule: Balance cannot become negative.
If a transaction tries to deduct ₹10,000 from an account with balance ₹5,000 →
This violates a constraint.
The transaction is aborted to maintain consistency.
Result: Database stays in a valid state.
3. Isolation
Definition
Isolation ensures that multiple transactions running at the same time do not affect each
other.
Each transaction should behave as if it is running alone in the system.
Isolation prevents issues like:
• Dirty Read
• Unrepeatable Read
• Phantom Read
Example
Two transactions running together:
• T1: Transfer ₹500 from A to B
• T2: Read balance of A
If T2 reads A’s balance after deduction but before T1 commits, it gets incorrect (dirty)
data.
Isolation ensures T2 will only see the final committed value, not intermediate steps.
4. Durability
Definition
Durability ensures that once a transaction is committed, its changes are permanent,
even if the system crashes immediately after.
The DBMS stores committed data in non-volatile memory like logs.
Example
Transaction commits a ₹500 transfer from A to B.
Just after commit → system power fails.
When the system restarts →
The updated balances for both A and B still exist.
DBMS recovery ensures committed data is not lost.
TESTING OF SERIALIZABILITY (IN DETAILS)
Serializability is the most important concept in concurrency control.
It tells us whether a given schedule of transactions is correct or not.
A schedule is serializable if its final result is same as some serial schedule of the same
transactions.
Two types of serializability testing are used:
1. Conflict Serializability
How to test?
We test conflict serializability using a Precedence Graph (also called Serialization
Graph).
A. Precedence Graph Construction Steps
A precedence graph is a directed graph, where:
• Nodes = Each transaction (T1, T2, T3…)
• Edges = Conflicts between transactions
An edge Ti → Tj means
Ti must come before Tj in any serial schedule.
B. When do two operations conflict?
Two operations conflict if:
1. They belong to different transactions, and
2. They access the same data item, and
3. At least one is a write.
Types of conflicts:
• Read–Write (RW)
• Write–Read (WR)
• Write–Write (WW)
Steps to Test Conflict Serializability
Step 1: List all conflicting operation pairs.
Step 2: For each conflict, draw an edge.
Step 3: Check the graph for cycles.
D. Rule
If the graph has NO cycle → Schedule is conflict serializable
If the graph has a cycle → Not conflict serializable
Example (Very Common Exam Question)
Schedule S:
T1: R(A) ,W(A)
T2: R(A) ,W(A)
● Step 1: Identify conflicts
Conflicting operations on A:
• T1: W(A) → T2: R(A) (WR conflict) → Edge: T1 → T2
• T1: W(A) → T2: W(A) (WW conflict) → Edge: T1 → T2
● Step 2: Precedence Graph
Nodes: T1, T2
Edges: T1 → T2
● Step 3: Cycle?
No
Schedule is Conflict Serializable
Equivalent serial order: T1 → T2
2. View Serializability (Second Type)
If a schedule is not conflict serializable, it may still be view serializable.
A Schedule is View Serializable if:
1. Initial Reads Match
Each transaction must read the same initial value as in the equivalent serial schedule.
2. Reads From Relation Match
If Ti reads value written by Tj in a schedule,
the same must happen in the serial schedule.
This is called Read-From relationship.
3. Final Writes Match
The final committed write on each data item must be the same as in the equivalent
serial order.
Testing View Serializability
Steps:
1. Identify who writes first on each item (initial write).
2. Identify who reads from whom (read-from relation).
3. Identify who writes last on each item (final write).
4. Check if there exists a serial schedule that satisfies all 3 rules.
5.
Note:
All conflict serializable schedules are view serializable
But some view serializable schedules are not conflict serializable.
Special Case: Blind Writes
A blind write = a transaction writes data without reading it first.
Example: W(A) only
Such schedules may be view serializable but not conflict serializable.
Example of View Serializable but NOT Conflict Serializable
Schedule S:
T1: W(A)
T2: R(A)
Conflicts:
• W(A) of T1 and R(A) of T2 → T1 → T2
• But no conflict forcing T2 → T1
This schedule may not be conflict serializable,
but can be view serializable as view conditions may be satisfied.
Schedule in DBMS (Transaction Scheduling)
A schedule in DBMS is the sequence (order) of operations from multiple transactions
executed by the database system.
Types of Schedules
Serial Schedule
Only one transaction executes at a time.
T1: R(X) W(X)
T2: R(Y) W(Y)
No conflicts
Always consistent
Slow
Non-Serial Schedule
Operations of multiple transactions are interleaved.
Multiple Transaction will be executed at a Time
Non-serial schedule is also known as parallel schedule.
Faster
May cause conflicts
Must be checked for serializability
Dirty Read Problem
A Dirty Read occurs in a database when a transaction reads uncommitted data written
by another transaction.
Or
This is a major concurrency problem because the uncommitted data may later be rolled
back, making the first transaction’s read invalid or incorrect.
Or
Dirty Read is a concurrency problem where one transaction reads data written by
another transaction before it is committed. If that data is rolled back, the read value
becomes invalid.
A Dirty Read happens when:
• Transaction T1 modifies some data but has not committed.
• Transaction T2 reads that modified data before T1 commits.
• If T1 rolls back, then the value read by T2 becomes invalid (dirty).
Unrepeatable Read Problem
An Unrepeatable Read occurs when a transaction reads the same row twice and gets
different values because another transaction modified or deleted that row in between.
Unrepeatable Read happens when:
• Transaction T1 reads a row.
• Transaction T2 modifies or deletes that row and commits.
• T1 reads the same row again and gets a different value.
Or
Unrepeatable Read is a concurrency problem where a transaction reads the same data
twice and gets different results because another transaction updated or deleted the
data in between.
Phantom Read Problem
A transaction reads a set of rows that satisfy some condition,
and later reads the same condition again, but new rows appear or some disappear
because another transaction inserted or deleted rows in between.
A Phantom Read occurs when a transaction re-executes a query and finds new or
missing rows because another transaction inserted or deleted rows in the meantime.
Lost Update Problem
A Lost Update occurs when two transactions read the same data and update it, but
one of the updates gets overwritten, causing loss of data.
It happens when transactions execute concurrently without proper locking.
A Lost Update Problem is a concurrency issue where:
• Two transactions (T1 and T2) read the same data.
• Both compute new values independently.
• The last write overwrites the previous write, causing one update to be lost.
Or
Lost Update is a concurrency problem where two transactions update the same data,
and one transaction's update overwrites the other, causing the earlier update to be
lost.