0% found this document useful (0 votes)
10 views33 pages

Unit IV - Transaction Processing & Recovery

Unit IV covers transaction processing and recovery, detailing the concepts and properties of transactions, including ACID properties, transaction states, schedules, and serializability. It explains recovery mechanisms, log-based recovery techniques, checkpoints, and deadlock scenarios, emphasizing the importance of maintaining database consistency and integrity during operations. The document provides examples and real-life applications to illustrate each concept effectively.

Uploaded by

kushagra4004
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views33 pages

Unit IV - Transaction Processing & Recovery

Unit IV covers transaction processing and recovery, detailing the concepts and properties of transactions, including ACID properties, transaction states, schedules, and serializability. It explains recovery mechanisms, log-based recovery techniques, checkpoints, and deadlock scenarios, emphasizing the importance of maintaining database consistency and integrity during operations. The document provides examples and real-life applications to illustrate each concept effectively.

Uploaded by

kushagra4004
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

UNIT IV - TRANSACTION PROCESSING &

RECOVERY

TOPIC 1: TRANSACTION CONCEPTS & PROPERTIES

What is a Transaction?
A transaction is a sequence of database operations (read/write) that either all
succeed together or all fail together. It's an atomic unit of work.

Simple Explanation:​
Imagine transferring money from your bank account. The transaction has two steps:
1.​ Deduct from Account A
2.​ Add to Account B
Either BOTH happen, or NEITHER happens. If step 1 succeeds but step 2 fails
(system crash), your money would disappear. Transactions prevent this.

Real-Life Example:
●​ Online shopping: When you click "Buy Now," the transaction includes:
●​ Check inventory (read)
●​ Deduct from stock (write)
●​ Add to bill (write)
●​ Update your order status (write)
●​ If system crashes after inventory check but before bill update, the entire
transaction rolls back. Your item stays in stock, no charge.

ACID Properties of Transactions


What are ACID properties?

The four guarantees that every transaction must follow:

1. Atomicity (All or Nothing)

Simple: Either the entire transaction executes, or none of it executes.

Real Example:
●​ ATM withdrawal: Machine gives cash AND deducts from account, OR it does
nothing.
●​ If power cuts after cash dispensed but before deduction, the bank's system
rolls back (undoes) both operations.
2. Consistency (Data stays valid)

Simple: Database stays in a valid state before and after transaction. All
rules/constraints are maintained.

Real Example:
●​ Rule: Student can't have attendance > 100%
●​ Transaction tries to set attendance to 105%
●​ DBMS rejects entire transaction to maintain consistency.
3. Isolation (No interference)

Simple: While one transaction is running, other transactions don't see its incomplete
changes.

Real Example:
●​ You're checking your bank balance (Transaction A)
●​ Your salary is being deposited (Transaction B)
●​ You won't see the balance until salary deposit completely finishes
●​ Meanwhile, salary deposit doesn't see your balance-checking operations
4. Durability (Once done, it stays)

Simple: Once transaction commits (completes), data is permanently saved, even if


system crashes.

Real Example:
●​ You buy something online and see "Order Confirmed"
●​ Server crashes 1 second later
●​ Your order doesn't disappear. The data was written to hard disk and persists.

TOPIC 2: TRANSACTION STATES & STATE DIAGRAM

Transaction Lifecycle
A transaction goes through these states:

1. Active → 2. Partially Committed → 3. Committed (or Aborted)


Active State:
●​ Transaction is executing
●​ Read/write operations happening
●​ Changes are in memory (not yet saved to disk)
Partially Committed State:
●​ All operations finished
●​ Last write statement executed
●​ But data not yet saved to disk
●​ Still in memory
●​ Risk: System can still crash and data is lost
Committed State:
●​ Data written to permanent disk
●​ Transaction complete and successful
●​ All ACID properties satisfied
●​ Data is durable
Aborted/Failed State:
●​ Transaction rolled back (undone)
●​ All changes reversed
●​ Database returns to state before transaction started
●​ Can happen due to: user abort, system crash, constraint violation, deadlock
State Diagram (describe what to draw):
text

START → ACTIVE → PARTIALLY COMMITTED → COMMITTED (END)


↓ ↓
└──→ FAILED → ABORTED (END)

●​ ACTIVE to PARTIALLY COMMITTED: when last statement executes


●​ PARTIALLY COMMITTED to COMMITTED: when data written to disk (commit
operation)
●​ Any state to FAILED: if error occurs (constraint violation, deadlock, crash)
●​ FAILED to ABORTED: when rollback happens

TOPIC 3: SCHEDULES & SERIALIZABILITY

What is a Schedule?
A schedule is the order in which multiple transactions' operations are executed in a
DBMS.

Simple Explanation:

Imagine 3 people withdrawing from ATMs simultaneously:


●​ Person A: Check balance, withdraw $100
●​ Person B: Check balance, withdraw $50
●​ Person C: Check balance, withdraw $200
The schedule is the actual order these operations happen (not necessarily in this
sequence):
●​ Check balance (A)
●​ Withdraw $100 (A)
●​ Check balance (B)
●​ Check balance (C)
●​ Withdraw $50 (B)
●​ Withdraw $200 (C)
This is one possible schedule. Different orderings = different schedules.

Real Example:
●​ Bank has balance = $1000
●​ Transaction A: Withdraw $100 (read balance → write balance-100)
●​ Transaction B: Withdraw $100 (read balance → write balance-100)
Schedule 1 (Serial):
●​ A reads balance ($1000)
●​ A writes balance ($900)
●​ B reads balance ($900)
●​ B writes balance ($800)
●​ Result: Correct ✓
Schedule 2 (Concurrent - Wrong):
●​ A reads balance ($1000)
●​ B reads balance ($1000) ← Both read same initial value
●​ A writes balance ($900)
●​ B writes balance ($900) ← Both write same wrong value
●​ Result: Lost $100 ✗
This is why we need serializability — to ensure concurrent schedules give same
result as serial execution.
What is a Serial Schedule?
A schedule where one transaction completes fully before another starts.

Example:
●​ T1 completes entirely
●​ Then T2 completes entirely
●​ Then T3 completes entirely
Good: Always correct (maintains consistency)​
Bad: Slow (no parallelism)

What is Serializable Schedule?


A concurrent schedule that produces the same result as some serial schedule.

Simple Explanation:

You can run T1, T2, T3 simultaneously, but the final result should be as if you ran
them one-by-one in some order.

Real Example:

Two concurrent transactions:


●​ T1: Read Account A → Add $100 to Account B (transfer from A to B)
●​ T2: Read Account B → Calculate interest
Serializable Schedule:
●​ Even though T1 and T2 run at same time, the result should be as if either:
●​ T1 fully runs, then T2 (Serial order 1), OR
●​ T2 fully runs, then T1 (Serial order 2)

If executing them simultaneously gives neither result → NOT serializable → WRONG!

Two Types of Serializability


1. Conflict Serializable

Two operations conflict if they:


●​ Belong to different transactions
●​ Access same data item
●​ At least one is a write operation
Example conflicts:
●​ T1 writes X, T2 reads X → CONFLICT
●​ T1 reads X, T2 writes X → CONFLICT
●​ T1 writes X, T2 writes X → CONFLICT
●​ T1 reads X, T2 reads X → NO CONFLICT (both reading)
Conflict Serializable = A concurrent schedule where we can rearrange non-conflicting
operations to make it equal to a serial schedule.

How to check:
1.​ List all conflicting pairs
2.​ Draw a precedence graph (directed graph showing "must come before"
relationships)
3.​ If graph is acyclic → Conflict serializable ✓
4.​ If graph has cycle → NOT conflict serializable ✗
Real Example:
●​ T1: Write X, Read Y
●​ T2: Read X, Write Y
Conflicts: (W1(X), R2(X)) and (R1(Y), W2(Y))

Graph edges: T1 → T2 (due to first conflict), T2 → T1 (due to second conflict)

CYCLE DETECTED → NOT conflict serializable → Schedule is WRONG

2. View Serializable

A weaker condition than conflict serializability.

Simple Explanation:

Two schedules are view equivalent if:


●​ Same initial reads produce same values
●​ Blind writes (overwriting without reading) are in same order
Real Use: More schedules are considered valid, but harder to check.

Important: Every conflict serializable schedule is view serializable, but not vice versa.

TOPIC 4: RECOVERABILITY CONCEPTS


What is Recovery?
When a transaction fails or system crashes, recovery means bringing database back
to a consistent state.

Real Example:
●​ Bank transfers $500 from A to B
●​ Deduct from A: SUCCESS
●​ Add to B: SYSTEM CRASHES
Recovery must:
●​ If B addition was never saved → undo deduction from A too
●​ Database should act as if transaction never started

Three Types of Recoverability


1. Recoverable Schedule

A schedule where if transaction T2 reads value written by T1, then T1 must commit
before T2 commits.

Why? If T2 commits first, and then T1 fails, T2's data becomes inconsistent (it used
T1's values which are now rolled back).

Real Example:
●​ T1: Write X = 100
●​ T2: Read X (gets 100), Write Y = X + 50 (Y = 150)
●​ T1: Commit
●​ T2: Commit ✓ Recoverable
Wrong way:
●​ T1: Write X = 100
●​ T2: Read X, Write Y = 150
●​ T2: Commit
●​ T1: Abort (Rollback) ✗ T2 is now inconsistent (its Y is based on aborted T1's
value)

2. Cascadeless Schedule

A schedule where T2 doesn't read values written by T1 that haven't been committed
yet.
Prevents "cascading rollback" (when T1 aborts, T2 also has to abort, which makes
T3 abort, etc.)

Real Example - Good (Cascadeless):


●​ T1: Write X, Commit
●​ T2: Read X (now X is committed, safe)
Real Example - Bad (Cascading):
●​ T1: Write X (not committed)
●​ T2: Read X (reads uncommitted value)
●​ T1: Abort
●​ T2 must also Abort (cascades)
●​ If T3 read T2's values, T3 must also Abort

3. Strict Schedule

The strictest: T2 cannot read or write X until T1 (which wrote X) commits or aborts.

Most Conservative but Safest

Real Example:
●​ T1: Write X, Commit
●​ T2: Read X (safe)
●​ OR
●​ T1: Write X, Abort
●​ T2: Read X (T1's write is undone, T2 reads fresh value, safe)

Relationship:
Strict ⊂ Cascadeless ⊂ Recoverable

Every strict schedule is cascadeless.​


Every cascadeless schedule is recoverable.​
But not all recoverable are cascadeless, etc.

TOPIC 5: LOG-BASED RECOVERY

What is a Log?
A log is a file that records every database operation before it's executed.
Simple Explanation:

Like a diary of database changes:


●​ Who did what
●​ When
●​ What values changed
●​ Which transaction caused it
Example log entries:
text

<T1, START> (T1 started)


<T1, X, 100, 200> (T1 changed X from 100 to 200)
<T1, COMMIT> (T1 committed)

Real Example:

Online shopping purchase:


●​ Log: "T1 started"
●​ Log: "T1 reduced inventory item #5 from 50 to 49"
●​ Log: "T1 added $99.99 to bill"
●​ Log: "System crashed before T1 committed"
On recovery:
●​ System checks log
●​ Sees T1 didn't commit → rolls back all T1 operations
●​ Inventory goes back to 50, bill cancels

Two Recovery Techniques


1. Deferred Database Modification
●​ During transaction: Don't write to database. Only log the changes.
●​ On commit: Write all changes to database.
●​ On abort/crash: Just delete log; database unchanged.
Advantage: Easy recovery (just ignore log)​
Disadvantage: Need large memory to hold all changes

Real Example:​
Batch processing (e.g., end-of-day billing):
●​ Collect all changes in memory
●​ If all validations pass → write everything
●​ If any validation fails → discard all changes (nothing was written anyway)

2. Immediate Database Modification


●​ During transaction: Write changes to database immediately.
●​ Log before each write: Write log entry first, THEN modify database.
●​ On commit: Mark in log as committed.
●​ On abort/crash: Read log backwards, undo all operations.
Advantage: Faster (changes visible immediately)​
Disadvantage: Undo operation needed on crash

Real Example:​
Banking ATM withdrawal:
●​ Log: "Deduct $100"
●​ Actually deduct $100 from account (immediately)
●​ Log: "Committed"
●​ If crash before "committed" log entry → undo (add $100 back)

TOPIC 6: CHECKPOINTS

What is a Checkpoint?
A checkpoint is a marker/snapshot that says "all transactions up to this point are
safely on disk."

Simple Explanation:

Like saving a game while playing:


●​ Before checkpoint: If crash, lose all progress
●​ After checkpoint: If crash, only lose progress AFTER checkpoint
Real Example:

During a bank's night batch processing:


●​ 10 PM: Checkpoint 1 (all day's transactions saved to disk)
●​ 11 PM: More transactions
●​ 12 AM: Checkpoint 2 (day's end saved)
●​ 1 AM: More overnight transactions
●​ 2 AM: System crashes
On recovery:
●​ Load from Checkpoint 2 (12 AM)
●​ Only redo transactions from 12 AM onwards
●​ Don't redo all day's transactions (already on disk at checkpoint)
Benefit: Recovery is faster; less redo work.

TOPIC 7: DEADLOCK

What is a Deadlock?
A situation where two or more transactions are waiting for each other, so none can
proceed.

Simple Explanation:

Two people in a hallway:


●​ Person A: Holding item 1, wants item 2
●​ Person B: Holding item 2, wants item 1
Both are waiting for the other to release → DEADLOCK → Both stuck forever.

Real Example:
●​ Transaction T1: Locks Account A, wants to lock Account B
●​ Transaction T2: Locks Account B, wants to lock Account A
text

T1: Lock A → tries Lock B → WAIT (B locked by T2)


T2: Lock B → tries Lock A → WAIT (A locked by T1)

Neither can proceed. DEADLOCK!

Four Necessary Conditions for Deadlock


All FOUR must be true for deadlock to occur:
1.​ Mutual Exclusion: Resources can only be held by one transaction at a time.
2.​ Hold and Wait: Transaction holds resource while waiting for another.
3.​ No Preemption: Can't forcefully take resource from another transaction.
4.​ Circular Wait: Chain of transactions waiting in a circle.
To prevent deadlock: Remove any ONE of these conditions.

Deadlock Detection & Recovery


Detection:

Create a Wait-For Graph:


●​ Nodes = transactions
●​ Edge from T1 → T2: T1 waiting for resource held by T2
If cycle exists → DEADLOCK

Recovery (Break deadlock by killing one transaction):


1.​ Choose victim: Which transaction to kill (usually smallest/cheapest to redo)
2.​ Rollback: Undo victim transaction (release its locks)
3.​ Restart: Victim can restart later
Real Example:

Two ATM transfers deadlocked:


●​ System detects: T1 → T2 → T1 (cycle)
●​ System kills T2 (maybe fewer writes than T1)
●​ T2 rolls back, releases Account B
●​ T1 proceeds, locks B, completes
●​ T2 retries

NOW: ALL UNIT IV PYQS WITH ANSWERS

2-MARK QUESTIONS (Section A)

Q 2023-24(g): "What do you mean by testing of


serializability?"
ANSWER:

Testing serializability means checking if a concurrent schedule produces the same


result as a serial schedule.
Method: Draw precedence graph of conflicting operations. If graph is acyclic →
serializable. If cyclic → not serializable.

Q 2024-25(f): "Explain properties of Transaction."


ANSWER:

Atomicity: All or nothing; entire transaction executes or none executes.

Consistency: Database remains in valid state; all constraints satisfied.

Isolation: One transaction's incomplete changes not visible to others.

Durability: Once committed, data persists permanently, even after crashes.

Q 2023-24(i): "Define concurrency control."


ANSWER:

Concurrency control is managing multiple transactions running simultaneously to


ensure they don't interfere with each other and maintain data consistency.

Techniques: Locking (2PL), Timestamp ordering, Optimistic control.

Q 2023-24(j): "Define exclusive lock."


ANSWER:

An exclusive lock (write lock) allows only the transaction holding it to read or write
the data item.

No other transaction can read or write until lock is released.

Used for write operations.

10-MARK QUESTIONS (Section B)


Q 2022-23 (2d): "What is a schedule? Define recoverable,
cascadeless, and strict schedules, and compare them."
ANSWER:

Schedule Definition:

A schedule is the order in which operations of multiple transactions are executed in a


database.

Example:
●​ T1: Read X, Write X
●​ T2: Read X, Write Y
Schedule: R1(X), R2(X), W1(X), W2(Y)

Three Types of Schedules:

1. Recoverable Schedule:

Definition: If T2 reads a value written by T1, then T1 must commit before T2


commits.

Why Important: Ensures no transaction commits based on uncommitted data.

Example (Recoverable):
●​ T1: Write X = 100
●​ T2: Read X (gets 100)
●​ T1: Commit ✓ (before T2 commits)
●​ T2: Commit ✓
Example (Not Recoverable):
●​ T1: Write X = 100
●​ T2: Read X (gets 100)
●​ T2: Commit ✓ (before T1 commits)
●​ T1: Abort ✗ (T2 committed based on aborted data; now inconsistent)

2. Cascadeless Schedule:

Definition: T2 cannot read values written by T1 that haven't been committed yet.

Prevents cascading rollbacks (when one abort causes many aborts).

Example (Cascadeless):
●​ T1: Write X = 100
●​ T1: Commit ✓
●​ T2: Read X ✓ (reads committed value; safe)
Example (Not Cascadeless):
●​ T1: Write X = 100 (not committed)
●​ T2: Read X (reads uncommitted value)
●​ T1: Abort
●​ T2: Must abort too (cascading) ✗

3. Strict Schedule:

Definition: T2 cannot read or write X until T1 (which wrote X) commits or aborts.

Most conservative; safest.

Example (Strict):
●​ T1: Write X = 100
●​ T1: Commit
●​ T2: Read X ✓ (T1 committed, safe to read)

Comparison Table:

Aspect Recoverable Cascadeless Strict

T2 can't read T2 can't read/write


T1 must commit before
Definition uncommitted values until T1 commits or
T2 if T2 reads T1's data
from T1 aborts

Safety Medium High Highest

Cascading
Possible Not possible Not possible
Rollback
Allows dirty reads if Avoids dirty reads Avoids dirty reads and
Example
followed by commit during transaction dirty writes

Overhead Low Medium High

Relationship:​
Strict ⊂ Cascadeless ⊂ Recoverable

Every strict schedule is cascadeless.​


Every cascadeless schedule is recoverable.

Q 2022-23 (2e): "Discuss immediate update recovery


technique in single-user and multiuser environments.
Advantages and disadvantages?"
ANSWER:

Immediate Update Definition:

Changes are written to database immediately during transaction execution (not after
commit). A log is maintained for recovery.

In Single-User Environment:

How it works:
1.​ Before writing to database → Write log entry
2.​ Execute write operation
3.​ On commit → Mark log as committed
4.​ On crash before commit → Read log backward, undo writes
Advantages:
●​ Changes visible immediately
●​ Good for long-running transactions
●​ Locks can be released early
Disadvantages:
●​ Must maintain undo log (extra I/O)
●​ Undo operation needed if crash
●​ More complex recovery

In Multiuser Environment:

How it works:
1.​ T1 writes to database immediately (logged)
2.​ T2 might read T1's uncommitted data (dirty read)
3.​ If T1 aborts, T2's data becomes inconsistent
4.​ On crash → Undo T1's writes, but also undo T2's writes (cascading undo)
Advantages:
●​ Parallelism (multiple users see updates quickly)
●​ Good response time
Disadvantages:
●​ Dirty reads possible (T2 reads T1's uncommitted data)
●​ Cascading rollbacks (if T1 aborts, T2 must abort too)
●​ Complex undo/redo logic for multiple transactions
●​ More undo operations during recovery

To Avoid Cascading Rollbacks in Multiuser:

Use Strict Immediate Update or Cascadeless Immediate Update:


●​ Uncommitted writes not visible to others
●​ Only visible after commit

Q 2023-24 (6b): "What is Log? How is it maintained?


Discuss deferred and immediate database modification."
ANSWER:

Log Definition:

A log is a file/record that stores all database operations (read, write) along with
transaction info before operations are executed on database.

Purpose: Enables recovery by knowing what operations were done and in what order.

Log Entry Format:


text
<Transaction ID, Operation, Data Item, Old Value, New Value>

Example: <T1, WRITE, X, 100, 200>


Means: Transaction T1 wrote to data item X, changed from 100
to 200

How Log is Maintained:


1.​ Append-only: New entries always added at end (never modified)
2.​ Sequential write: Written to disk before actual database write (for safety)
3.​ Persistent: Stored on stable storage (disk), survives crashes
4.​ Timestamp: Each entry has timestamp for ordering
5.​ Transaction markers: <T, START>, <T, COMMIT>, <T, ABORT>
Example Log:
text

<T1, START>
<T1, WRITE, Account_A, 1000, 900>
<T1, WRITE, Account_B, 2000, 2100>
<T1, COMMIT>
<T2, START>
<T2, READ, Account_A>
(CRASH - T2 not committed)

On recovery: Undo T2 (didn't commit), keep T1 (committed).

Deferred Database Modification:

How it works:
●​ During transaction: Only log the changes, don't write to database
●​ On commit: Write all changes to database
●​ On abort/crash: Ignore log, database unchanged
Redo Journal:
text

<T, Write, Item, New_Value>


<T, COMMIT> (After this, apply all writes)
Advantages:
●​ Easy recovery (just ignore log if not committed)
●​ No undo needed
●​ No dirty reads (uncommitted data not in database)
Disadvantages:
●​ Need large memory to buffer all changes
●​ Can't see changes until commit
●​ Slow for large transactions
Real Example:
text

ATM Withdrawal of $100:


- Log: Write(Balance, 900)
- Database: Still shows 1000 (not changed)
- User presses confirm
- Log: COMMIT
- Database: Now writes 900
- If crash before commit: Database still has 1000 (safe)

Immediate Database Modification:

How it works:
●​ During transaction: Write to database immediately
●​ Before each write: Log entry written first, then database write
●​ On commit: Mark in log as committed
●​ On abort/crash: Read log backward, undo writes
Undo Journal:
text

<T, WRITE, Item, Old_Value, New_Value>


<T, COMMIT> (Only after log; changes already in DB)

Advantages:
●​ Changes visible immediately
●​ Good for interactive systems
●​ Faster for short transactions
Disadvantages:
●​ Undo needed on crash (read log backward, restore old values)
●​ Dirty reads possible (uncommitted data readable)
●​ Cascading rollback if not careful
●​ More complex recovery logic
Real Example:
text

ATM Withdrawal of $100:


- Log: Write(Balance, 1000 → 900) (Write before change)
- Database: Changes to 900 (Immediate)
- User sees 900
- User presses confirm
- Log: COMMIT
- System crash after this: On recovery, log shows committed,
so 900 is kept
- If crash before COMMIT: Log shows incomplete, undo to 1000

Comparison:

Aspect Deferred Immediate

When written to DB On commit only During transaction

Visibility After commit only Immediately

Recovery Redo only Undo + Redo


Dirty reads Not possible Possible

Memory usage High (buffer all changes) Low

Response time Slow (wait for commit) Fast (immediate)

Cascading rollback Not possible Possible (if strict not enforced)

10-MARK QUESTIONS (Section C)

Q 2023-24 (6a): "What is conflict serializable schedule?


Explain difference between conflict and view serializable
with example."
ANSWER:

Conflict Serializable Schedule:

A concurrent schedule that can be rearranged into a serial schedule by moving


non-conflicting operations.

Definition of Conflict:

Two operations conflict if:


●​ They belong to different transactions
●​ They access the same data item
●​ At least one is a write
Conflict Pairs:
●​ R(X), W(X) → CONFLICT
●​ W(X), R(X) → CONFLICT
●​ W(X), W(X) → CONFLICT
●​ R(X), R(X) → NO CONFLICT
How to Check Conflict Serializability:

Method: Draw Precedence Graph


●​ Nodes = transactions
●​ Edge T1 → T2 if operation of T1 conflicts with operation of T2 and comes first
Rule: If graph is acyclic → conflict serializable ✓​
If graph has cycle → not conflict serializable ✗

Example 1 - Conflict Serializable:


text

T1: Read X
T2: Write X
T3: Write Y
T1: Write Y

Schedule: R1(X), W2(X), W3(Y), W1(Y)

Conflicts:
●​ R1(X) and W2(X): T1 → T2
●​ W3(Y) and W1(Y): T3 → T1
Precedence Graph: T3 → T1 → T2 (no cycle)

Equivalent Serial Schedule: T3, T1, T2 ✓ Conflict Serializable

Example 2 - Not Conflict Serializable:


text

T1: Write X, Read Y


T2: Read X, Write Y

Schedule: W1(X), R2(X), R1(Y), W2(Y)

Conflicts:
●​ W1(X) and R2(X): T1 → T2
●​ R1(Y) and W2(Y): T2 → T1
Precedence Graph: T1 → T2 → T1 (CYCLE!)

Not Conflict Serializable ✗

View Serializable Schedule:

Definition: A schedule where transactions see the same data values as in some
serial schedule.

Weaker than conflict serializability (allows more schedules).

Two Conditions for View Equivalence:


1.​ Initial Read: If T2 reads initial value of X in schedule S, then T2 must read
initial value in serial schedule too.
2.​ Read from: If T2 reads value written by T1 in schedule S, then T2 must read
same value in serial schedule.
3.​ Final Write: If T1 is final writer of X in schedule S, then T1 must be final writer
in serial schedule.

Example - View Serializable but NOT Conflict Serializable:


text

T1: Read X
T2: Write X
T3: Write X

Schedule: R1(X), W2(X), W3(X)

Conflict Analysis:
●​ R1(X), W2(X): T1 → T2
●​ W2(X), W3(X): T2 → T3
Precedence: T1 → T2 → T3 (acyclic, so conflict serializable)

Difference Table:

Aspect Conflict Serializable View Serializable


Definitio Same data values as serial
Can rearrange non-conflicts to make serial
n schedule

Checkin
Precedence graph (easier) Check view conditions (harder)
g

Strictnes
More strict More lenient
s

Coverag
Subset of view serializability Superset of conflict serializability
e

Practical Usually preferred (checkable) Theoretical

Relationship: Conflict Serializable ⊂ View Serializable

Every conflict serializable is view serializable, but not vice versa.

Q 2022-23 (2d) SCHEDULE ANALYSIS: "Determine if


schedules S1, S2, S3 are strict, cascadeless, recoverable,
or non-recoverable."
ANSWER:

Given:
●​ S1: r1(X), r2(Z), r1(Z), r3(X), r3(Y), w1(X), c1, w3(Y), c3, r2(Y), w2(Z), w2(Y), c2
●​ S2: r1(X), r2(Z), r1(Z), r3(X), r3(Y), w1(X), w3(Y), r2(Y), w2(Z), w2(Y), c1, c2, c3
●​ S3: r1(X), r2(Z), r3(X), r1(Z), r2(Y), r3(Y), w1(X), c1, w2(Z), w3(Y), w2(Y), c3, c2

Analysis:
S1:

Check dependencies:
●​ T2 reads Z (r2(Z)) → T3 writes Z (no)
●​ T1 writes X (w1(X)) → T2 reads X (no)
●​ T3 writes Y (w3(Y)) → T2 reads Y (r2(Y)) ✓ (T3 commits before T2, so
recoverable)
●​ T2 writes Z (w2(Z)) → T3 reads Z (no)
Verdict:
●​ Recoverable: YES (T3 commits before T2 reads Y) ✓
●​ Cascadeless: YES (no dirty reads) ✓
●​ Strict: YES (no reads/writes of uncommitted data) ✓
Answer: STRICT (most restrictive condition)

S2:

Check order:
●​ T1 writes X (w1(X)), T2 reads nothing from T1
●​ T3 writes Y (w3(Y)), T2 reads Y (r2(Y)) ✓ BUT T3 commits at END (c3), not
before T2 reads
●​ T2 reads Y (r2(Y)) BEFORE T3 commits ✗ (dirty read)
Verdict:
●​ Recoverable: YES (commits in order after all reads)
●​ Cascadeless: NO (T2 reads uncommitted data from T3) ✗
●​ Strict: NO ✗
Answer: RECOVERABLE

S3:

Check order:
●​ T2 reads Y (r2(Y)) before anyone writes it → initial read (OK)
●​ T1 writes X (w1(X)), T3 reads X → No, T3 doesn't read X after w1(X)
●​ T2 writes Z (w2(Z)), T3 writes Y (w3(Y)) - no conflict
●​ T2 writes Y (w2(Y)), but T3 already wrote Y → Both write Y
Verdict:
●​ No transaction reads uncommitted data
●​ Recoverable: YES ✓
●​ Cascadeless: YES ✓
●​ Strict: Requires T2 not read uncommitted. Checked, OK. YES ✓
Answer: STRICT

Q 2024-25 (6a): "Illustrate Conflict Serializable Schedule.


Check if S1 is conflict serializable AND view serializable.
S1: R1(X), R2(X), R2(Y), W2(Y), R1(Y), W1(X)"
ANSWER:

Conflict Serializability Check:

Step 1: Identify Conflicts


●​ R1(X), R2(X): NO CONFLICT (both reads)
●​ R2(X), W1(X): CONFLICT ← T2 before T1 → T2 → T1
●​ R2(Y), W1(X): NO CONFLICT (different items)
●​ W2(Y), R1(Y): CONFLICT ← T2 before T1 → T2 → T1
●​ W2(Y), W1(X): NO CONFLICT (different items)
●​ R1(Y), W2(Y): CONFLICT (inverse) ← T2 before T1 → T2 → T1
Step 2: Draw Precedence Graph
text

T2 → T1

Edges:
●​ T2 → T1 (from R2(X), W1(X))
●​ T2 → T1 (from W2(Y), R1(Y))
Acyclic Graph ✓

Equivalent Serial Schedule: T2, T1

Verification: S1 serial equivalent is T2, T1


text

Serial T2, T1:


R2(X) ✓
R2(Y) ✓
W2(Y) ✓
R1(X) ✓ (initial read of X)
R1(Y) ✓ (reads Y=new value from T2)
W1(X) ✓

Matches S1 order ✓

Answer: S1 is CONFLICT SERIALIZABLE ✓

View Serializability Check:

Conditions to check:
1.​ Initial Reads: Who reads before anyone writes?
●​ R1(X): reads initial value ✓
●​ R2(X): reads initial value ✓
●​ R2(Y): reads initial value ✓
2.​ Reads-From:
●​ R1(Y) reads Y written by W2(Y) ✓ (T2 writes, T1 reads)
3.​ Final Write:
●​ W1(X) is final write of X ✓
●​ W2(Y) is final write of Y ✓

Check against serial schedule T2, T1:


●​ T2 writes Y first → T1 reads Y (same in both) ✓
●​ T1 final writes X (same in both) ✓
●​ Initial reads satisfied (same in both) ✓
Answer: S1 is VIEW SERIALIZABLE ✓

Conclusion:

S1: Conflict Serializable ✓ AND View Serializable ✓

(Since conflict serializable is subset of view serializable, if conflict serializable, must


be view serializable)
Q 2024-25 (2d): "Determine different types of failures in
transactions. How to recover using log file? Explain with
example."
ANSWER:

Transaction Failures - Types:

1. Transaction Failure (Logical Error)

Transaction cannot complete normally due to:


●​ Arithmetic error (divide by zero)
●​ Data validation failure (age > 150)
●​ Constraint violation (unique key duplicate)
●​ Deadlock (waiting for resource forever)
Recovery: Rollback transaction using log

2. System Failure (Crash)

System stops abruptly due to:


●​ Power failure
●​ Software bug
●​ Memory fault
Effect: All transactions in memory lost; data on disk intact

Recovery: Redo committed transactions, undo uncommitted

3. Media Failure (Disk Crash)

Hard disk fails; data permanently lost.

Recovery: Restore from backup, then replay logs to get latest state

Recovery Using Log File - Step by Step:

Example Scenario:
text

Initial State: Account A = 1000, Account B = 2000


Log:
<T1, START>
<T1, WRITE, A, 1000, 900> (T1 deducts 100 from A)
<T1, WRITE, B, 2000, 2100> (T1 adds 100 to B)
<T1, COMMIT>
<T2, START>
<T2, WRITE, A, 900, 800> (T2 deducts another 100)
[SYSTEM CRASHES - T2 not committed]

Recovery Algorithm:

Step 1: Identify transactions


●​ Committed: T1 (has COMMIT in log)
●​ Uncommitted: T2 (no COMMIT marker)
Step 2: Undo uncommitted transactions (backward)
text

T2: WRITE, A, 900, 800


→ Reverse it: Set A = 900 (undo the 800)
T2: START → No undo needed for START

After undo:
●​ A = 900 (undone T2's change)
●​ B = 2100 (T1 kept because committed)
Step 3: Redo committed transactions (forward)
text

T1: WRITE, A, 1000, 900 → Set A = 900 ✓


T1: WRITE, B, 2000, 2100 → Set B = 2100 ✓
T1: COMMIT ✓

Final state: A = 900, B = 2100 ✓ (Correct; T1 completed, T2 rolled back)

Recovery with Checkpoint:


If log has checkpoint:
text

Log:
<T1, ...operations...>
<T1, COMMIT>
[CHECKPOINT] ← Snapshot: all committed transactions on disk
<T2, WRITE, ...>
[CRASH]

On Recovery:
●​ Skip all operations before CHECKPOINT (already on disk)
●​ Undo operations AFTER checkpoint
●​ Redo only operations after checkpoint
Benefit: Faster recovery (don't redo T1; already on disk)

Q 2024-25 (2e): "Discuss Concurrency Control. Why


needed? Explain timestamp-based ordering."
ANSWER:

Concurrency Control Definition:

Managing multiple transactions running simultaneously to ensure:


●​ Data consistency
●​ Isolation (no interference between transactions)
●​ No lost updates or dirty reads
Why Needed:

Problem 1: Lost Update


text

Initial: X = 100

T1: Read X (100)


T2: Read X (100)
T1: Write X = 150
T2: Write X = 200

Result: X = 200, T1's update lost!

Problem 2: Dirty Read


text

T1: Write X = 200 (uncommitted)


T2: Read X (gets 200)
T1: Abort (rollback)
T2's data now based on aborted transaction

Problem 3: Inconsistent Read


text

T1 transfers $100 from A to B:


- Read A (1000)
T2: Read A (1000), Read B (2000)
- T1: Write A = 900
- T1: Write B = 2100, Commit
T2: Read B (2100)
T2 sees A=1000, B=2100 (inconsistent; total changed)

Concurrency Control solves these by ensuring Serializability (concurrent = serial


execution result).

Timestamp-Based Ordering:

Concept:

Assign each transaction a timestamp (TS) based on order of arrival. Operations


must follow timestamp order.

Rule:
●​ If T1 arrives before T2 → TS(T1) < TS(T2)
●​ T1's operations must execute before T2's conflicting operations
How it Works:

Two Timestamps per Data Item X:


●​ Read-TS(X): TS of last transaction that read X
●​ Write-TS(X): TS of last transaction that wrote X
When T executes operation on X:

1. Read Operation (Read Constraint)

If TS(T) < Write-TS(X):


●​ T is trying to read value already overwritten by newer transaction
●​ CONFLICT → Rollback T, restart with new TS
Else:
●​ Read is allowed
●​ Update Read-TS(X) = max(Read-TS(X), TS(T))
2. Write Operation (Write Constraint)

If TS(T) < Read-TS(X):


●​ T wants to write, but newer transaction already read current value
●​ Write would make that read invalid
●​ CONFLICT → Rollback T
Else if TS(T) < Write-TS(X):
●​ T wants to write, but newer transaction already wrote (overwrite T's write)
●​ T's write is obsolete; ignore it (don't overwrite)
Else:
●​ Write is allowed
●​ Update Write-TS(X) = TS(T)

Example:
text

Initial: X = 100
TS(X) = 0, Write-TS(X) = 0

T1: TS=1, Write X = 200


→ TS(1) > Write-TS(0)? YES → Write allowed
→ Write-TS(X) = 1
T2: TS=3, Read X
→ TS(3) < Write-TS(1)? NO → Read allowed
→ Reads 200 (T1's value) ✓

T3: TS=2, Write X = 300


→ TS(2) < Write-TS(1)? NO (2 > 1)
→ TS(2) < Read-TS(1)? Check Read-TS(1)
→ Actually, TS(2) between TS(1) and TS(3)
→ TS(2) < Write-TS(X)=1? NO
→ BUT T3 < TS(T2) in real time: younger transaction trying
to write
→ TS(2) > Write-TS(1), so write allowed
→ Write-TS(X) = 2

Advantages:
●​ Simple, no deadlock (timestamp order prevents cycles)
●​ Good for read-heavy workloads
Disadvantages:
●​ Cascading aborts (if T rolls back, dependent T's roll back too)
●​ High restart overhead
●​ Not optimal for write-heavy loads

Thomas Write Rule (Optimization):

If T wants to write X but TS(T) < Write-TS(X):


●​ Instead of aborting, ignore the write (it's obsolete anyway)
●​ Newer transaction will overwrite it
●​ Avoid unnecessary rollback

Common questions

Powered by AI

Conflict serializable schedules can be rearranged into a serial order by moving non-conflicting operations, checking is done using a precedence graph. If the graph is acyclic, the schedule is conflict serializable . View serializable schedules are broader; they ensure transactions see the same data as in some serial schedule without necessarily rearranging operations . Check for view serializability includes verifying initial reads, reads-from, and final writes. Although every conflict serializable schedule is view serializable, the converse is not true .

A log entry typically contains the transaction ID, the operation performed, data items involved, old and new values, and transaction markers such as start, commit, and abort flags . These components allow the system to trace transaction actions, facilitating undo operations by reverting to old values or completing uncommitted changes during recovery . Timestamps help maintain operation order, vital for reapplying or undoing changes accurately .

A log file is a sequential record that captures each operation, detailing transaction IDs, operations, data item changes, old and new values. For example, "<T1, WRITE, X, 100, 200>" indicates Transaction T1 changed X from 100 to 200 . During recovery, the log helps reconstruct the database state by applying committed transactions and undoing uncommitted ones, as seen in cases of uncommitted transactions being recognized and rolled back during system crashes .

In deferred database modification, changes are only logged and applied to the database upon commit, hence recovery involves simply ignoring uncommitted changes as they aren't in the database yet . Conversely, immediate modification writes changes directly to the database during the transaction, requiring an undo operation in the event of a crash to revert any changes from non-committed transactions . Deferred modifications are safer in terms of cleanliness but more memory-intensive, while immediate provides quicker visibility at the expense of recovery complexity .

A strict schedule disallows any transaction from reading or writing a data item that a transaction has written until the transaction is committed or aborted, offering the safest scenario . A cascadeless schedule prevents cascading rollbacks by ensuring that transactions read only those data items which have been committed, avoiding cascading aborts . A recoverable schedule requires that if a transaction reads from another transaction, the latter must commit before the former, ensuring consistent data upon transaction aborts .

A precedence graph is used to map out the conflicts between transactions in a schedule, with nodes representing transactions and directed edges representing dependency conflicts that dictate the order transactions must occur . If the graph remains acyclic, the schedule is conflict serializable, meaning it can be rearranged to adhere to a serial sequence without affecting outcomes . If a cycle is present, it signifies irreconcilable conflicts, prohibiting serializability .

Transactions must be checked for recoverability to ensure that any transaction reading the results of another does so only if the predecessor has committed. This prevents a scenario where a committed transaction depends on an uncommitted one's changes, which could be lost if the latter fails, leading to inconsistent system states . During a crash, maintaining recoverability ensures that rolled-back transactions do not leave dependent transactions with corrupt or outdated data .

Strict schedules are a subset of cascadeless schedules, which are in turn a subset of recoverable schedules. In a strict schedule, transactions cannot see uncommitted data, ensuring no dirty reads or writes. Cascadeless allows reading committed data only, avoiding cascading aborts . Recoverable schedules ensure that no operation reads another transaction's writes before it commits, ensuring data consistency if rollbacks occur .

The immediate update allows for changes to be visible immediately, which benefits long-running transactions by freeing locks early and improving response times . However, in a multiuser environment, it presents challenges such as possible dirty reads where uncommitted data can be accessed, and the requirement for cascading rollbacks if a transaction aborts, complicating recovery logic .

A cascaded rollback occurs when a transaction aborts, causing subsequent transactions that have read its uncommitted changes to also abort, potentially affecting multiple subsequent transactions . A cascadeless schedule, on the other hand, prevents such dependencies, ensuring transactions read only committed data and thus avoiding the chain reaction of rollbacks seen in cascaded rollbacks .

You might also like