Lecture 31: Transactions/1 :
Serializability
Introduction: The Invisible Guardian of Your Data
Ever wonder why, when you transfer money online, the funds don't just vanish into
thin air if your internet connection drops midway? Or how millions of users can shop
on Amazon simultaneously without buying the same last item in stock? The answer
lies in transactions. A database doesn't just see a single click; it sees a sequence
of delicate operations that must either succeed completely or fail as if they never
happened. This lecture pulls back the curtain on this fundamental concept, exploring
how database systems manage multiple, simultaneous operations (concurrency)
without corrupting the data. We'll see why some concurrent operations work
perfectly, while others can lead to catastrophic inconsistencies.
Core Concepts
Term Definition Significance
The fundamental building
A single logical unit of work,
block for all database
composed of one or more
interactions. To the system,
Transaction database operations (reads,
transferring money is one
writes), that accesses and
transaction, not separate debit
possibly updates data items.
and credit operations.
These are the non-negotiable
A set of four properties that
rules for any robust
guarantee the reliability of
ACID transaction management
transaction processing:
Properties system. They ensure data
Atomicity, Consistency,
integrity even in the face of
Isolation, and Durability.
errors and concurrent access.
The "all or nothing" property. Prevents database corruption
A transaction must either from partial updates, like
Atomicity execute in its entirety or not money being debited from one
Term Definition Significance
at all. There is no partial account but never credited to
execution. another (07:49).
Ensures that data integrity
rules (e.g., A + B balance
A transaction must bring the
remains constant during a
Consistency database from one valid,
simple transfer) are not
consistent state to another.
violated by the end of the
transaction (09:12).
Concurrent transactions
Prevents one transaction from
should not interfere with each
reading the temporary,
other. The result of concurrent
Isolation inconsistent data of another,
execution should be
uncommitted transaction
equivalent to some serial
(11:25).
(one-by-one) execution.
Once a transaction has been
successfully committed, its
Guarantees that completed
changes are permanent and
Durability work is not lost. The "save" is
will survive any subsequent
final (12:31).
system failures (e.g., crashes,
power outages).
A sequence of instructions The "game plan" for the CPU.
that specifies the The way a schedule
Schedule chronological order in which interleaves instructions
instructions from concurrent determines whether the
transactions are executed. outcome is correct or corrupt.
This state diagram (14:30)
The lifecycle of a transaction,
models how a system handles
moving between states like
Transaction both successful completion
Active , Partially
States and failure. A failed
Committed , Committed ,
transaction is rolled back to
Failed , and Aborted .
maintain atomicity.
Detailed Analysis: The Perils and Promise of Concurrency
Let's analyze the lecture's core example to see how different execution orders
(schedules) produce wildly different results.
• Initial State: Account A = $100, Account B = $200. (Total = $300)
• Transaction T1: Transfer $50 from A to B.
• Transaction T2: Transfer 10% of A's balance to B.
1. Serial Schedules: The Safe (but Slow) Baseline
A serial schedule executes transactions one after another. The outcome is always
consistent, though the final values depend on the order.
• Schedule 1 (T1 → T2):
◦ T1 executes: A becomes $50, B becomes $250. (Total = $300).
◦ T2 executes: It reads A=$50. Transfers 10% ($5).
◦ Final State: A = $45, B = $255. (Total = $300). Correct.
• Schedule 2 (T2 → T1):
◦ T2 executes: It reads A=$100. Transfers 10% ($10). A becomes
$90, B becomes $210. (Total = $300).
◦ T1 executes: Transfers $50.
◦ Final State: A = $40, B = $260. (Total = $300). Also Correct.
Both results are valid because they represent a logical, sequential ordering of
events.
2. Concurrent Schedules: The Balance of Speed and Risk
Here, instructions are interleaved to improve throughput. This is where things get
interesting.
• Schedule 3 (A "Good" Interleaving): This schedule interleaves
operations but ultimately produces a correct result.
◦ T1: read(A) (A=100)
◦ T1: A = A - 50 (local A=50)
◦ T1: write(A) (DB A=50, B=200. Inconsistent state!)
◦ T2: read(A) (A=50)
◦ T2: temp = 0.1 * A (temp=5)
◦ T2: A = A - temp (local A=45)
◦ T2: write(A) (DB A=45, B=200. Still inconsistent!)
◦ T1: read(B) (B=200)
◦ T1: B = B + 50 (local B=250)
◦ T1: write(B) (DB A=45, B=250. Still inconsistent!)
◦ T1: commit
◦ T2: read(B) (B=250)
◦ T2: B = B + temp (local B=255)
◦ T2: write(B) (DB A=45, B=255)
◦ T2: commit
◦ Final State: A = $45, B = $255. (Total = $300). This result is
identical to Serial Schedule 1. This is a serializable schedule
and is considered correct.
• Schedule 4 (A "Bad" Interleaving): A slight change in the interleaving
leads to disaster. This demonstrates the Lost Update problem.
◦ T1: read(A) (local T1.A=100)
◦ T1: A = A - 50 (local T1.A=50)
◦ T2: read(A) (local T2.A=100, T1's update is not yet written)
◦ T2: temp = 0.1 * A (temp=10)
◦ T2: A = A - temp (local T2.A=90)
◦ T2: write(A) (DB A=90)
◦ T2: commit (T2 finishes based on the old value of A)
◦ T1: write(A) (DB A=50, overwriting T2's work!)
◦ T1: ... (T1 continues and adds $50 to B)
◦ Final State (as traced in lecture): A=$90, B=$260 (Total =
$350). INCORRECT! The database is permanently corrupted; $50
was effectively created out of thin air because T1's write(A)
overwrote T2's update, but T2's update to B was based on a stale
value.
Key Takeaways Checklist
• [x] Transactions are Atomic: They are the indivisible units of work.
• [x] ACID is the Law: Atomicity, Consistency, Isolation, and Durability are
the four commandments of reliable database operations.
• [x] Concurrency is Necessary: Executing transactions purely in sequence
is too slow for modern applications. We need to interleave them to improve
system throughput and user response time.
• [x] Uncontrolled Concurrency is Dangerous: As seen in Schedule 4,
interleaving instructions without proper control can violate isolation and lead
to an inconsistent database.
• [x] The Goal is Serializability: A concurrent schedule is only "correct" if it
produces a result equivalent to some serial execution of the same
transactions. The central challenge is to allow interleaving while
guaranteeing this property.
Conclusion: The Foundation of Trust
Understanding transactions and serializability is crucial because they form the
bedrock of trust in any data-driven system. Whether it's a bank, an e-commerce
platform, or an airline booking system, the ability to handle thousands of
simultaneous requests without corrupting data is what makes them functional. This
lecture has identified the core problem: how to get the performance benefits of
concurrency without sacrificing the correctness guaranteed by the ACID properties.
The following modules will explore the mechanisms and algorithms that database
systems use to enforce this "correct" concurrency.