0% found this document useful (0 votes)

2 views17 pages

Distributed Computing Series 2

The document discusses performance metrics for distributed mutual exclusion algorithms, comparing token-based and non-token-based approaches, and highlights the challenges in designing distributed shared memory (DSM) systems. It also contrasts coordinated and uncoordinated checkpointing methods, and examines quorum-based algorithms, particularly Maekawa's and Lamport's algorithms for mutual exclusion. Key points include message complexity, synchronization delays, and the mechanisms for ensuring mutual exclusion and conflict resolution.

Uploaded by

lulu.luloos4530

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views17 pages

Distributed Computing Series 2

Uploaded by

lulu.luloos4530

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Distributed_Computing_series_2

Performance Metrics of Mutual Exclusion Algorithms

The performance of distributed mutual exclusion algorithms is generally measured using the following
four key metrics:

1. Message Complexity

Definition: The total number of messages required per Critical Section (CS) execution by a site.
Tip to memorize: How much network traffic does it create to enter the CS?

2. Synchronization Delay (SD)

Definition: The time interval required after one site leaves the CS and before the next site enters
the CS.
Tip to memorize: The gap or transition time between two consecutive CS executions.

3. Response Time

Definition: The total time interval a request waits for its CS execution to be over, measured from
the exact moment its request messages have been sent out.
Tip to memorize: Waiting Time + CS Execution Time.

4. System Throughput

Definition: The rate at which the system executes requests for the Critical Section.
Calculation: It depends on the Synchronization Delay (SD) and the average Critical Section
execution time (E).
Equation: System throughput = 1/(SD + E)

Comparison: Token-Based vs. Non-Token-Based Approach

Basis of Token-Based Approach Non-Token-Based Approach

Comparison
Basic Principle A unique token is circulated and Two or more successive rounds of
shared among the sites in the system. messages are exchanged among the
sites.
Condition to A site can enter the Critical Section A site enters the CS when an
Enter CS (CS) only if it possesses the assertion (defined on its local
unique token. variables) becomes true.
Basis of Token-Based Approach Non-Token-Based Approach
Comparison
Ensuring Mutual Guaranteed naturally because the Guaranteed by resolving conflicts
Exclusion token is strictly unique (only one through message exchange and
exists). priority mechanisms.
Conflict Uses sequence numbers to track Uses timestamps (Lamport-style
Resolution / requests and determine priority. logical clocks) to decide the priority of
Priority conflicting requests.
Holding the A site can hold the token and A site must request permission from
Right to CS repeatedly enter the CS until it sends other sites every time it wants to enter
the token to another site. the CS.
Special States Can have an "idle token state" (a Sites are simply in "requesting",
site holding the token is executing "executing", or "idle" states. There is
outside the CS). no idle token state.
Algorithm Suzuki–Kasami’s Broadcast Lamport’s Algorithm, Ricart–Agrawala
Examples Algorithm. Algorithm.

Main Issues in Implementing DSM Software

When designing a DSM system, the primary challenge is hiding the underlying message-passing
mechanism while maintaining efficiency. The main design issues are:

1. Semantics of Concurrent Access

Determining what semantics (rules) to allow when multiple processors access shared objects at the
same time.
These semantics must be clearly specified so programmers can write their application logic
appropriately.

2. Implementing Semantics (Replication Strategy)

Deciding the best way to implement concurrent access, usually through replication.
Requires deciding the degree of replication: Partial replication (at some sites) vs. Full replication
(at all sites).
Requires deciding the type of replication: Read-replication, write-replication, or both.

3. Placement of Replicas

Selecting the optimal locations for replication (if full replication is not used) to maximize system
efficiency and performance.

4. Locating Remote Data

Determining how to find the exact location of remote data that the application needs to access
(especially when full replication is not implemented).

5. Reducing Overhead
Minimizing communication delays and the number of hidden ("under the covers") messages
required to implement the shared data semantics.

Coordinated vs. Uncoordinated Checkpointing

This comparison focuses on how distributed systems save process states (checkpoints) to recover
from failures.

Basis of Coordinated Checkpointing Uncoordinated Checkpointing

Comparison
Basic Principle All processes synchronize to take Each process takes its checkpoints
their checkpoints at the same time. independently, without coordinating with
others.
Domino Effect Avoids the domino effect. This is Prone to the domino effect (cascaded
its primary advantage. rollbacks). This is its main disadvantage.
State Creates a globally consistent Only creates locally consistent states. A
Consistency state across the system with each globally consistent state (recovery line)
set of checkpoints. must be calculated after a failure.
Overhead High performance overhead. It Very low performance overhead.
during Normal requires extra messages for Processes do not need to communicate
Operation synchronization, and processes or wait for others to take checkpoints.
may need to block and wait.
Rollback & Simple and fast. The system Complex and potentially slow. It
Recovery simply rolls back to the last saved involves searching through stored
Process globally consistent state. checkpoints to find a consistent
recovery line, which may be far in the
past.
Storage Lower. Since the domino effect is Higher. Multiple checkpoints for each
Overhead avoided, older checkpoints can be process must be stored to allow for
discarded more aggressively. finding a recovery line.
Also Known As System-wide Checkpointing Independent Checkpointing

How Quorum-Based Algorithms Differ from Other Mutual

Exclusion Algorithms
Quorum-based mutual exclusion algorithms represent a significant departure from both non-token-
based (e.g., Lamport, Ricart-Agrawala) and token-based approaches. The core difference is that a site
does not need permission from all other sites to enter the critical section.

Here are the key points of differentiation:

1. The Permission Set: Subset vs. All

Quorum-Based: A site requests permission to enter the CS from only a pre-defined subset of
sites, called a quorum.
Other Approaches (Non-Token): A site must broadcast its request and receive permission from all
other sites in the system.

2. Message Complexity

Quorum-Based: This approach significantly reduces the message complexity. Instead of

sending messages to (N-1) sites, a site only communicates with the sites in its quorum. For
example, Maekawa's algorithm achieves a message complexity of approximately 3√N, which is
much better than the O(N) complexity of Lamport's or Ricart-Agrawala's algorithms.
Other Approaches: Message complexity is high as it is directly proportional to the number of sites
(N) in the system.

3. Conflict Resolution Mechanism

Quorum-Based: Conflict resolution is handled by the sites that are common to the intersecting
quorums of the competing sites. The "Intersection Property" of quorums guarantees that any two
quorums in the system have at least one site in common. This common site acts as the arbitrator,
ensuring only one request is granted at a time.
Other Approaches: In non-token-based systems, all sites participate in conflict resolution by
comparing timestamps of incoming requests.

4. Reply and Release Protocol

Quorum-Based: A site in a quorum can send only one REPLY message at any time. It cannot
grant permission to another site until it receives a RELEASE message from the site that it last gave
permission to. This effectively "locks" a site's vote.
Other Approaches: A site can often send multiple REPLY messages immediately if it is not
currently requesting or executing the CS itself.

Comparison Table

Feature Quorum-Based Approach Other Approaches (Non-Token & Token-

Based)
Permission A subset of sites (the quorum). All other sites (in non-token) or no one (if
Group you have the token).
Message Low (e.g., O(√N)). Its primary High (e.g., O(N)).
Complexity advantage.
Conflict Handled locally by the intersecting Handled globally by all sites comparing
Resolution sites of competing quorums. timestamps (non-token).
Resource More efficient and less network High network load due to broadcast
Usage load. messages.
Fault More resilient. The system can Less resilient (non-token). The failure of
Tolerance function even if some sites outside a single site can prevent any other site
a quorum fail. from entering the CS.
Maekawa's Algorithm for Mutual Exclusion
Maekawa's algorithm was the first and is a classic example of a quorum-based mutual exclusion
algorithm. Its primary goal is to reduce the message complexity required to achieve mutual exclusion,
making it more efficient than algorithms like Ricart-Agrawala which require communication with all other
sites.

1. Core Concepts

The algorithm is built upon the idea of Request Sets (Quorums).

Request Set (Ri): For each site Si , there is a specific subset of sites Ri (including Si itself) that
it must request permission from. This Ri is the quorum for site Si .
Intersection Property: The request sets are constructed in such a way that for any two different
sites, Si and Sj , their request sets must have at least one common site.

Ri ∩ Rj ≠ ∅ (This is the non-empty intersection property).

Arbitrator Site: This common site acts as an arbitrator or a "gatekeeper" to resolve concurrent
requests from Si and Sj , ensuring only one of them gets permission at a time.
Optimal Quorum Size: To minimize the size of the quorum, Maekawa showed that the size K for
a system of N sites is approximately √N . This leads to a message complexity of O(√N) instead of
O(N) .

2. The Algorithm in Detail

Example of Maekawa's Algorithm

Let's consider a system with N=7 sites {S1, S2, S3, S4, S5, S6, S7}. We can form quorums of size
K=3 that satisfy the intersection property.

Setup: Request Sets (Quorums)

R1 = {S1, S2, S4}

R2 = {S2, S3, S5}
R3 = {S3, S4, S6}
R4 = {S4, S5, S7}
R5 = {S1, S5, S6}
R6 = {S2, S6, S7}
R7 = {S1, S3, S7}

(Notice that any two sets have exactly one site in common. For example, R1 ∩ R2 = {S2})

Scenario:

1. Site S1 wants to enter the CS.

S1 sends REQUEST(1) to its quorum members: S1, S2, and S4.
Assuming S1, S2, and S4 are all Unlocked, they each send a REPLY back to S1.
S1, S2, and S4 now set their internal state to Locked/Voted.
S1 receives 3 REPLY messages (from S1, S2, S4). It now has permission and enters the CS.
2. While S1 is in the CS, Site R3 wants to enter.
S3 sends REQUEST(3) to its quorum members: S3, S4, and S6.
S3 sends a REPLY to itself and becomes Locked.
S6 is currently Unlocked, so it sends a REPLY to S3 and becomes Locked.
S4 (the intersection of R1 and R3) receives the REQUEST(3) . However, S4 is already Locked
because it gave its vote to S1.
Therefore, S4 does not reply to S3. It adds REQUEST(3) to its queue.
Result: S3 has received only two REPLY messages (from S3 and S6). It is still waiting for the
reply from S4, so S3 is blocked and cannot enter the CS. Mutual exclusion is achieved.
3. Site S1 exits the CS.
S1 sends RELEASE(1) to its quorum members: S1, S2, and S4.
S1, S2, and S4 receive the RELEASE and set their state to Unlocked/Free.
At Site S4: As soon as it becomes Unlocked, it checks its queue. It finds the pending
REQUEST(3) from S3.
S4 immediately sends a REPLY to S3 and sets its state back to Locked/Voted (now for S3).
4. S3 Enters the CS.
S3 now receives the final required REPLY from S4.
Having received permissions from its entire quorum {S3, S4, S6}, S3 enters the CS.

3. Analysis and Conclusion

Advantage: The primary advantage is low message complexity. In this example, entering the CS
required 2*(K-1) messages (REQUEST/REPLY) and (K-1) messages (RELEASE), for a total of
3*(K-1) = 3*2 = 6 messages. A non-quorum algorithm would have required 2*(N-1) = 2*6 =
12 messages.
Disadvantage: The basic algorithm is prone to deadlock. A circular wait can form where sites are
waiting for REPLY messages from other sites that are themselves waiting. This requires more
complex solutions, such as using timestamps to prioritize requests.

Lamport's Algorithm for Mutual Exclusion

Lamport's algorithm is a foundational non-token-based, permission-based algorithm for distributed
mutual exclusion. Its core idea is to use timestamps (from Lamport's logical clocks) to create a total
ordering of all requests for the Critical Section (CS). This ensures that requests are granted on a fair,
first-come, first-served basis.

1. Core Concepts & Requirements

Timestamps: Each request is assigned a unique timestamp (ts, i) , where ts is the local
logical clock time and i is the site ID. This tuple ensures a total ordering:
If timestamps are different, the lower timestamp has higher priority.
If timestamps are the same (a rare event), the lower site ID ( i ) gets higher priority.
Request Queue: Every site Si maintains a local request_queuei . This queue stores all CS
requests that the site is aware of (including its own) and is always kept sorted in ascending order
of timestamps.
Communication Channels: The algorithm requires that communication channels between any two
sites are FIFO (First-In, First-Out). This ensures that messages from site A to site B are
processed in the order they were sent.
Message Types:
1. REQUEST : To ask for permission to enter the CS.
2. REPLY : To grant permission.
3. RELEASE : To announce that the CS has been exited.

[Link] Algorithm in Detail

Example of Lamport's Algorithm

Let's consider a system with three sites: S1, S2, and S3. All request queues are initially empty.

Scenario:

1. S1 requests the CS:

S1's clock is at 0. It creates a timestamp (1, 1).
S1 adds (1, 1) to its queue: Q1 = [(1, 1)] .
S1 broadcasts REQUEST(1, 1) to S2 and S3.
2. S3 requests the CS (slightly later):
S3's clock is at 0. It creates a timestamp (1, 3).
S3 adds (1, 3) to its queue: Q3 = [(1, 3)] .
S3 broadcasts REQUEST(1, 3) to S1 and S2.
3. Message Reception and Queue Updates:
S2 receives REQUEST(1, 1) from S1. It adds it to its queue: Q2 = [(1, 1)] . S2 sends a
REPLY to S1.
S3 receives REQUEST(1, 1) from S1. It adds it to its queue and sorts it: Q3 = [(1, 1), (1,
3)] . S3 sends a REPLY to S1.
S1 receives REQUEST(1, 3) from S3. It adds it to its queue and sorts it: Q1 = [(1, 1), (1,
3)] . S1 sends a REPLY to S3.
S2 receives REQUEST(1, 3) from S3. It adds it to its queue and sorts it: Q2 = [(1, 1), (1,
3)] . S2 sends a REPLY to S3.
4. Checking Conditions to Enter CS:
Check for S1:
L1: Has S1 heard from everyone with a later timestamp? Yes, it received REQUEST(1, 3)
from S3 and a REPLY from S2. It is now aware of all other sites' intentions. (Condition
Met).
L2: Is S1's request (1, 1) at the top of its queue Q1 = [(1, 1), (1, 3)] ? Yes.
(Condition Met).
Conclusion: S1 enters the Critical Section.
Check for S3:
L1: Has S3 heard from everyone? Yes, it received replies/requests from S1 and S2.
(Condition Met).
L2: Is S3's request (1, 3) at the top of its queue Q3 = [(1, 1), (1, 3)] ? No, (1, 1)
has a higher priority. (Condition Fails).
Conclusion: S3 must wait.
5. S1 Releases the CS:
S1 exits the CS.
S1 removes (1, 1) from its queue: Q1 = [(1, 3)] .
S1 broadcasts a RELEASE message to S2 and S3.
6. S2 and S3 Process the Release:
S2 receives the RELEASE and removes (1, 1) from its queue: Q2 = [(1, 3)] .
S3 receives the RELEASE and removes (1, 1) from its queue: Q3 = [(1, 3)] .
7. S3 Re-checks Conditions:
L1: Still met.
L2: Is S3's request (1, 3) now at the top of its queue Q3 = [(1, 3)] ? Yes. (Condition Met).
Conclusion: S3 can now enter the Critical Section.

Analysis
Message Complexity: To enter the CS once, a site sends (N-1) REQUEST s, receives (N-1)
REPLY s, and sends (N-1) RELEASE s. The total is 3(N-1) messages, which is high.
Guarantees: The algorithm guarantees mutual exclusion, is free from deadlock, and ensures
fairness (no starvation).

Here are the exam-focused notes for Deadlock Handling Strategies in a Distributed Environment:

Introduction: Why is it complicated?

In distributed systems, handling deadlocks is highly complicated because:

1. No global state: No single site has accurate knowledge of the current, up-to-date state of the
entire system.
2. Communication delays: Every inter-site communication involves finite but unpredictable delays.
The Three Deadlock Handling Strategies
There are three primary strategies to handle deadlocks. In distributed systems, the first two are
generally impractical, making the third (Detection) the preferred choice.

1. Deadlock Prevention

How it works: Restrains how processes can request resources to ensure a deadlock can never
happen. This is commonly achieved by:
Forcing a process to acquire all needed resources simultaneously before it begins
execution.
Pre-empting a process that holds a needed resource.
Verdict: Highly inefficient and impractical in distributed systems because it severely limits
concurrency and resource utilization.

2. Deadlock Avoidance

How it works: A resource is granted to a process only if the resulting global system state is proven
to be "safe" (i.e., a state where a deadlock will definitely not occur).
Verdict: Impractical in distributed systems. Determining a "safe" global state requires global
knowledge of all resources and requests, which is impossible to maintain accurately due to
message delays and the lack of a global clock.

3. Deadlock Detection (The Best Approach)

How it works: Allows the system to run without strict restrictions. It periodically examines the
status of process-resource interactions to check for the presence of a cyclic wait (a deadlock
condition).
Verdict: This is the best and most practical approach for handling deadlocks in distributed
systems.

Deep Dive: Deadlock Detection & Resolution

Handling deadlocks using the detection approach entails addressing two basic issues:

A. Detection of existing deadlocks

Detection involves maintaining a Wait-For Graph (WFG) and searching it for cycles.

Wait-For Graph (WFG): A directed graph where nodes represent processes. A directed edge from
P 1 to P 2 exists if process P 1 is blocked and waiting for P 2 to release some resource.

Condition: A system is deadlocked if and only if there exists a directed cycle or knot in the WFG.
Correctness Criteria for Detection Algorithms:
A valid deadlock detection algorithm must satisfy two conditions:

1. Progress (No undetected deadlocks): The algorithm must detect all existing deadlocks in a finite
time. It should not wait for more events to occur once all deadlock dependencies have formed.
2. Safety (No false deadlocks): The algorithm should not report deadlocks that do not actually exist.
(False deadlocks are also called phantom deadlocks). Phantom deadlocks often happen in
distributed systems because sites might obtain out-of-date or inconsistent WFG information.

B. Resolution of detected deadlocks

Once a deadlock is detected, it must be resolved.

How it works: Resolution involves breaking the existing wait-for dependencies between
processes.
Action taken: It involves aborting or rolling back one or more deadlocked processes. The
resources held by the rolled-back process are then assigned to blocked processes so they can
resume execution.

Consistent vs. Inconsistent Global State

In distributed systems, a "global state" is a collection of the individual local states of all participating
processes along with the states of the communication channels. During failure recovery and
checkpointing, ensuring the system rolls back to a consistent state is crucial.

Feature Consistent Global State Inconsistent Global State

Definition A state that could legally occur during a A state that could never occur in a
normal, failure-free execution of the correct, failure-free execution. It is an
distributed computation. impossible physical state.
Feature Consistent Global State Inconsistent Global State
Causality Rule If a process's state reflects the receipt A process's state reflects the receipt
(Send vs. of a message, the sender's state must of a message, but the sender's state
Receive) reflect the sending of that message. does not reflect having sent it.
Orphan Contains no orphan messages Contains orphan messages.
Messages (messages received but not sent).
In-Transit May contain in-transit messages May also contain in-transit
Messages (messages sent but not yet received). messages, but the defining error is
This is perfectly valid and normal. the orphan messages.
System A valid state to which the system can The system cannot resume from this
Recovery safely roll back and resume execution. state because it violates the basic
laws of cause and effect.

Detailed Example
To understand this, let's look at the classic examples of processes exchanging messages and taking
checkpoints.

1. Example of a Consistent State (Figure a in PDF):

Scenario: Process P sends a message m to P . The system state is captured after P sends m ,
1 1 2 1 1

but before P receives it.

Why is it consistent? This represents a normal "in-transit" message. It is completely logical for a
message to be on the network (sent) but not yet processed by the receiver. Every message that
has been received has a corresponding "send" event recorded. The system can safely resume from
here.

2. Example of an Inconsistent State (Figure b in PDF):

Scenario: Process P sends message m to P . P receives it and takes a checkpoint. Later, P

1 2 2 2 1

fails and rolls back to a checkpoint taken before it ever sent m . 2

Why is it inconsistent? If we look at the combined global state now, process P 's state says, "I
2

received message m ", but process P 's state says, "I never sent message m ".
2 1 2

The Problem: This violates causality. You cannot receive a message that was never sent. This is
called an orphan message. Such a state is impossible in a failure-free correct computation, and
the system must take further actions (like rolling back P as well) to fix this inconsistency.
2
What is Log-Based Rollback Recovery?
Log-based rollback recovery (also known as message logging) is a fault-tolerance technique in
distributed systems. It combines periodic checkpointing with the logging of nondeterministic
events (such as the receipt of a message).

If a process fails, the system restores the most recent checkpoint and then "replays" the logged
messages in their original order to perfectly reconstruct the process's state right up to the moment of
failure.

The Core Principle: Piecewise Deterministic (PWD) Assumption

This entire method relies heavily on the Piecewise Deterministic (PWD) assumption.

What it means: It assumes that a process's execution is just a sequence of deterministic state
intervals.
The Trigger: Each interval is triggered by a nondeterministic event (like receiving an external
message or an interrupt).
The Guarantee: If a system captures the initial state (via a checkpoint) and records all the
nondeterministic events (in a log), it can deterministically recreate any pre-failure state by simply
feeding those exact same events back into the process in the exact same order.

How the Mechanism Works

1. Checkpointing: Processes periodically save their state to stable storage.

2. Logging (Determinants): During normal execution, whenever a nondeterministic event occurs
(like receiving a message), all the information needed to replay that event (called a determinant) is
recorded in a log file.
3. Failure Recovery:
The failed process is rolled back to its most recent local checkpoint.
The system reads the log file and replays the saved messages/events.
The process executes deterministically, returning exactly to the state it was in before it crashed.

Three Types of Log-Based Recovery

Depending on when and how the logs are written to stable storage, this approach is categorized into
three types:

1. Pessimistic Logging: Every event is immediately saved to stable storage before it is allowed to
affect the system state. It guarantees no orphan processes and makes recovery very fast, but adds
high latency during normal execution.
2. Optimistic Logging: Events are processed immediately and logged to volatile memory first, then
written to stable storage asynchronously later. It offers great performance during normal operation
but makes the recovery process much more complex.
3. Causal Logging: Strikes a balance between the two. It logs determinants by tracking causal
dependencies piggybacked on messages, reducing the performance hit of pessimistic logging
without the extreme recovery complexity of optimistic logging.

Advantages & Disadvantages

Advantages:

Avoids the Domino Effect: Failed processes can be brought completely up to date individually.
Other processes usually do not need to roll back just because one process failed.
More Recent State: Replaying logs allows the process to recover to a state much closer to the
crash than checkpointing alone.

Disadvantages:

Performance Overhead: Constantly saving message logs to stable storage consumes disk space
and network bandwidth.
Duplicate Messages: As mentioned in the PDF issues, when a process replays its log to handle
lost messages, it might accidentally resend messages that other processes have already received,
resulting in duplicate messages.

Lamport’s Bakery Algorithm

Lamport’s Bakery Algorithm is a classic solution for the n-process mutual exclusion problem in
shared memory systems. It is called the "Bakery Algorithm" because it mimics the way customers are
served in a bakery: you take a ticket number, and the person with the lowest number is served first.

1. Core Concept & Analogy

Taking a Number: A process wanting to enter the Critical Section (CS) picks a "ticket number"
(timestamp).
The Rule: The process with the lowest ticket number enters the CS first.
Tie-Breaking (Lexicographical Order): If two processes pick the same ticket number at the same
time, the process with the lower Process ID (PID) is given priority.
This is represented as the tuple: [ticket_number, PID].

2. Data Structures Used

The algorithm uses two shared arrays:

1. choosing[1...n] (Boolean): choosing[i] is true if process P is currently in the process of

picking its ticket number.

2. timestamp[1...n] (Integer): timestamp[i] stores the ticket number of process P . A value of 0
i

means the process is not requesting the CS.

3. The Algorithm Steps

Phase 1: Entry Section (Taking a Ticket)

1. Set choosing[i] = true .

2. Pick a number: timestamp[i] = max(timestamp[1...n]) + 1 .
3. Set choosing[i] = false .

Phase 2: The Waiting Loop (Checking Others)

For every other process j from 1 to n:

1. Wait if process j is currently picking a number ( choosing[j] is true).

2. Wait if process j has a valid ticket AND (P has a smaller ticket OR P has the same ticket but a
j j

smaller PID).

Phase 3: Critical Section

The process executes its Critical Section.

Phase 4: Exit Section

Set timestamp[i] = 0 (Releasing the ticket).

4. Requirements Satisfied

The algorithm is proven to satisfy the three requirements of the critical section problem:

1. Mutual Exclusion: Only one process can have the lowest [timestamp, PID] tuple at any time.
2. Progress: A process is guaranteed to enter the CS because ticket numbers always increase.
3. Bounded Waiting: A process i can be overtaken by another process j at most once after i has
picked its ticket number.

5. Performance & Complexity

Space Complexity: O(n) registers (requires an array of n timestamps). This is the lower bound;
you cannot have a more space-efficient algorithm for this problem.
Time Complexity: O(n) because a process must check the status of all n − 1 other processes
before entering the CS.
Disadvantage: In the worst case, it does not guarantee a bounded delay (very high wait times).

Types of Messages in Rollback Recovery

When a process fails and the system rolls back to a previous "recovery line" (a set of checkpoints),
several messages may end up in abnormal states. Understanding these is crucial for restoring a
consistent system state.

1. In-Transit Messages

Definition: Messages that have been sent but not yet received at the time the global state was
captured.
Status: These are not problematic. In a reliable network, they will eventually be delivered. A
consistent global state must include these to ensure they are handled in any legal execution.

2. Lost Messages

Definition: Messages whose "send" is recorded (it happened before the sender's checkpoint) but
the "receive" is undone (it happened after the receiver's checkpoint).
Problem: The receiver "forgets" it ever got the message.
Handling: Usually handled by message logging, where the sender replays the message from its
log during recovery.

3. Delayed Messages

Definition: Messages whose "receive" is not recorded because the receiving process was either
down or the message arrived after the process had already rolled back.
Problem: Similar to lost messages, the system must ensure these are eventually processed
correctly once the receiver is back online.

4. Orphan Messages

Definition: Messages whose "receive" is recorded but the "send" is undone due to rollback.
Problem: This is a serious error (violates causality). It creates a state where a message was
received but never sent.
Solution: These are avoided if the system rolls back to a consistent global state.

5. Duplicate Messages

Definition: Messages that are received more than once by a process.

Cause: This usually occurs during recovery when a process replays its message log to handle
lost messages, unintentionally resending a message that was already received before the failure.

6. Delayed Orphan Messages

Definition: A combination where the "send" event is undone due to a sender rolling back, but the
message is still traveling through the network.
Problem: When these messages finally arrive at their destination, they must be discarded
because their origin no longer exists in the current system state.

Summary Table

Message Type Send Event Receive Event Status/Problem

In-Transit Recorded Not yet happened Valid / Normal
Lost Recorded Undone by rollback Needs re-sending (logging)
Orphan Undone by rollback Recorded Impossible State (Causality error)
Duplicate Sent twice Received twice Result of log replaying
Delayed Orphan Undone Arrives post-rollback Must be discarded

Suzuki-Kasami Algorithm Overview
No ratings yet
Suzuki-Kasami Algorithm Overview
44 pages
DCM 3
No ratings yet
DCM 3
46 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
38 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
38 pages
Distributed Mutual Exclusion Methods
No ratings yet
Distributed Mutual Exclusion Methods
11 pages
WINSEM2025-26 VL CSI3012 00100 ETH 2026-03-14 Distributed-Mutual-Exclusion
No ratings yet
WINSEM2025-26 VL CSI3012 00100 ETH 2026-03-14 Distributed-Mutual-Exclusion
71 pages
Token vs Non-Token Algorithms in Distributed Systems
No ratings yet
Token vs Non-Token Algorithms in Distributed Systems
8 pages
DC - Module 3-2
No ratings yet
DC - Module 3-2
14 pages
Ricart-Agrawala Algorithm Overview
No ratings yet
Ricart-Agrawala Algorithm Overview
23 pages
Mutual Exclusion in Process Synchronization
No ratings yet
Mutual Exclusion in Process Synchronization
27 pages
KTU Notes on Distributed Computing
No ratings yet
KTU Notes on Distributed Computing
39 pages
DS Unit 5
No ratings yet
DS Unit 5
10 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
23 pages
Module 3-1
No ratings yet
Module 3-1
42 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
34 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
49 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
46 pages
Distributed Mutex and Deadlock Algorithms
No ratings yet
Distributed Mutex and Deadlock Algorithms
102 pages
Distributed Mutex and Deadlock Analysis
No ratings yet
Distributed Mutex and Deadlock Analysis
27 pages
Understanding Mutual Exclusion in Distributed Systems
No ratings yet
Understanding Mutual Exclusion in Distributed Systems
24 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
60 pages
Distributed Mutex and Maekawa's Algorithm
No ratings yet
Distributed Mutex and Maekawa's Algorithm
21 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
28 pages
Understanding Mutual Exclusion in Computing
No ratings yet
Understanding Mutual Exclusion in Computing
14 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
20 pages
Distributed Mutex and Deadlock Algorithms
No ratings yet
Distributed Mutex and Deadlock Algorithms
20 pages
Mutual Exclusion in Distributed Systems
No ratings yet
Mutual Exclusion in Distributed Systems
45 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
130 pages
DCS2024 (5) Exclusion and Byzantine General
No ratings yet
DCS2024 (5) Exclusion and Byzantine General
89 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
21 pages
Maekawa's Algorithm for Mutual Exclusion
No ratings yet
Maekawa's Algorithm for Mutual Exclusion
10 pages
Suzuki-Kasami Algorithm Overview
No ratings yet
Suzuki-Kasami Algorithm Overview
48 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
29 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
22 pages
Module 3-Upload Distrubuted Computing Ktu
No ratings yet
Module 3-Upload Distrubuted Computing Ktu
60 pages
Distributed Mutual Exclusion Mechanisms
No ratings yet
Distributed Mutual Exclusion Mechanisms
4 pages
1.distributed Mutual Exclusion Algorithm
No ratings yet
1.distributed Mutual Exclusion Algorithm
21 pages
Distributed Mutex Algorithms Overview
No ratings yet
Distributed Mutex Algorithms Overview
24 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
45 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
90 pages
Overview of Distributed Mutual Exclusion
No ratings yet
Overview of Distributed Mutual Exclusion
38 pages
Distributed Mutex and Deadlock Algorithms
No ratings yet
Distributed Mutex and Deadlock Algorithms
18 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
20 pages
Distributed Mutual Exclusion Algorithms
100% (1)
Distributed Mutual Exclusion Algorithms
39 pages
Distributed Mutual Exclusion & Deadlock Algorithms
No ratings yet
Distributed Mutual Exclusion & Deadlock Algorithms
19 pages
Module 3
No ratings yet
Module 3
38 pages
A Da/ Algorithm For Mutual Exclusion in Decentralized Systems
No ratings yet
A Da/ Algorithm For Mutual Exclusion in Decentralized Systems
15 pages
DC Unit3 Notes
No ratings yet
DC Unit3 Notes
26 pages
Distributed Mutual Exclusion & Deadlock Algorithms
No ratings yet
Distributed Mutual Exclusion & Deadlock Algorithms
13 pages
Distributed Mutual Exclusion Methods
No ratings yet
Distributed Mutual Exclusion Methods
25 pages
Mutual Exclusion Algorithms Overview
No ratings yet
Mutual Exclusion Algorithms Overview
7 pages
Mutual Exclusion Algorithms Overview
No ratings yet
Mutual Exclusion Algorithms Overview
36 pages
Suzuki-Kasami Algorithm Overview
No ratings yet
Suzuki-Kasami Algorithm Overview
29 pages
Critical Section & Mutual Exclusion Explained
No ratings yet
Critical Section & Mutual Exclusion Explained
14 pages
Distributed Mutual Exclusion Algorithms
No ratings yet
Distributed Mutual Exclusion Algorithms
28 pages
Understanding Mutual Exclusion Algorithms
No ratings yet
Understanding Mutual Exclusion Algorithms
58 pages
Token-Based Algorithms for Mutual Exclusion
No ratings yet
Token-Based Algorithms for Mutual Exclusion
18 pages
Matrix Concepts for JEE Preparation
No ratings yet
Matrix Concepts for JEE Preparation
144 pages
CFD Theory - Recap
No ratings yet
CFD Theory - Recap
76 pages
Telecommunications Concepts Quiz
No ratings yet
Telecommunications Concepts Quiz
7 pages
BSc Applied Mathematics Admission Guide
No ratings yet
BSc Applied Mathematics Admission Guide
36 pages
Seismic Stability of Reinforced Walls
No ratings yet
Seismic Stability of Reinforced Walls
8 pages
Numerical Analysis Exam Questions
No ratings yet
Numerical Analysis Exam Questions
4 pages
GMSH Manual
No ratings yet
GMSH Manual
270 pages
Pascal Programming Assignment Guide
No ratings yet
Pascal Programming Assignment Guide
5 pages
Curiosidades Historia 123a
No ratings yet
Curiosidades Historia 123a
263 pages
AC Circuits and Transformer Analysis
No ratings yet
AC Circuits and Transformer Analysis
12 pages
GPA Influences: Study, Sleep, Exercise
No ratings yet
GPA Influences: Study, Sleep, Exercise
14 pages
E-Procurement Systems Assimilation Patterns
No ratings yet
E-Procurement Systems Assimilation Patterns
14 pages
Understanding Mathematical Language and Symbols
No ratings yet
Understanding Mathematical Language and Symbols
9 pages
Understanding Algebraic Expressions
No ratings yet
Understanding Algebraic Expressions
4 pages
Borel-Cantelli Lemma in Diophantine Approximation
No ratings yet
Borel-Cantelli Lemma in Diophantine Approximation
14 pages
Geometry Area Problems and Solutions
No ratings yet
Geometry Area Problems and Solutions
5 pages
Geometric Design Standards and Safety
No ratings yet
Geometric Design Standards and Safety
13 pages
Hierarchical Clustering Explained
No ratings yet
Hierarchical Clustering Explained
7 pages
Centrality Measures in Social Networks
No ratings yet
Centrality Measures in Social Networks
28 pages
Expanding Brackets with Pascal's Triangle
No ratings yet
Expanding Brackets with Pascal's Triangle
14 pages
Classification of Number Types Explained
No ratings yet
Classification of Number Types Explained
32 pages
Applied Computational Fluid Dynamics
100% (1)
Applied Computational Fluid Dynamics
354 pages
Understanding Polar Coordinates Basics
No ratings yet
Understanding Polar Coordinates Basics
14 pages
Understanding Cartesian Coordinates
No ratings yet
Understanding Cartesian Coordinates
4 pages
Solution Manual For Contemporary Business Mathematics Canadian 10Th Edition by Hummelbrunner Halliday Coombs Isbn 0133052311 9780133052312
100% (82)
Solution Manual For Contemporary Business Mathematics Canadian 10Th Edition by Hummelbrunner Halliday Coombs Isbn 0133052311 9780133052312
88 pages
Petroleum Product Blending Techniques
67% (3)
Petroleum Product Blending Techniques
4 pages
Insulation Coordination in Power Systems
100% (1)
Insulation Coordination in Power Systems
11 pages
LP Formulations for Production Optimization
No ratings yet
LP Formulations for Production Optimization
5 pages
Visualizing Normal Distribution Fit
No ratings yet
Visualizing Normal Distribution Fit
12 pages
Understanding Wave Properties and Calculations
No ratings yet
Understanding Wave Properties and Calculations
4 pages

Distributed Computing Series 2

Uploaded by

Distributed Computing Series 2

Uploaded by

Distributed_Computing_series_2

Performance Metrics of Mutual Exclusion Algorithms

2. Synchronization Delay (SD)

Comparison: Token-Based vs. Non-Token-Based Approach

Basis of Token-Based Approach Non-Token-Based Approach

Main Issues in Implementing DSM Software

1. Semantics of Concurrent Access

2. Implementing Semantics (Replication Strategy)

4. Locating Remote Data

Coordinated vs. Uncoordinated Checkpointing

Basis of Coordinated Checkpointing Uncoordinated Checkpointing

How Quorum-Based Algorithms Differ from Other Mutual

Here are the key points of differentiation:

1. The Permission Set: Subset vs. All

Quorum-Based: This approach significantly reduces the message complexity. Instead of

3. Conflict Resolution Mechanism

4. Reply and Release Protocol

Feature Quorum-Based Approach Other Approaches (Non-Token & Token-

The algorithm is built upon the idea of Request Sets (Quorums).

Ri ∩ Rj ≠ ∅ (This is the non-empty intersection property).

2. The Algorithm in Detail

Example of Maekawa's Algorithm

Setup: Request Sets (Quorums)

R1 = {S1, S2, S4}

1. Site S1 wants to enter the CS.

3. Analysis and Conclusion

Lamport's Algorithm for Mutual Exclusion

1. Core Concepts & Requirements

[Link] Algorithm in Detail

1. S1 requests the CS:

Introduction: Why is it complicated?

3. Deadlock Detection (The Best Approach)

Deep Dive: Deadlock Detection & Resolution

A. Detection of existing deadlocks

B. Resolution of detected deadlocks

Once a deadlock is detected, it must be resolved.

Consistent vs. Inconsistent Global State

Feature Consistent Global State Inconsistent Global State

1. Example of a Consistent State (Figure a in PDF):

but before P receives it.

2. Example of an Inconsistent State (Figure b in PDF):

Scenario: Process P sends message m to P . P receives it and takes a checkpoint. Later, P

fails and rolls back to a checkpoint taken before it ever sent m . 2

The Core Principle: Piecewise Deterministic (PWD) Assumption

How the Mechanism Works

1. Checkpointing: Processes periodically save their state to stable storage.

Three Types of Log-Based Recovery

Advantages & Disadvantages

Lamport’s Bakery Algorithm

1. Core Concept & Analogy

2. Data Structures Used

The algorithm uses two shared arrays:

1. choosing[1...n] (Boolean): choosing[i] is true if process P is currently in the process of

picking its ticket number.

means the process is not requesting the CS.

3. The Algorithm Steps

1. Set choosing[i] = true .

Phase 2: The Waiting Loop (Checking Others)

For every other process j from 1 to n:

1. Wait if process j is currently picking a number ( choosing[j] is true).

Phase 3: Critical Section

The process executes its Critical Section.

Phase 4: Exit Section

Set timestamp[i] = 0 (Releasing the ticket).

5. Performance & Complexity

Types of Messages in Rollback Recovery

Definition: Messages that are received more than once by a process.

6. Delayed Orphan Messages

Message Type Send Event Receive Event Status/Problem

You might also like