Key Assumptions in Agreement Algorithms

Unit 4 Distributed computing

Uploaded by

aravindram.socmedia

Unit 4

4. What are the key assumptions underlying the design of agreement algorithms? Explain them briefly.
Short Key Points
Agreement algorithms assume a well-defined system model.

Important assumptions:

1. Timing model – synchronous / asynchronous / partially synchronous.

2. Fault model – crash, omission, Byzantine.

3. Communication model – reliable channels, FIFO / non-FIFO, message delays.

4. Process behavior – deterministic execution.

5. Known system size – number of processes (n) and maximum faulty processes (f).

6. Authentication assumptions – signatures, no message forgery, etc.

Detailed Answer
Agreement algorithms (consensus) require processes in a distributed system to agree on a
single value. To guarantee correctness, several core system assumptions must be clarified.

1. Timing Model Assumptions

Agreement protocols rely on assumptions about time:

Synchronous systems:

Known upper bounds on message delay and processing time.

Easier to design consensus (e.g., phase-based, round-based algorithms).

Asynchronous systems:

No timing guarantees.

FLP impossibility: no deterministic protocol can guarantee consensus in an asynchronous system if even one process may crash.

Partially synchronous systems:

System behaves synchronously after some unknown global stabilization time.

Assumed by practical BFT protocols such as PBFT, which remain safe under asynchrony and rely on eventual synchrony for liveness.

2. Fault Model Assumptions

Algorithms must know what type of failures may occur:

Crash failures: Process stops and does nothing further.

Omission failures: Fails to send/receive some messages.

Byzantine failures: Arbitrary misbehavior — sending conflicting or corrupted messages.

Bound on failures: Maximum f faulty processes must be known in advance.

E.g., Byzantine agreement requires n > 3f.

3. Communication Model Assumptions

Reliable channels: messages are never lost, duplicated, or corrupted.

FIFO channels (optional): messages between two processes arrive in sending order.

No forgery: one process cannot impersonate another.

These assumptions affect the design of total order broadcast, consensus, and Byzantine
tolerance.

4. Process Behavior Assumptions

Processes are deterministic given the same inputs.

Shared algorithm logic: all correct processes follow identical code (faulty processes may deviate, as allowed by the fault model).

Use of randomization (for randomized consensus) is explicitly allowed in some models.

5. Knowledge of System Size
Every process knows:

Number of processes ( n ).

Upper bound on faults ( f ).

Consensus correctness conditions rely on these (e.g., n > 2f or n > 3f ).
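
The bounds above translate into a simple arithmetic check. The sketch below assumes the two bounds quoted in these notes (n > 2f and n > 3f) and maps them to the hypothetical model names "crash" and "byzantine"; which bound actually applies to a given protocol depends on its timing and authentication assumptions.

```python
# Hypothetical helper names; the bounds are the ones quoted in the notes.
BOUND_MULTIPLIER = {
    "crash": 2,      # assumes the n > 2f bound
    "byzantine": 3,  # assumes the n > 3f bound (no signatures)
}


def max_tolerable_faults(n: int, fault_model: str) -> int:
    """Largest f such that n > k * f, where k depends on the fault model."""
    k = BOUND_MULTIPLIER[fault_model]
    # n > k*f  is equivalent to  f <= (n - 1) // k  for integers
    return (n - 1) // k


def is_configuration_valid(n: int, f: int, fault_model: str) -> bool:
    """True if a system of n processes satisfies the bound for f faults."""
    return 0 <= f <= max_tolerable_faults(n, fault_model)
```

For example, under these assumptions four processes tolerate one Byzantine fault, while three processes tolerate none.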

6. Authentication Assumptions
Presence or absence of:

Message authentication codes,

Digital signatures,

Cryptographic hash chains, etc.

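These assumptions affect correctness bounds: with unforgeable digital signatures in a synchronous system, the n > 3f requirement of unsigned (oral-message) Byzantine agreement no longer applies, and authenticated protocols are typically stated with weaker bounds such as n > 2f.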

Conclusion
Designing agreement algorithms requires a precise understanding of system timing, fault
type, communication reliability, and process behavior. These assumptions determine
possibility, complexity, and fault tolerance bounds of consensus algorithms.

5. Construct the rollback recovery process using a coordinated checkpointing algorithm and illustrate how Juang–Venkatesan asynchronous checkpointing ensures consistency
Short Key Points
Rollback recovery uses checkpoints + logs to restore process state after a crash.

Coordinated checkpointing: All processes take checkpoints in a globally consistent manner.

Avoids domino effect and orphan messages.

Juang–Venkatesan algorithm: Asynchronous; processes checkpoint independently and log each event, and a consistent recovery line is found at recovery time by comparing sent and received message counts.

No blocking or coordination during normal operation.

A. Rollback Recovery Using Coordinated Checkpointing

Concept
Coordinated checkpointing forces all processes to take a checkpoint such that the collection
forms a consistent global state.

A global state is consistent if every message recorded as received in it is also recorded as sent, i.e., no message is received before it is sent (no orphan messages).
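
The consistency condition can be checked mechanically if each checkpoint records per-channel message counts. The following is a minimal sketch under that assumption; the matrices `sent` and `received` and the function names are hypothetical, introduced only for illustration.

```python
def find_orphans(sent, received):
    """Return (receiver, sender, extra) for every channel with orphan messages.

    sent[j][i]     = messages p_j recorded as sent to p_i at its checkpoint
    received[i][j] = messages p_i recorded as received from p_j at its checkpoint
    """
    orphans = []
    n = len(sent)
    for i in range(n):
        for j in range(n):
            # More receipts than sends on channel j -> i means orphan messages
            if i != j and received[i][j] > sent[j][i]:
                orphans.append((i, j, received[i][j] - sent[j][i]))
    return orphans


def is_consistent(sent, received):
    """A set of checkpoints is consistent if no channel has orphan messages."""
    return not find_orphans(sent, received)
```

A channel where the sender recorded more sends than the receiver recorded receipts is not a violation: those are in-transit messages, not orphans.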

Algorithm Steps
1. Initiation:
One process acts as a coordinator and sends a “CHECKPOINT REQUEST” to all
processes.

2. Quiescence or Blocking Phase:

Each process completes current events, blocks new messages temporarily, and takes a
local checkpoint.

3. Acknowledgment:

After checkpointing, each process sends “CHECKPOINT DONE” to the coordinator.

4. Commit:

If all processes respond correctly, the coordinator broadcasts a COMMIT CHECKPOINT message.
All processes mark their checkpoints as permanent.

5. Abort (if failure occurs):

If a process fails or does not respond during checkpointing, the coordinator sends ABORT, and the older checkpoints remain valid.
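
The five steps above can be sketched as a small single-threaded simulation. This is an illustration only: real implementations exchange messages over a network, and the class and method names here (`Process`, `coordinated_checkpoint`, and so on) are hypothetical.

```python
class Process:
    def __init__(self, pid, state, alive=True):
        self.pid = pid
        self.state = state        # application state (a dict in this sketch)
        self.alive = alive
        self.permanent = None     # last committed checkpoint
        self.tentative = None     # checkpoint waiting for COMMIT or ABORT
        self.blocked = False      # True while application messages are held

    def on_checkpoint_request(self):
        """Steps 2-3: block, take a local checkpoint, reply CHECKPOINT DONE."""
        if not self.alive:
            return False          # a failed process never replies
        self.blocked = True
        self.tentative = dict(self.state)
        return True

    def on_commit(self):
        """Step 4: make the new checkpoint permanent and resume."""
        self.permanent, self.tentative = self.tentative, None
        self.blocked = False

    def on_abort(self):
        """Step 5: discard the new checkpoint; the older one stays valid."""
        self.tentative = None
        self.blocked = False


def coordinated_checkpoint(processes):
    """Step 1 onward, run by the coordinator; returns 'COMMIT' or 'ABORT'."""
    replies = [p.on_checkpoint_request() for p in processes]
    if all(replies):
        for p in processes:
            p.on_commit()
        return "COMMIT"
    for p in processes:
        if p.alive:
            p.on_abort()
    return "ABORT"
```

Keeping the new checkpoint separate from the permanent one until COMMIT is what lets an aborted round leave the previous global checkpoint untouched.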

Advantages
No domino effect.

Recovery is quick — restart from last global checkpoint.

No orphan messages.

Disadvantages
Blocking processes temporarily → latency.

High coordination overhead.

B. Juang–Venkatesan Asynchronous Checkpointing Algorithm
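Goal
To let processes take checkpoints independently, without blocking or coordination during normal operation, and still restore a globally consistent state after a failure.

Assumptions
Channels are reliable, deliver messages in FIFO order, and have unbounded buffers; message delay is arbitrary but finite.

The computation is event-driven: a process in state s receives a message m, processes it, sends messages, and moves to a new state.

Main Idea
Uses:

Independent local checkpoints, one record (s, m, msgs_sent) per processed event,

A volatile log that is flushed periodically to a stable log,

Per-neighbor counts of messages sent and received up to each checkpoint,

to find a consistent set of checkpoints at recovery time.

Algorithm Operation
1. Asynchronous Checkpointing

After processing an event, a process records the triplet (s, m, msgs_sent) without synchronizing with any other process.

Records go to the volatile log first and are moved to stable storage periodically.

2. Recovery Initiation

A process recovering from a failure restarts from the latest event in its stable log and notifies the other processes, which start the rollback procedure from their latest logged event (volatile or stable).

3. Rollback Iterations

Each process p_i sends ROLLBACK(i, c) to every neighbor p_j, where c = SENT_i→j(CkPt_i) is the number of messages p_i had sent to p_j up to its current checkpoint.

On receiving ROLLBACK(j, c), p_i compares c with RCVD_i←j(CkPt_i). If p_i has received more than c messages from p_j, the extra messages are orphans, so p_i rolls back to the latest event e with RCVD_i←j(e) = c.

4. Termination

The exchange is repeated for N iterations, where N is the number of processes.

The set of checkpoints obtained is globally consistent.

Why It Works
At the final set of checkpoints, every message recorded as received is also recorded as sent, so no orphan messages remain.

One rollback can create new orphans at other processes; repeating the exchange lets these cascading rollbacks settle.

No blocking required while the application runs, which is useful for high-availability systems.

Advantages over coordinated checkpointing

Non-blocking, with no checkpoint coordination messages during normal operation.

Suited to long-running distributed systems where pausing every process for a checkpoint is undesirable.

Trade-off: recovery is more involved, since it needs event logging and several rollback rounds.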
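
The rollback procedure usually given for Juang–Venkatesan recovery compares, channel by channel, how many messages a sender has logged as sent with how many the receiver has logged as received. The sketch below simulates that comparison in rounds. It assumes every process keeps a log of event records with cumulative per-neighbor counts, that each receive event adds exactly one message, and that record 0 is the initial state; the function name `jv_rollback` and the log layout are hypothetical.

```python
def jv_rollback(logs, start):
    """Simulate rollback iterations and return the final checkpoint indices.

    logs[i]  : event records of process i, oldest first. Each record is a
               pair (sent, rcvd) of lists, where sent[j] and rcvd[j] are the
               cumulative counts of messages sent to / received from j.
    start[i] : index of the record process i starts recovery from (latest
               stable record for a failed process, latest record otherwise).
    """
    n = len(logs)
    ckpt = list(start)
    for _ in range(n):  # N iterations, N = number of processes
        # Each process i tells each neighbor j how many messages it has
        # sent to j up to its current checkpoint: ROLLBACK(i, c).
        rollback_msgs = [
            (i, j, logs[i][ckpt[i]][0][j])
            for i in range(n) for j in range(n) if i != j
        ]
        for sender, receiver, c in rollback_msgs:
            e = ckpt[receiver]
            # Receiver logged more receipts than the sender logged sends:
            # step back to the latest event where the counts match.
            while logs[receiver][e][1][sender] > c:
                e -= 1
            ckpt[receiver] = e
    return ckpt
```

In a run where p0 loses the record of a send after a crash, the receiver of that message rolls back in the first round, and any process that depended on the receiver rolls back in a later round.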

Conclusion
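Coordinated checkpointing ensures consistency through blocking at checkpoint time, whereas Juang–Venkatesan asynchronous checkpointing lets processes checkpoint independently and restores consistency at recovery time by comparing logged sent and received message counts and rolling back until no orphan messages remain.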

6. Illustrate in detail the different types of failures in distributed systems
Short Key Points
Distributed systems can experience multiple failures:

1. Crash Failure

2. Omission Failure (Send/Receive)

3. Timing Failure

4. Response Failure (Value / State Transition)

5. Arbitrary/Byzantine Failure

6. Network Failure (Corruption, Loss, Partitioning)

7. Security Failures

Detailed Answer
Distributed systems must handle failures that arise due to hardware faults, software bugs,
timing violations, or adversarial behavior. Key failure categories:

1. Crash Failures
A process stops execution unexpectedly but was working correctly before the crash.

Most common and easiest to handle.

Example: process stops due to power outage.

Detection: Timeout-based heartbeat monitoring.
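
A minimal sketch of timeout-based heartbeat monitoring follows. The class name `HeartbeatDetector` and the explicit `now` argument are assumptions made to keep the example deterministic; a real monitor would read a clock. Note that a timeout only produces a suspicion: in an asynchronous system a slow process cannot be distinguished from a crashed one.

```python
class HeartbeatDetector:
    """Suspects a process when no heartbeat arrived within `timeout`."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}   # process id -> time of the latest heartbeat

    def heartbeat(self, pid, now):
        """Record a heartbeat from process `pid` received at time `now`."""
        self.last_seen[pid] = now

    def suspected(self, now):
        """Return the sorted ids of processes silent for longer than timeout."""
        return sorted(
            pid for pid, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```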

2. Omission Failures
Failures to send or receive messages.

a. Send Omission
Process fails to send a message that it is supposed to send.

b. Receive Omission
Process fails to read a message that arrived.

Impact: Causes message loss; complicates coordination protocols like mutual exclusion and
consensus.

3. Timing Failures
Process or channel violates timing assumptions.
Examples:

Process takes too long to respond.

Message delay exceeds expected time bound.

Occurs in: real-time systems, and wherever clock drift or unpredictable network latency is present.

4. Response Failures

Process produces incorrect output.
Two types:

Value failure: Wrong output value.

State transition failure: Process moves to wrong next state.

5. Arbitrary / Byzantine Failures

The most severe failure model.

Process behaves arbitrarily, maliciously, or inconsistently.

Sends conflicting messages to different processes.

Hardest to detect and tolerate.

Requires:

Expensive Byzantine fault-tolerant (BFT) algorithms such as OM(f) and PBFT.

Need n > 3f processes (i.e., n ≥ 3f + 1) to tolerate f Byzantine failures when messages are unsigned (oral messages).
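
To see why the n > 3f bound matters, here is a simplified sketch of the oral-messages algorithm for f = 1, OM(1), run as a synchronous simulation. Process 0 plays the commander; the `lie` callback, the tie-breaking default "RETREAT", and all function names are assumptions for illustration.

```python
from collections import Counter


def majority(values, default="RETREAT"):
    """Most common value, or `default` when the top two counts tie."""
    counts = Counter(values).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return default
    return counts[0][0]


def om1(n, commander_value, traitor, lie):
    """Run OM(1) with one traitor; return the decisions of loyal lieutenants.

    lie(sender, receiver, value) gives the value the traitor actually sends.
    """
    lieutenants = range(1, n)

    # Round 1: the commander (process 0) sends its value to every lieutenant.
    received = {}
    for i in lieutenants:
        value = commander_value
        if traitor == 0:
            value = lie(0, i, value)
        received[i] = value

    # Round 2: every lieutenant relays what it received to the others,
    # then each loyal lieutenant takes the majority of all values it holds.
    decisions = {}
    for i in lieutenants:
        if i == traitor:
            continue
        votes = [received[i]]
        for j in lieutenants:
            if j == i:
                continue
            value = received[j]
            if j == traitor:
                value = lie(j, i, value)
            votes.append(value)
        decisions[i] = majority(votes)
    return decisions
```

With n = 4 the loyal lieutenants agree whether the traitor is a lieutenant or the commander. With n = 3 a single lying lieutenant leaves the loyal one with a tie, so it cannot tell whether the commander or its peer is faulty.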

6. Network Failures

a. Message corruption
Data modified in transit.

b. Message loss
Packets dropped.

c. Network partition
Network splits into disconnected subsets.

Effect:

Can cause inconsistency, split-brain problems, leader election failure.

7. Security Failures
Unauthorized access.

Data tampering.

Denial-of-service attacks.

Man-in-the-middle attacks.

Relevant in cloud and internet-based distributed systems.

Conclusion
Understanding failure types is essential for designing reliable distributed algorithms such as
consensus, checkpointing, replication, and recovery protocols. Different failures require
different tolerance techniques (timeouts for crash, BFT for Byzantine, redundancy for
network issues).

