0% found this document useful (0 votes)

9 views29 pages

Understanding PBFT in Distributed Systems

The document provides an overview of distributed consensus, focusing on algorithms like Paxos, Raft, and Practical Byzantine Fault Tolerance (PBFT). It explains the mechanisms of these algorithms, their advantages, and the challenges they address in achieving consensus in distributed systems. Additionally, it discusses the CAP theorem, which highlights the limitations in simultaneously achieving consistency, availability, and partition tolerance in distributed systems.

Uploaded by

Hosam Mohammed

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

9 views29 pages

Understanding PBFT in Distributed Systems

Uploaded by

Hosam Mohammed

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

Distributed

Consensus
Supervisor

Dr. Ahmed Salem

Team Name 0 Abdullah Helmy Rashad
1
Innovision

02 Mahmoud Ibrahim Kamal

03 Abdelrahman Mohamed
Leve
Team Work l
Four ( 4 )
Abdelgwad

04 Ahmed Mahmoud Ahmed

Ibrahim

05 Abdelrahman Atef Mohamed

Course
Distributed 06 Abdelghfar Salah Abdelghfar
Systems
Definition
 Distributed consensus is a process used in distributed
systems to ensure that multiple nodes (computers or processes)
in a network agree on a single data value or decision.
Consensus Algorithms
 Paxos: A highly influential algorithm designed to achieve consensus in
a fault-tolerant manner.
 Raft: Simplifies Paxos and is widely used in systems like etcd.
 Byzantine Fault Tolerance (BFT): Handles malicious or faulty nodes,
used in systems like blockchain.
Paxos
 Paxos is a fault-tolerant distributed consensus algorithm
designed to enable a group of nodes to agree on a single
value, even in the presence of failures.
 Paxos Essentials
• We assume a client-server configuration, with initially one primary
server.
• In Paxos, the primary server is called the leader.
• To make the server more robust, we start with adding a backup server.
• To ensure that all commands are executed in the same order at both
servers, the primary server assigns unique sequence numbers to all
commands.
Paxos
 Some Paxos Terminology
• The leader sends an accept message ACCEPT(o, t) to backups when
assigning a timestamp t to command o
• A backup responds by sending a learn message: LEARN(o, t)
• When the leader notices that operation o has not yet been learned, it
retransmits ACCEPT(o, t) with the original timestamp.
Paxos
 Two-server situation
Paxos
 Two servers and one crash: problem

 Problem
• Primary crashes after executing an operation, but the backup never received
the accept message.
Paxos
 Two servers and one crash: solution

 Solution
• Never execute an operation before it is clear that is has been learned.
Paxos
 Reliable failure detection is practically impossible. A solution is to set timeouts,
but take into account that a detected failure may be false.
Paxos
 Observation
• Paxos needs at least three servers.

 Adapted fundamental rule

• In Paxos with three servers, a server S cannot execute an operation o until it
has received at least one (other) LEARN(o) message, so that it knows that a
majority of servers will execute o.
Paxos
 Three servers and two crashes: still a problem

 Scenario
• What happens when LEARN(o1) as sent by S2 to S1 is lost?
 General rule
• In Paxos, a server S cannot execute an operation o until it has received a LEARN(o)
from all other nonfaulty servers.
Paxos
 Why having 3k processes is not enough.
Paxos
 Why having 3k + 1 processes is enough.
Raft
 It is a consensus algorithm used in distributed systems to ensure
consistency of data and provide service even in the event of
breakdowns. Raft is especially designed to be easier to understand
and implement compared to other consensus algorithms such as
Paxos.
Why Raft Algorithm ? 09/25

Raft's Advantages: Simplicit

y
 Simplicity and understandability. Practical
implementation
 Practical implementation in distributed systems. Reliable
Distribut
 Strong consistency guarantees. Strong ed
consisten consens
 Flexible: Able to handle server breakdowns and cy us

change leaders easily. Flexible

How does the raft work ?
1. Leader election:
• The algorithm begins by electing a single HB

server to be the leader.

• This leader is responsible for making Followers
decisions and coordinating work among
Leader
the other servers.

(Note)
The leader in Raft provides centralized control, prevents conflicts,
improves performance, and ensures fault tolerance in distributed
systems. (*HB: heartbeat )

See how RAFT works

How does the raft work ?
• If leader is dead
Timeout
Candidate Timeout = 120ms
err(leader dead)
node Log length = 4

Candidate Timeout Timeout = 110ms

Leader node err(leader dead)
Log length = 4

Candidate Timeout Candidate

node err(leader dead) node Timeout = 100ms
Log length = 4

Candidate Timeout Candidate Timeout = 100ms

node err(leader dead) node
Log length = 2

Term:1 Term:2
How does the raft
work ?
2. Registration of Operations:
• The leader records all operations (such as adding a new
item, updating value, etc.) in a Log. Follower

• This record is copied to all other servers. nd

a
m
m
co a nd
m m Follower
co

comma
CLIENT
Request LEADER nd
Follower

comm
and

Follower

(Note)
If the follower receives a request, he forwards it to the leader.
How does the Raft
work?

3. Agreement on Operations:
• The process cannot be finalized until a majority of servers
have agreed to it.
Follower
ri es
Steps: endE nt
App
Sending: The leader broadcasts the log entry
to all followers. AppendEntries Follower
LEADER
Processing: Followers verify the log entry and
update their logs if valid, then acknowledge the Appe
leader. ndE ntrie
s Follower
Confirmation: The leader commits the log
entry once a majority of followers
acknowledge.
How does the raft
work?

[Link] to operations:
• Once all servers agree on a process, this process is
committed and cannot be undone
Paxos vs Raft

Aspect Paxos Raft

Simpler to understand and
Simplicity More complex and less intuitive
implement

Leader Election Implicit leader election Explicit leader election process

More complex log replication

Log Replication Leader-centric log replication
mechanism
Can be less efficient due to its Generally considered faster
Performance
complexity and more efficient
Practical Byzantine Fault Tolerance (PBFT)
 Refers to the ability of a distributed system to achieve consensus even if some
of the nodes exhibit Byzantine faults

 Assumptions
• A server may exhibit arbitrary failures
• Messages may be lost, delayed, and received out of order
• Messages have an identifiable sender (i.e., they are signed)
• Partially synchronous execution model

 Note
• A primary-backup approach with 3k + 1 replica servers
PBFT: four phases

• C is the client
• P is the primary
• B1, B2 , B3 are backups
• Assume B2 is faulty
PBFT: four phases(PRE-PREPARE)

• C requests operation o to be executed

• P timestamps o and sends PRE-PREPARE
• Backup Bi accepts the pre-prepare message if it is also is in v and has
not accepted a an operation with timestamp t before.
PBFT: four phases(PREPARE)

• Bi broadcasts PREPARE(t, v, o) to all (including the primary)

• Note: a nonfaulty server will eventually log 2k messages PREPARE(t, v, o)
(including its own) ⇒ consensus on the ordering of o.
• Note: it doesn’t matter what faulty B2 sends, it cannot affect joint
decisions by P, B1, B3 .
PBFT: four phases(COMMIT)

• All servers broadcast COMMIT(t, v, o)

• The commit is needed to also make sure that o can be executed now,
that is, in the current view v .
• When 2k messages have been collected, excluding its own, the server
can safely execute o then reply to the client.
CAP Theorem

 The CAP theorem states that a distributed computer system cannot guarantee
all of the following three properties at the same time.

• C: consistency, by which a shared and replicated data item appears as a

single, up-to-date copy
• A: availability, by which updates will always be eventually executed
• P: Tolerant partitioning of process group.

 Why can't the three characteristics be achieved together?

• The main reason is that these properties conflict with each other. For example, if we
want to achieve full consistency, we may have to delay some operations until we
make sure that all contracts are agreed on the same situation, which affects
availability.
Final
Slide

Thank You

Common questions

Paxos has an implicit leader election process, often considered more complex, leading to its reputation as a challenging algorithm for understanding leader coordination. Raft, on the other hand, uses an explicit leader election, simplifying the process by establishing clear leader responsibilities. In terms of log replication, Paxos uses a more complex mechanism, whereas Raft is leader-centric, where the leader replicates logs to followers, ensuring simplicity and speed in committing operations upon achieving a majority agreement .

Distributed consensus algorithms such as Paxos and Raft require more than a single server to function effectively because these algorithms rely on achieving consensus among nodes, particularly in fault tolerance scenarios. Paxos expressly requires at least three servers to handle failures and maintain system operations through majority agreements on operations, which secure resilience in the face of node or communication failures. Additionally, multiple servers ensure that there is a fallback mechanism in case the leader fails, enabling seamless continuation and coordination of operations .

Paxos addresses the potential issue of a primary server crash, which may occur after executing an operation without the backup server receiving the accept message, by ensuring that an operation is never executed before it is confirmed as learned. This means that the operation execution is delayed until the leader is certain that the majority of servers have agreed to execute the operation. This is accomplished by receiving 'LEARN' messages from other servers, indicating acknowledgment from a majority .

The separation of phases in PBFT, namely Pre-Prepare, Prepare, and Commit, enhances its capacity to handle Byzantine faults by structuring the consensus process such that each phase provides successive levels of validation. Each phase requires a certain number of node agreements, mitigating the impact of any faulty nodes on consensus decisions. For example, the Prepare phase involves receiving 2k PREPARE messages to ensure that honest nodes agree on the operation's timestamp, while the Commit phase requires 2k COMMIT messages, including no faulty influences, before execution. This modular approach ensures rigorous agreement and prevents faulty nodes from compromising the system's integrity .

In Paxos, reliable failure detection is practically addressed using timeouts to assume a failure has occurred when expected responses are not received. However, this approach has inherent limitations, such as the possibility of false positives, where a timeout implies failure even if the server is running correctly but experiencing temporary network delays. This highlights the critical challenge of adjusting timeout durations wisely to mitigate incorrect failure detection while maintaining quick responsiveness to actual failures .

In the Raft algorithm, the leader node has central control responsibilities, coordinating tasks such as logging operations and ensuring followers replicate these logs. The leader ensures consistency and handles incoming requests. When a leader fails, followers detect the absence of heartbeats within a timeout period, triggering the election of a new leader. Candidate nodes solicit votes from followers, and the candidate with a majority wins. This rapid leader replacement ensures system robustness and continuity of operations without prolonged downtime .

Raft offers several advantages over Paxos, particularly in terms of simplicity and implementation. Raft is designed to be easier to understand, featuring a clear and explicit leader election process which improves system performance and fault tolerance. Its leader-centric log replication simplifies its model compared to Paxos, aiding in faster processing and decision-making. Raft's straightforward design enables more practical implementations in real-world distributed systems, making it more accessible for developers to implement and maintain .

The CAP theorem has a significant influence on the design of distributed systems by emphasizing that a system cannot simultaneously guarantee consistency, availability, and partition tolerance. Designing a system involves trading off between these properties. For instance, ensuring consistency often requires delaying availability during partition-tolerant scenarios, thus affecting system responsiveness. Similarly, prioritizing availability may lead to inconsistencies during network partitions. These limitations or trade-offs challenge designers to decide which aspects are most critical based on their specific application requirements .

The key phases of PBFT are Pre-Prepare, Prepare, and Commit. In the Pre-Prepare phase, the primary node timestamps the client's request and sends a PRE-PREPARE message to backups, ensuring the initial agreement on processing. In the Prepare phase, each backup broadcasts a PREPARE message to confirm receipt and agreement on the operation timing, helping form a consensus despite Byzantine faults. Finally, in the Commit phase, all nodes broadcast a COMMIT message to finalize and execute the operation once 2k COMMIT messages are collected, ensuring that the operation can be safely executed .

In PBFT, having '3k + 1' replicas is crucial because it ensures that the system can tolerate up to 'k' faults. The setup is designed so that while 'k' replicas can fail, the remaining '2k + 1' honest replicas can still reach a consensus. This configuration provides resilience against Byzantine faults where nodes can fail or act maliciously. By exceeding simple majority requirements, PBFT ensures that the honest replicas can override the bad or broken nodes, maintaining system integrity and reliability .

Understanding Raft and Paxos Algorithms
No ratings yet
Understanding Raft and Paxos Algorithms
23 pages
Understanding Raft and Paxos Algorithms
No ratings yet
Understanding Raft and Paxos Algorithms
24 pages
Consensus Algorithms in Distributed Systems
No ratings yet
Consensus Algorithms in Distributed Systems
34 pages
Mod 3 - 1
No ratings yet
Mod 3 - 1
37 pages
Consensus Algorithms in Distributed Systems
No ratings yet
Consensus Algorithms in Distributed Systems
23 pages
Consensus Algorithms: Paxos, Raft, PBFT
No ratings yet
Consensus Algorithms: Paxos, Raft, PBFT
33 pages
Raft: An Understandable Consensus Algorithm
No ratings yet
Raft: An Understandable Consensus Algorithm
68 pages
Block Mod 3
No ratings yet
Block Mod 3
108 pages
Distributed Consensus in Systems Explained
No ratings yet
Distributed Consensus in Systems Explained
8 pages
Understanding Consensus in Fault Tolerance
No ratings yet
Understanding Consensus in Fault Tolerance
12 pages
Distributed Systems and Consensus Algorithms
No ratings yet
Distributed Systems and Consensus Algorithms
21 pages
Understanding Distributed Consensus in Blockchain
No ratings yet
Understanding Distributed Consensus in Blockchain
20 pages
Consensus Algorithms in Distributed Systems
No ratings yet
Consensus Algorithms in Distributed Systems
2 pages
Hyperledger Fabric Implementation Guide
No ratings yet
Hyperledger Fabric Implementation Guide
58 pages
Understanding the Raft Consensus Algorithm
No ratings yet
Understanding the Raft Consensus Algorithm
8 pages
Understanding Distributed Consensus Algorithms
No ratings yet
Understanding Distributed Consensus Algorithms
6 pages
Understanding the Raft Algorithm
No ratings yet
Understanding the Raft Algorithm
15 pages
Understanding the Raft Consensus Algorithm
No ratings yet
Understanding the Raft Consensus Algorithm
29 pages
Understanding Permissioned Blockchains
No ratings yet
Understanding Permissioned Blockchains
62 pages
Understanding Distributed Agreement Concepts
No ratings yet
Understanding Distributed Agreement Concepts
14 pages
Viewstamped Replication vs Paxos
No ratings yet
Viewstamped Replication vs Paxos
34 pages
Blockchain Module3 KTU
No ratings yet
Blockchain Module3 KTU
28 pages
Reliable Geo-Distributed RPC for Solana
No ratings yet
Reliable Geo-Distributed RPC for Solana
37 pages
Cloud Computing: Atomic Actions & Consensus
No ratings yet
Cloud Computing: Atomic Actions & Consensus
25 pages
Consensus Algorithms in Blockchain
No ratings yet
Consensus Algorithms in Blockchain
75 pages
LogDevice: Consensus and Flexibility
No ratings yet
LogDevice: Consensus and Flexibility
56 pages
Consensus in Distributed Systems Explained
No ratings yet
Consensus in Distributed Systems Explained
4 pages
Understanding Raft Consensus Protocol
No ratings yet
Understanding Raft Consensus Protocol
11 pages
Raft Consensus Algorithm Explained
No ratings yet
Raft Consensus Algorithm Explained
15 pages
BCT M3
No ratings yet
BCT M3
26 pages
Raft Consensus Algorithm Overview
No ratings yet
Raft Consensus Algorithm Overview
11 pages
RAFT and Paxos Consensus Algorithms Explained
No ratings yet
RAFT and Paxos Consensus Algorithms Explained
9 pages
May 2024 Endsem DC Solution
No ratings yet
May 2024 Endsem DC Solution
33 pages
Overview of Paxos Consensus Algorithm
No ratings yet
Overview of Paxos Consensus Algorithm
26 pages
Comparing Paxos and Raft Consensus
No ratings yet
Comparing Paxos and Raft Consensus
8 pages
Distributed Consensus Algorithms Explained
No ratings yet
Distributed Consensus Algorithms Explained
14 pages
Understanding Ledgers and Consensus Mechanisms
No ratings yet
Understanding Ledgers and Consensus Mechanisms
36 pages
Distributed Systems: Faults & Recovery
No ratings yet
Distributed Systems: Faults & Recovery
32 pages
Raft Consensus in Blockchain Systems
No ratings yet
Raft Consensus in Blockchain Systems
9 pages
Distributed Systems: Faults & Recovery
No ratings yet
Distributed Systems: Faults & Recovery
34 pages
Module 3
No ratings yet
Module 3
55 pages
A Generalised Solution To Distributed Consensus
No ratings yet
A Generalised Solution To Distributed Consensus
20 pages
Understanding Distributed Consensus Algorithms
No ratings yet
Understanding Distributed Consensus Algorithms
4 pages
CAP Theorem and Consistency Models Explained
No ratings yet
CAP Theorem and Consistency Models Explained
15 pages
Distributed Algorithms Overview and Use Cases
No ratings yet
Distributed Algorithms Overview and Use Cases
15 pages
Understanding Atomic Multicast in Systems
No ratings yet
Understanding Atomic Multicast in Systems
34 pages
Understanding Consensus Algorithms
No ratings yet
Understanding Consensus Algorithms
12 pages
Distributed Computing Algorithms Overview
No ratings yet
Distributed Computing Algorithms Overview
44 pages
Raft Consensus Algorithm Overview
No ratings yet
Raft Consensus Algorithm Overview
30 pages
Module II Clock Synchronization
No ratings yet
Module II Clock Synchronization
9 pages
Byzantine Fault Tolerance in Blockchain
No ratings yet
Byzantine Fault Tolerance in Blockchain
13 pages
Understanding Blockchain Consensus Mechanisms
No ratings yet
Understanding Blockchain Consensus Mechanisms
22 pages
Blockchain PDF /PPT Understand Basics of It
No ratings yet
Blockchain PDF /PPT Understand Basics of It
61 pages
Comparing Paxos and ZooKeeper Systems
No ratings yet
Comparing Paxos and ZooKeeper Systems
10 pages
Consensus Mechanism in Blockchain: UNIT-1
No ratings yet
Consensus Mechanism in Blockchain: UNIT-1
12 pages
Clock Synchronization in Distributed Systems
No ratings yet
Clock Synchronization in Distributed Systems
9 pages
DS Unit 3 Notes
No ratings yet
DS Unit 3 Notes
62 pages
Programming Functions and Games Guide
No ratings yet
Programming Functions and Games Guide
2 pages
Regression
No ratings yet
Regression
32 pages
Processes and Threads Overview
No ratings yet
Processes and Threads Overview
26 pages
Understanding Operating System Processes
No ratings yet
Understanding Operating System Processes
47 pages
Shorouk Academy Project Highlights 2019
No ratings yet
Shorouk Academy Project Highlights 2019
10 pages
The Solitary Lighthouse Keeper's Companion
No ratings yet
The Solitary Lighthouse Keeper's Companion
1 page
Hospital ERD Overview and Structure
No ratings yet
Hospital ERD Overview and Structure
1 page
Elara's Magical Elixir of Joy
No ratings yet
Elara's Magical Elixir of Joy
1 page
BS in Interactive Entertainment Computing
No ratings yet
BS in Interactive Entertainment Computing
13 pages
Steps in Computer Science Problem Solving
No ratings yet
Steps in Computer Science Problem Solving
11 pages
ISO - Standards
No ratings yet
ISO - Standards
4 pages
Avalon CL Fetal Monitoring Service Guide
No ratings yet
Avalon CL Fetal Monitoring Service Guide
138 pages
Readings - Antenna Matching For The TRF7960 RFID
No ratings yet
Readings - Antenna Matching For The TRF7960 RFID
8 pages
SQL Basics: Understanding Databases and CRUD
No ratings yet
SQL Basics: Understanding Databases and CRUD
4 pages
2SB1481 PNP Transistor Datasheet
No ratings yet
2SB1481 PNP Transistor Datasheet
4 pages
Understanding Artificial Intelligence Basics
No ratings yet
Understanding Artificial Intelligence Basics
14 pages
Smart Garden Lights Catalogue
No ratings yet
Smart Garden Lights Catalogue
9 pages
DSA & System Design Roadmap in JS
No ratings yet
DSA & System Design Roadmap in JS
4 pages
Test Bank for Ethics in IT, 6th Ed.
No ratings yet
Test Bank for Ethics in IT, 6th Ed.
11 pages
Cortex-M Addressing Modes and Instruction Set
100% (1)
Cortex-M Addressing Modes and Instruction Set
10 pages
Library Management System Project
No ratings yet
Library Management System Project
20 pages
Statistics Module 9 12 Vle
No ratings yet
Statistics Module 9 12 Vle
11 pages
Food Storage and Kitchen Layout Guide
No ratings yet
Food Storage and Kitchen Layout Guide
4 pages
AI-Driven Research for Battery Materials
No ratings yet
AI-Driven Research for Battery Materials
7 pages
Online Medicine Shopping Platform Overview
100% (1)
Online Medicine Shopping Platform Overview
3 pages
Mastering Excel Formula Criteria
No ratings yet
Mastering Excel Formula Criteria
39 pages
Managing Kubernetes Namespaces
No ratings yet
Managing Kubernetes Namespaces
10 pages
UI/UX Design Course Syllabus 2023
100% (1)
UI/UX Design Course Syllabus 2023
2 pages
CBSE Class 12 IP Answer Key 2025
No ratings yet
CBSE Class 12 IP Answer Key 2025
5 pages
Laptop Battery Specifications and Links
No ratings yet
Laptop Battery Specifications and Links
11 pages
IoT-Based Air Quality Monitoring System
No ratings yet
IoT-Based Air Quality Monitoring System
21 pages
Roblox Graphics and Performance Settings
No ratings yet
Roblox Graphics and Performance Settings
7 pages
Understanding DBSCAN Clustering
No ratings yet
Understanding DBSCAN Clustering
29 pages
Free Student Attendance System Download
No ratings yet
Free Student Attendance System Download
17 pages
Motor Feedback System For Internally and Externally Ventilated Drives Sincos®Srs/Srm 64 With Hiperface®
No ratings yet
Motor Feedback System For Internally and Externally Ventilated Drives Sincos®Srs/Srm 64 With Hiperface®
12 pages
Huawei FastConnect FAT2802 Overview
No ratings yet
Huawei FastConnect FAT2802 Overview
4 pages
DR4088LN Manual
No ratings yet
DR4088LN Manual
15 pages
Amadeus Remote Ticketing User Guide
No ratings yet
Amadeus Remote Ticketing User Guide
56 pages

Understanding PBFT in Distributed Systems

Uploaded by

Understanding PBFT in Distributed Systems

Uploaded by

Distributed

Dr. Ahmed Salem

02 Mahmoud Ibrahim Kamal

04 Ahmed Mahmoud Ahmed

05 Abdelrahman Atef Mohamed

 Adapted fundamental rule

Raft's Advantages: Simplicit

change leaders easily. Flexible

server to be the leader.

See how RAFT works

Candidate Timeout Timeout = 110ms

Candidate Timeout Candidate

Candidate Timeout Candidate Timeout = 100ms

• This record is copied to all other servers. nd

Aspect Paxos Raft

Leader Election Implicit leader election Explicit leader election process

More complex log replication

• C requests operation o to be executed

• Bi broadcasts PREPARE(t, v, o) to all (including the primary)

• All servers broadcast COMMIT(t, v, o)

• C: consistency, by which a shared and replicated data item appears as a

 Why can't the three characteristics be achieved together?

Common questions

Compare the functionality of Paxos and Raft algorithms in terms of leader election and log replication.

Why do distributed consensus algorithms like Paxos and Raft require more than a single server to function effectively?

How does the Paxos algorithm address the potential issue of a primary crash without delivering an accept message to a backup server?

In what way does the separation of phases in PBFT enhance its capacity to handle Byzantine faults effectively?

How is reliable failure detection in Paxos practically addressed, and what are the inherent limitations of this approach?

Explain the role of leader and follower nodes in the Raft algorithm and the mechanism for handling leader failures.

What are the advantages of using the Raft consensus algorithm over Paxos in distributed systems?

How does the CAP theorem influence the design of distributed systems, and what challenges does it pose?

What are the key phases of Practical Byzantine Fault Tolerance (PBFT) and their purpose in achieving consensus?

What is the significance of having ‘3k + 1’ replicas in Practical Byzantine Fault Tolerance (PBFT), and how does it relate to system resilience?

You might also like