UNIT IV
RECOVERY & CONSENSUS
Checkpointing and rollback recovery: Introduction – Background and definitions – Issues in failure recovery –
Checkpoint-based recovery – Log-based rollback recovery – Coordinated checkpointing algorithm – Algorithm
for asynchronous checkpointing and recovery. Consensus and agreement algorithms: Problem definition –
Overview of results – Agreement in a failure –free system – Agreement in synchronous systems with failures.
CHECKPOINTING AND ROLLBACK RECOVERY
The saved state is called a checkpoint, and the procedure of restarting from a previously checkpointed state is
called rollback recovery.
Explain Checkpoint-based recovery in detail.
Checkpoint-based recovery
In check point based recovery, the state of each process and the communication channel is checkpointed
frequently so that when a failure occurs, the system can be restored to a globally consistent set of checkpoints.
The three types of rollback-recovery techniques are:
1. Uncoordinated checkpointing
2. Coordinated checkpointing
3. Communication-induced checkpointing
Uncoordinated Checkpointing
Here, each process has autonomy in deciding when to take checkpoints.
This eliminates the synchronization overhead as there is no need for coordination between processes
and it allows processes to take checkpoints when it is most convenient or efficient.
Advantages
Lower runtime overhead during normal execution.
Limitations
Domino effect during a recovery
Recovery from a failure is slow.
Each process maintains multiple checkpoints and periodically invoke a garbage collection algorithm
Not suitable for application with frequent output commits
Coordinated checkpointing
Coordinated checkpointing requires each process to maintain only one checkpoint on the stable storage,
reducing the storage overhead and eliminating the need for garbage collection.
There are two types of coordinated checkpoints:
Blocking Checkpointing
Non-blocking Checkpointing
Blocking Checkpointing
After a process takes a local checkpoint, to prevent orphan messages, it remains blocked until the entire
checkpointing activity is complete.
The coordinator takes a checkpoint and broadcasts a request message to all processes, asking them to
take a checkpoint.
When a process receives this message, it stops its execution, takes a tentative checkpoint.
It sends an acknowledgment message back to the coordinator.
After the coordinator receives acknowledgments from all processes, it broadcasts a commit message to
all processes.
After receiving the commit message, atomically makes the tentative checkpoint permanent and then
resumes its execution .
Non-blocking Checkpointing
The processes need not stop their execution while taking checkpoints.
Example:
Message m is sent by P0 after receiving a checkpoint request from the checkpoint coordinator. Assume m
reaches P1 before the checkpoint request. This situation results in an inconsistent checkpoint. Since checkpoint
c1,x shows the receipt of message m from P0, while checkpoint c0,x does not show m being sent from P0. To solve
inconsistent checkpoint, on-blocking checkpoint coordination protocol using this snapshot algorithm of Chandy
and Lamport in which markers play the role of the checkpoint request messages.
Communication-induced checkpointing
Communication-induced checkpointing avoids the domino effect, while allowing processes to take
some of their checkpoints independently.
In communication-induced checkpointing, processes take two types of checkpoints.
The checkpoints that a process takes independently are called local checkpoints.
The process is forced to take are called forced checkpoints.
Two types of communication-induced checkpointing:
1. Model based checkpointing
This prevents patterns of communications and checkpoints that may result in inconsistent states among
the existing checkpoints.
This model can be maintained by taking an additional checkpoint before every message-receiving event
that is not separated from its previous message-sending event by a checkpoint.
Another method is by taking a checkpoint immediately after every message sending event.
2. Index-based checkpointing
This assigns monotonically increasing indexes to checkpoints, such that the checkpoints having the same index
at different processes form a consistent state.
Explain Log-based rollback recovery in detail.
Log-based rollback recovery
A log-based rollback recovery makes use of deterministic and nondeterministic events in a computation.
Deterministic and non-deterministic events
A non-deterministic event can be the receipt of a message from another process or an event internal to
the process.
Message send event is not a non-deterministic event.
For example, in Figure, the execution of process P0 is a sequence of four deterministic intervals. The first one
starts with the creation of the process, while the remaining three start with the receipt of messages m0, m3, and
m7, respectively. Send event of message m2 is uniquely determined by the initial state of P0 and by the receipt
of message m0, and is therefore not a non-deterministic event
Pessimistic logging
Pessimistic logging protocols assume that a failure can occur after any nondeterministic event in the
computation.
Pessimistic protocols implement as synchronous logging.
The processes must take periodic checkpoints to minimize the amount of work that has to be repeated
during recovery.
When a process fails, the process is restarted from the most recent checkpoint and the logged
determinants are used to recreate the prefailure execution.
Consider the example in Figure. During failure-free operation the logs of processes P0, P1, and P2 contain
the determinants needed to replay messages m0, m4, m7, m1, m3, m6, and m2, m5, respectively. Suppose
processes P1 and P2 fail as shown, restart from checkpoints B and C, and roll forward using their
determinant logs to deliver again the same sequence of messages. This guarantees that P1 and P2 will repeat
exactly their pre-failure execution and re-send the same messages.
Optimistic logging
In these protocols, processes log determinants asynchronously to the stable storage.
Optimistic logging protocol assume that logging will be complete before a failure occurs.
Pessimistic protocols need only keep the most recent checkpoint of each process, whereas optimistic
protocols may need to keep multiple checkpoints for each process.
The overheads in optimistic logging are complicated recovery, garbage collection, and slower output
commit.
Consider the example shown in Figure. Suppose process P2 fails before the m5 is logged to the stable storage.
Process P1 then becomes an orphan process and must roll back to undo the effects of receiving the orphan
message m6. The rollback of P1 further forces P0 to roll back to undo the effects of receiving message m7.
Casual Logging
This combines the advantages of both pessimistic and optimistic logging.
Like optimistic logging, it does not require synchronous access to the stable storage.
Like pessimistic logging, it allows each process to commit output independently and never creates
orphans.
Consider the example in Figure . Messages m5 and m6 are likely to be lost on the failures of P1 and P2. process
P0 will be able to “guide” the recovery of P1 and P2 since it knows the order in which P1 should replay
messages. Similarly, P0 has the order in which P2 should replay message m2. The content of these messages is
obtained from the sender log of P0.
Explain Koo–Toueg coordinated check pointing algorithm in detail. (or)
Explain Coordinated checkpointing algorithm in detail.
Coordinated check pointing and recovery technique that takes a consistent set of check pointing and
avoids domino effect and livelock problems during the recovery.
This algorithm includes 2 parts:
1. check pointing algorithm
2. Recovery algorithm.
Checkpointing algorithm
The following are the assumptions made in checkpointing algorithm:
FIFO channel
end-to-end protocols
single process initiation
no process failures during the execution of the algorithm
The algorithm facilitates two kinds of checkpoints:
Permanent checkpoints
Tentative checkpoints
The algorithm is implemented in two phases:
Phase I:
Initiating process Pi takes a tentative checkpoint and requests all other processes to take tentative
checkpoints.
Every process cannot send messages after taking tentative checkpoint.
All processes will finally have the single same decision: do or discard.
A process says no to a request if it fails to take a tentative checkpoint.
If Pi learns that all the processes have successfully taken tentative checkpoints, Pi decides that all
tentative checkpoints should be made permanent; otherwise, Pi decides that all the tentative checkpoints
should be discarded.
Phase II:
Pi propagates its decision to all processes.
On receiving the message from Pi ,all process act accordingly.
Correctness of the algorithm
Either all or none of the processes take permanent checkpoint
No process sends message after taking permanent checkpoint.
Rollback recovery algorithm
This algorithm restore the system state to a consistent state after a failure.
Phase I:
An initiating process Pi sends a message to all other processes to check if they are willing to restart
from their previous checkpoints.
A process may reply no to a restart request due to any reason.
If Pi learns that all processes are willing to restart from their previous checkpoints, Pi decides that all
processes should roll back to their previous checkpoints.
Otherwise, Pi aborts the rollback attempt and it may attempt a recovery at a later time.
Phase II:
Pi propagates its decision to all processes.
On receiving the message from Pi ,all process act accordingly.
Correctness
All processes restart from an appropriate state because, if they decide to restart, they resume execution from a
consistent state.
Explain Juang-Venkatesan algorithm for asynchronous checkpointing and recovery in detail.
Assumptions
communication channels are reliable
delivery messages in FIFO order
infinite buffers
message
transmission delay is arbitrary but finite
Two type of log storage are maintained
Volatile log: short time to access but lost if processor crash.
Stable log: longer time to access but remained if crashed.
Asynchronous checkpointing:
After executing an event, a processor records a triplet (s, m, msg_sent) in its volatile storage.
s: state of the processor before the event
m: message
msgs_sent: set of messages sent by the processor during the event.
Local checkpoint consist of set of records, first are stored in volatile log, then moved to stable log.
Recovery algorithm
Notations:
𝑅𝐶𝑉𝐷𝑖←𝑗 (𝐶𝑘𝑃𝑡𝑖): number of messages received by 𝑝𝑖 from 𝑝𝑗 , from the beginning of computation to checkpoint
𝐶𝑘𝑃𝑡𝑖
𝑆𝐸𝑁𝑇𝑖→𝑗 (𝐶𝑘𝑃𝑡𝑖): number of messages sent by 𝑝𝑖 to 𝑝 , from the beginning of computation to checkpoint 𝐶𝑘𝑃𝑡i
Idea:
From the set of checkpoints, find a set of consistent checkpoints
This is done based on the number of messages sent and received.
Recovery may involve multiple iterations of roll backs by processors.
Whenever a processor rolls back, it is necessary for all other processors to find out if any message sent
by the rolled back processor has become an orphan message.
The orphan messages are identified by comparing the number of messages sent to and received from
neighboring processors.
When a processor restarts after a failure, it broadcasts a ROLLBACK message that it has failed.
Because of the broadcast of ROLLBACK messages, the recovery algorithm is initiated at all processors.