1
Logical clocks
Copyright 2004-2007 Bo I. Sandn, Colorado Technical University
CJ 3.4.2; TS 5.2; CDK 11.2; WS 14.3
It is a challenge to make all nodes in a distributed system reach consensus regarding the current time. It is particularly difficult to make all nodes agree on the same physical time. Usually, you make each node contact some Universal Coordinated Time (UTC) source and compensate for the communication delay. The communication delay can be estimated as 0.5 (Tr Ts tp) where Tr: Ts: tp: The time the response is received The time when the request is sent The (known) processing time at the server
Lamport's logical clock (1978, 1990)
It's often sufficient that nodes agree on the order of events, disregarding the actual time of each event and the amount of time between them. A logical clock indicates such an order. The basic one is called Lamports logical clock
is the happens-before relation for events a b means that a happens before b It describes causality. It does not accurately reflect distances in time. It is transitive (i. e., a b and b c implies a c) Happens-before is a partial ordering of the events in a distributed system. Two events, a and b, are said to be disjoint (and can be concurrent) if neither a b nor b a is true This is also a relation. That a and b are disjoint can be shown as a || b
An aside: Relations
XRY means that X is related to Y under a particular relation R. Examples: X < Y and X Y and X = Y A relation is defined as a set of ordered pairs So, the relation < is the set of all pairs (X, Y) where X < Y, etc.
Relations can have the following properties (among others):
Relation R is reflexive if XRX for all X Examples: XX, X=X; not true for < R is antisymmetric if, for all X and Y, (XRY and YRX) implies that X and Y are identical Example: XY and YX X=Y R is transitive if for all X, Y, Z, (XRY and YRZ) implies XRZ Examples: X<Y and Y<Z X<Z Same for and =
A partial order is a relation that is reflexive, antisymmetric and transitive Examples: and How about || ?
Implementation of Lamport's logical clock
Each process (node) Pi maintains a logical clock in the form of a counter Ci The logical clock of a process is incremented by some (arbitrary) positive number for each new event Processes interact by means of send and receive event pairs Rules: 1. If a b within the same process, then C(a) < C(b)
2.
If a is the event where Pi sends a message and b is the event where Pj receives the same message, then Ci (a) < Cj (b)
Rule 2 can be enforced if: the sending process increments its logical clock and timestamps its message with it, and the receiver sets its logical clock to the larger of: its own current clock value and the messages timestamp, incremented
Total order of events
A problem with Lamports clock so far is that disjoint events (in different processes) can have the same clock value by coincidence.
A total order can be achieved by stipulating:
3.
For all events a and b where a b, C(a) C(b)
Disjoint events with identical logical clocks can be distinguished by concatenating their clocks with a distinct process id number. Suppose an event in each process has the un-concatenated clock value 100. If the processes have the ids 1 and 2, the concatenated clock values may be 100.1 and 100.2 That is, disjoint events are arbitrarily ordered
Vector logical clocks
Even with total ordering, there is still the problem that C(a) < C(b) does not imply a b That is, a and b may be disjoint even if one's clock value is strictly less than the other's
This can be solved with a vector logical clock:
The vector logical clock for an event a at process i is a vector containing one element, TSk, for each cooperating process, k.
The value of TSi is the logical time, Ci(a), of event a at process i The other values in the vector are process is best estimates of the time at each of the other process.
TSk is the time when process k was last heard from, directly or indirectly.
This allows us to redefine < so that for disjoint event pairs, a and b, where a b, neither a < b nor b < a holds
Example: Event a happens at node i. Its clock vector is:
VCi (a) = TS1 TS2 .... Ci(a) .... TSn
what node i thinks the time is at node 1 what node i thinks the time is at node 2
Lamport time for event a at node i (as earlier)
what node i thinks the time is at node n
Each value TSj is less or equal to the current Lamport time at node j
When a process sends a message to another process, n, it sends along its entire vector with the message Upon receipt, n updates its own entire vector: It increments its own time. For each node, i, it takes the maximum of: its own estimate of the time at i, and TSi from the messages clock vector
Now, define the operator < for vectors such that V1 < V2 means that: V1[k] V2[k] for all elements k, and V1[m] < V2[m] for at least one element m With this definition, VC (a) < VC (b) implies a b, that is, if a has a lower clock value than b, then a happened before b We can prove this by showing that if a and b are disjoint, then VC (a) < VC (b) and VC (b) < VC (a) are both false
Proof: Assume that events a and b are disjoint and occur at process i and j respectively. Their vector clocks are: Process i a: .... Ci(a) .... TSj .... b: Process j .... TSi .... Cj(b) ....
Ci (a) is more current so Ci (a) > TSi at node j Cj (b) is more current so Cj (b) > TSj at node i
Partial order of events
A B C D E F G H
P1
I J K L M N O
P2
P Q R S T U
P3
By transitivity, A --> K, A --> L, R --> N, R--> H, etc. (B, J), (D, L), (M, S), etc. are pairwise disjoint. Let B (in process P1) have the vector VB = [C1(B), TS12, TS13] Let J (in process P2) have the vector VJ = [TS21, C2(J), TS23]. We want to show by contradiction that VB < VJ and VJ < VB are both false. Proof: Assume VB < VJ. Then, by definition of "<", each element of VB must be less than or equal to the corresponding element of VJ, and at least one element of VB must be less than the corresponding element of VJ But C1(B) is greater than TS21 since TS21 cannot be greater than the time of the last message from P1, and P1's clock has advanced since. Similarly, VJ cannot be less than VB.
10
Exercise 1
The four processes P1, P2, P3 and P4 in a distributed system communicate as shown in this diagram:
P1
P2
P3
P4
The bullets indicate local events.
a) Lamports clock is used. Initially each process has the clock value 0.
What is each processs clock value at the end of the scenario shown in the diagram?
b) Vector clocks are used. Initially, each processs clock is (0, 0, 0, 0).
What is each processs clock vector at each at the end of the scenario?
11
Exercise 2
Same description as Exercise 1 (but only 3 processes)
P1
P2
P3
Exercise 3
Same description as Exercise 1.
P1
P2
P3
P4
Clocks 2007-06-28