
Clock Synchronization in Centralized Systems

a process makes a kernel call to get the time


a process that asks for the time later will always get a higher (or
equal) time value
no ambiguity in the order of events and their time
Clock Synchronization in Distributed Systems

Distributed Systems
lack of a global time
consider using the local clock of each machine
e.g., the make program when its files are accessed on different machines
consider using the clock of a certain machine
the communication delay can result in the same problem

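[Figure: timelines of two machines whose local clocks disagree. Machine A (compile) shows clock ticks 113 to 117 and creates f.o; Machine B (edit) shows clock ticks 111 to 115 and creates, then updates, f.c. Real-time marks: 10:00, 10:02, 10:03.]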
Clock Synchronization

Physical Clocks
synchronize the physical clocks in the machines so that the time difference
is limited to a very small value

Logical Clocks
in distributed computation, associating an event with an absolute real time is
not essential; we only need to know an unambiguous order of events
Lamport's algorithm
Fidge's algorithm (vector clock)
Logical Clock Synchronization

Definition: "happened before" relationship a → b

if a and b are events in the same processor, and a occurred before b,
then a → b
if a is the event of sending a message from processor A and b is the
event of receiving the same message by processor B, then a → b
if a → b and b → c then a → c
Definition: concurrent relationship a || b
two distinct events a and b are "concurrent" if ¬(a → b) and ¬(b → a)
Examples for the two Relationships

P: p1 p2 p3 p4
Q: q1 q2 q3 q4 q5
R: r1 r2 r3

1. p1 → p2 → p3 → ...
2. q1 → q2 → q3 → ...
3. p1 → q2, q1 → p2, q4 → r3, q5 → p4
4. p3 || q3, p3 || q4, q3 || r2, q4 || r1, p2 || r2
Lamport's Logical Clock Synchronization

Goal: assign a timestamp C(x) to an event x


Requirement: for any events a and b, if a → b then C(a) < C(b)
Algorithm:
1. Each processor Pi increments Ci between any two successive local events
2. If event a is the sending of a message m by Pi, then m carries the
timestamp Tm = Ci(a). Upon receiving m, processor Pj advances Cj to a
value greater than max(Tm, Cj), e.g., max(Tm, Cj) + 1, so that the
receive event is stamped later than the send event

This algorithm gives a partial ordering of events, i.e., two events can
have the same logical time
It may cause ambiguity for some applications
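
A minimal Python sketch of the two rules above. The class name LamportClock and its method names are hypothetical, and the receive rule is assumed to be max(Tm, Cj) + 1, one common way to satisfy the requirement. The starting values (P at 0, Q at 2) are sample inputs.

```python
class LamportClock:
    """Scalar logical clock of one processor (hypothetical helper class)."""

    def __init__(self, start=0):
        self.time = start

    def tick(self):
        # Rule 1: advance the clock for every local event.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; its timestamp Tm travels with the message.
        return self.tick()

    def receive(self, tm):
        # Rule 2 (assumed form): move past both Tm and the local clock.
        self.time = max(tm, self.time) + 1
        return self.time


p, q = LamportClock(0), LamportClock(2)
p1 = p.send()        # P sends a message to Q
q1 = q.send()        # Q sends a message to P
q2 = q.receive(p1)   # Q receives P's message
p2 = p.receive(q1)   # P receives Q's message
print(p1, q1, q2, p2)  # 1 3 4 4
```

Note that q2 and p2 end up with the same value although they are different events on different processors, which is the ambiguity mentioned above.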
Ordering Events Totally

Goal: assign totally ordered timestamps to events


Requirements
if a → b then T(a) < T(b)
T(a) ≠ T(b) for any two different events a and b
Algorithm
if Ci(a) ≠ Cj(b), then T(a) = Ci(a) and T(b) = Cj(b)
if Ci(a) = Cj(b) and i < j, then T(a) < T(b)
use processor id in the time stamp to force the total order
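
One way to sketch the tie-breaking rule in Python is to pair each clock value with the processor id and let tuple comparison do the rest. The function name total_timestamp and the sample clock values and ids are illustrative assumptions.

```python
def total_timestamp(clock, pid):
    # Python compares tuples lexicographically: first the clock value,
    # then the processor id, which breaks ties between equal clock values.
    return (clock, pid)


# Sample events: name -> (Lamport clock value, processor id)
events = {
    "p3": total_timestamp(5, 0),
    "q3": total_timestamp(5, 1),
    "r1": total_timestamp(1, 2),
    "q2": total_timestamp(4, 1),
}

ordered = sorted(events, key=events.get)
print(ordered)  # ['r1', 'q2', 'p3', 'q3']
```

p3 and q3 share clock value 5, and the smaller processor id places p3 first.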

Limitation of Lamport's Logical Clock


if a → b then C(a) < C(b), but if C(a) < C(b) then a → b is not necessarily
true, i.e., concurrency information is lost
Examples for Lamport's Timestamp

p1 p2 p3 p4
P
q1 q2 q3 q4 q5
Q

r1 r2 r3
R

Initial condition: C(P) = 0, C(Q) = 2, C(R) = 0


Processor ids: pid(P) = 0, pid(Q) = 1, pid(R) = 2

Partially ordered Lamport's timestamps:
p1 = 1, p2 = 4, p3 = 5, p4 = 8
q1 = 3, q2 = 4, q3 = 5, q4 = 6, q5 = 7
r1 = 1, r2 = 2, r3 = 7

Totally ordered Lamport's timestamps (clock value followed by processor id):
p1 = 10, p2 = 40, p3 = 50, p4 = 80
q1 = 31, q2 = 41, q3 = 51, q4 = 61, q5 = 71
r1 = 12, r2 = 22, r3 = 72
Fidge's Partially Ordered Timestamp

Mechanism
for each timestamp, use a vector instead of a single value
for example: V(a) = (2, 3, 5)
notation: V1(a) = 2, V2(a) = 3, V3(a) = 5
Vi(a) corresponds to processor Pi
Properties
V(a) ≤ V(b) iff ∀i, Vi(a) ≤ Vi(b) -- e.g., (123) ≤ (133)
V(a) ≠ V(b) iff ∃i, Vi(a) ≠ Vi(b)
V(a) < V(b) iff V(a) ≤ V(b) and V(a) ≠ V(b)
-- this is the happened before relationship, e.g., (123) < (133)
a || b iff ¬(V(a) < V(b)) and ¬(V(b) < V(a))
-- concurrent relationship, e.g., (123) and (321)
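
The comparison rules can be sketched as three small Python functions. The function names are hypothetical, and vectors are represented as tuples of equal length.

```python
def leq(va, vb):
    # V(a) <= V(b): every component of a is at most the matching one of b.
    return all(x <= y for x, y in zip(va, vb))


def happened_before(va, vb):
    # V(a) < V(b): component-wise <= and not identical.
    return leq(va, vb) and va != vb


def concurrent(va, vb):
    # Neither timestamp precedes the other. Meant for two distinct events,
    # whose vector timestamps always differ.
    return not happened_before(va, vb) and not happened_before(vb, va)


print(happened_before((1, 2, 3), (1, 3, 3)))  # True
print(concurrent((1, 2, 3), (3, 2, 1)))       # True
```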
Fidge's Algorithm

Initialization
the vector timestamp for each processor is initialized to (0,0,...,0)
Local event
when an event occurs on processor Pi, Vi(Pi) ← Vi(Pi) + 1
e.g., at processor 3, (1,2,1,3) → (1,2,2,3)
Message passing
when Pi sends a message to Pj, the message has timestamp V(Pi)
when Pj receives the message, it sets V(Pj) to max (V(Pi), V(Pj))
e.g., P2 receives a message with timestamp (3,2,4) and P2's timestamp is
(3,4,3), then P2 adjusts its timestamp to (3,4,4)
function max is taken component by component: Vi = max (Vi(P1), Vi(P2), Vi(P3), ...)
Synchronization point
when a set of processes are involved in a synchronous event, all of the
processes maximize their local clocks
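
A rough Python sketch of the rules above, reusing the two examples from this slide. The class VectorClock and its method names are hypothetical. The merge follows the slide literally (component-wise max only); many formulations also count the receive as a local event and increment the receiver's own component afterwards.

```python
class VectorClock:
    """Vector timestamp of processor number `index` (0-based) out of n."""

    def __init__(self, n, index):
        self.index = index
        self.v = [0] * n  # initialized to (0, 0, ..., 0)

    def local_event(self):
        # Only the processor's own component is incremented.
        self.v[self.index] += 1
        return tuple(self.v)

    def stamp(self):
        # Timestamp attached to an outgoing message.
        return tuple(self.v)

    def merge(self, received):
        # On receipt: component-wise maximum of both vectors.
        self.v = [max(a, b) for a, b in zip(self.v, received)]
        return tuple(self.v)


p3 = VectorClock(4, 2)           # "processor 3" in 1-based numbering
p3.v = [1, 2, 1, 3]
print(p3.local_event())          # (1, 2, 2, 3)

p2 = VectorClock(3, 1)
p2.v = [3, 4, 3]
print(p2.merge((3, 2, 4)))       # (3, 4, 4)
```

A synchronization point among several processes can be expressed with the same merge: every participant merges the stamps of all the others.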
Example for Partially Ordered Vector Timestamp

[Figure: space-time diagram with vector timestamps. P: (110) (230) (330); Q: (010) (020) (030) (040) (350) (360); R: (021) (362)]
Applications of Partially Ordered Events

Debugging
the order of the events during debugging should be the same as that during
execution

roll back recovery


definition of global states

concurrency measure
partial ordered logical time can help construct a correct computation graph
Physical Clock Synchronization

in some systems (e.g., real time systems), the actual clock time is
important
need to synchronize clocks with real-world clocks
clocks never run at the same rate, they drift further and further apart
need to synchronize physical clocks with each other
How is Accurate Time Determined?

In the 17th century, the solar second (1/86400 of a solar day) was used. A
solar day is the time interval between two consecutive events
where the sun reaches its highest point in the sky.
an average is taken over many solar days
the solar day is slowly becoming longer

Since 1967, the second has been defined as the time it takes the cesium 133
atom to make 9,192,631,770 transitions.
How is Accurate Time Determined?

Currently, 50 labs have cesium 133 clocks. Each lab periodically tells
the BIH (Bureau International de l'Heure) how many times its clock has
ticked. The BIH averages them and produces TAI (International Atomic Time).
leap seconds are used to correct the discrepancy between TAI and solar
time
this standard time is UTC (Coordinated Universal Time)
UTC is the basis of all modern civil time-keeping
How Is UTC Provided to People?

In America, a short pulse is broadcast from Fort Collins, Colorado, at
the start of each UTC second.
In England, a similar service is provided.
Several earth satellites also offer similar services
In order to compensate for the signal propagation delay, the relative
position of the sender and receiver needs to be known.
Crystal Clocks
crystal oscillators are used in most current clocks
when a quartz crystal is kept under tension, it oscillates at a well-defined
frequency (the frequency depends on the cutting angle and tension)
a counter register is associated with the crystal; each oscillation of the
crystal decrements the counter by one; when the counter reaches zero, an
interrupt is generated and the counter is reloaded from the holding
register
When to Synchronize Physical Clocks?

Requirements:

A true physical clock should run at an approximately correct rate, i.e.,
there exists a constant κ such that |dCi(t)/dt − 1| < κ, for all i. For
typical crystal clocks, κ ≤ 10^-6.

Any two physical clocks must be synchronized, i.e., there exists a
sufficiently small constant ε such that |Ci(t) − Cj(t)| < ε, for all i, j.
The clocks connected as a graph with a diameter d need to be
resynchronized every ε/(2dκ) time.
Cristian's Algorithm

assume there is a time server with accurate time


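the client sends a request to get the accurate time; upon receiving the
response, it sets its clock to TUTC + (T1 − T0 − I)/2, where T0 and T1 are
the client's send and receive times and I is the server's handling time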
if the time would have to be set backwards, the clock is slowed down instead:
it advances 9 μsec instead of 10 μsec per interrupt (assuming each interrupt
normally advances the time by 10 μsec)
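[Figure: message exchange between client C and time server S. C sends request M at T0; S spends handling time I and replies with TUTC; C receives the reply M at T1.]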
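
The client-side correction can be sketched in a few lines of Python. The function name and the sample numbers (milliseconds) are illustrative assumptions. The sketch also assumes that the request and the reply take equally long to travel, which is what dividing by two implies.

```python
def cristian_estimate(t0, t1, t_utc, handling_time=0):
    """Estimate the current time on receipt of the server's reply.

    t0, t1        -- client clock readings when sending and receiving
    t_utc         -- time value reported by the server
    handling_time -- time I the server spent handling the request
    """
    one_way_delay = (t1 - t0 - handling_time) / 2
    return t_utc + one_way_delay


# Request sent at 1000 ms, reply received at 1020 ms, server needed 4 ms.
print(cristian_estimate(t0=1000, t1=1020, t_utc=5000, handling_time=4))  # 5008.0
```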
The Berkeley Algorithm

time server polls every machine periodically to get its time


computes the average time and asks other machines to advance or slow
down their clocks
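
A small Python sketch of the averaging step. The function name and the sample clock values are hypothetical, and including the server's own clock in the average is an assumption of this sketch.

```python
def berkeley_adjustments(server_time, client_times):
    """Return the average time and the offset each machine should apply.

    The first offset belongs to the server, the rest follow the order of
    client_times. A negative offset means the clock is ahead and should be
    slowed down rather than set backwards.
    """
    times = [server_time] + list(client_times)
    average = sum(times) / len(times)
    return average, [average - t for t in times]


# Clock readings in minutes since midnight: 3:00, 3:25 and 2:50.
avg, adj = berkeley_adjustments(180, [205, 170])
print(avg)  # 185.0
print(adj)  # [5.0, -20.0, 15.0]
```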

Distributed Averaging Algorithm

a distributed algorithm
every processor broadcasts its time at the beginning of a preset interval
collects time from other processors and computes the average
can also factor in the communication delay
Causal Ordering of Messages

Goal: If send(m1) → send(m2), then every recipient of both m1 and m2 should
receive m1 before m2.
Many distributed applications require such a message ordering guarantee

Basic Idea for Implementation:


buffer m2 and deliver it only after m1 is delivered
attach a time stamp to each message so that the receiver can decide whether
there is a message preceding it
Causal Ordering of Messages

causally ordered broadcast messages

by Birman-Schiper-Stephenson
processes communicate via broadcast messages
whether there are prior messages can be determined by a vector timestamp

causally ordered messages


by Schiper-Eggli-Sandoz
allow point-to-point communication
need to keep track of the last messages sent by all processors to each
specific node
Causally Ordered Broadcast Messages

Notation in the algorithm


P, Q, R: Processes
C(X): Vector timestamp of X, X can be P or Q or R
Cp(X): the element associated with P in C(X)
C(X) = (2, 3, 1)
CP(X) = 2, CQ(X) = 3, CR(X) = 1
Cp(Q) is the number of messages that Q received from P
Cp(P) indicates the number of messages broadcast by P
Causally Ordered Broadcast Messages -- Algorithm

1. P broadcasts a message m:
CP(P) = CP(P) + 1; attach C(P) to m;
2. Q receives a message m from P with timestamp C(m):
if a) CP(Q) = CP(m) − 1
(all messages from P prior to m have been received)
and
b) CX(Q) ≥ CX(m), ∀X ∈ set of all processors − {P}
(all messages from other processors that precede m have been received)
then Q can accept m and set CP(Q) = CP(m)

Examples: P sent a message m to Q


C(Q) = (1 3 3), C(m) = (2 3 2) → deliver the message
C(Q) = (1 3 3), C(m) = (2 2 4) → buffer the message (R message missing)
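
The delivery test can be sketched in Python and checked against the two examples above. The function names are hypothetical. Processes P, Q, R are mapped to vector positions 0, 1, 2, so the sender P is index 0.

```python
def can_deliver(c_q, c_m, sender):
    # (a) m is the next message Q expects from the sender.
    next_from_sender = c_q[sender] == c_m[sender] - 1
    # (b) Q has seen every message from other processes that m depends on.
    others_seen = all(
        c_q[x] >= c_m[x] for x in range(len(c_m)) if x != sender
    )
    return next_from_sender and others_seen


def deliver(c_q, c_m, sender):
    # After delivery, Q records one more message received from the sender.
    updated = list(c_q)
    updated[sender] = c_m[sender]
    return tuple(updated)


P = 0
print(can_deliver((1, 3, 3), (2, 3, 2), P))  # True: deliver
print(can_deliver((1, 3, 3), (2, 2, 4), P))  # False: buffer, an R message is missing
print(deliver((1, 3, 3), (2, 3, 2), P))      # (2, 3, 3)
```

A buffered message would be re-tested with can_deliver each time another message is delivered.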
Causally Ordered Broadcast Messages -- Example

[Figure: space-time diagram. P: (000) (100) (110); Q: (100) (110); R: (100) (110). At R, one message is marked "cannot accept" when it arrives.]
Causally Ordered Broadcast Messages -- Example

[Figure: space-time diagram. P: (000) (100) (110); Q: (010) (110); R: (010) (110)]

Concurrent messages may be delivered in different orders on different processors
Causally Ordered Broadcast Messages

Discussion
The timestamp essentially serves to count messages
When the counts do not match, do not deliver the message
Only suitable for broadcast messages; cannot be used for point-to-point
message passing
Does not guarantee the same order of delivery to all recipients for
concurrent messages
Causally Ordered Point-to-Point Messages

C(P): the timestamp of processor P


C(m): the timestamp of message m
V(P): the history vector of P, it consists of multiple vector timestamps
and is used to keep track of message passing activities
V(m): assume that P is the sender, V(m) = V(P) at the time m is sent, but
before V(P) is updated
VP(Q): the vector timestamp associated with P in vector V(Q)
it contains the history information about messages sent to P from all
processors, but only as far as Q knows of
VPR(Q): the element related to R in VP(Q)
the latest time value when R sent a message to P as far as Q knows
Example of History Vector

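[Figure: space-time diagram. P: (000) (100) (200), with history vector Q(100) after sending m1 and Q(200) after sending m2. m1 carries timestamp (100); m2 carries timestamp (200) and history Q(100). Q: (110) (220).]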

at (100), P sends a message m1 to Q and puts timestamp (100) in m1
before sending the message, P's history vector is empty
after sending the message, P's history vector becomes Q(100)
at (200), P sends m2 to Q and puts timestamp (200) in m2
the history vector (Q(100)) is placed in m2
If Q receives m2 before m1, Q knows from the history vector in m2 that it
should wait for m1
Causally Ordered Point-to-Point Messages

Algorithm
P sends message m to Q
1. increase C(P)
2. V(m) = V(P); C(m) = C(P); send m;
3. insert {Q, C(P)} to V(P);
"insert {X, C(Y)} to V(Z)" means VX(Z) = max(VX(Z), C(Y))
Q receives message m from P
increase C(Q) for the message receiving event;
if VQ(m) ≤ C(Q) then
  deliver m;
  for all X, insert {X, VX(m)} to V(Q);
  C(Q) = max (C(Q), C(m));
  check for buffered messages that can now be delivered;
else buffer m;
Example: C(Q) = (231), VQ(m) = (213) → buffer m
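
A hedged Python sketch of the receiver-side test and the "insert" operation. History vectors are modelled as dictionaries from process name to vector timestamp; this representation and the function names are assumptions of the sketch. A missing entry is treated as "nothing outstanding".

```python
def leq(va, vb):
    # Component-wise comparison of two vector timestamps.
    return all(x <= y for x, y in zip(va, vb))


def can_deliver(history_m, receiver, clock_receiver):
    # The message is deliverable if its history holds no entry for the
    # receiver, or if that entry is already covered by the receiver's clock.
    entry = history_m.get(receiver)
    return entry is None or leq(entry, clock_receiver)


def insert(history, target, stamp):
    # "insert {X, C(Y)} to V(Z)": keep the component-wise maximum per target.
    old = history.get(target)
    if old is None:
        history[target] = tuple(stamp)
    else:
        history[target] = tuple(max(a, b) for a, b in zip(old, stamp))


print(can_deliver({"Q": (2, 1, 3)}, "Q", (2, 3, 1)))  # False: buffer m
print(can_deliver({"Q": (1, 0, 0)}, "Q", (1, 1, 0)))  # True: deliver m

v_p = {}
insert(v_p, "Q", (1, 0, 0))
insert(v_p, "Q", (2, 0, 0))
print(v_p)  # {'Q': (2, 0, 0)}
```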
Example of Ordering of Point-to-Point Messages

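[Figure: space-time diagram. P: (000) (100) (200); P sends m1 (timestamp (100)) to Q and m2 (timestamp (200), history Q(100)) to R, so its history vector grows from Q(100) to Q(100), R(200). R: (201) (202); R receives m2 and sends m3 (timestamp (202), history Q(100)) to Q, after which its history vector holds Q(202). Q: (110) (222).]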

Q should receive m3 after m1


If m3 arrives at Q before m1, what will happen?
Example of Ordering of Point-to-Point Messages

[Figure: the same exchange among P, Q, and R, redrawn with m3 reaching Q before m1; Q's clock reads (010) when m3 arrives.]

If m3 arrives at Q before m1
Q has timestamp (010) when m3 arrives
the Q component of m3's history vector is Q(100)
since (100) ≤ (010) does not hold, m3 is buffered: one outstanding message
from P should be received before m3
Example of Ordering of Point-to-Point Messages

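[Figure: space-time diagram. P: (000) (100) (200); P sends m1 (timestamp (100)) to Q and m2 (timestamp (200), history Q(100)) to R. Q: (110) (120); Q receives m1 and sends m3 (timestamp (120)) to R, after which its history vector holds R(120). R: (201) (222).]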

R can receive m2 and m3 in any order, they are concurrent messages


If m3 arrives at R before m2, what happens?
Example of Ordering of Point-to-Point Messages
[Figure: the same exchange, redrawn with m3 reaching R before m2. R: (121) (222).]

if m3 arrives at R before m2
R has timestamp (001) when m3 arrives
the R component of m3's history vector is empty, i.e., R(000)
no problem, deliver m3 (m2 and m3 are concurrent)
Summary
L. Lamport, "Time, clocks, and the ordering of events in a distributed
system," Communications of the ACM, vol. 21, no. 7, pp. 558-565, July
1978.
C. Fidge, "Logical time in distributed computing systems," IEEE
Computer, vol. 24, pp. 28-33, Aug. 1991.
K. Birman, A. Schiper, and P. Stephenson, "Lightweight causal
and atomic group multicast," ACM Transactions on Computer
Systems, vol. 9, no. 3, Aug. 1991.
A. Schiper, J. Eggli, and A. Sandoz, "A new algorithm to
implement causal ordering," Lecture Notes in Computer Science, vol.
392, 1989.
A. D. Kshemkalyani and M. Singhal, Distributed Computing: Principles,
Algorithms, and Systems, Cambridge, Chapter 3.
