Distributed Systems
lack of a global time
consider using the local clock of each machine
e.g., the make program comparing timestamps of files modified on different machines
consider using the clock of one designated machine
the communication delay can result in the same problem
Physical Clocks
synchronize the physical clocks in the machines so that the time difference
is limited to a very small value
Logical Clocks
in distributed computation, associating an event with an absolute real time is
not essential; we only need to know an unambiguous order of events
Lamport's algorithm
Fidge's algorithm (vector clock)
Logical Clock Synchronization
[Figure: space-time diagram of three processes: P with events p1 to p4, Q with events q1 to q5, and R with events r1 to r3]
1. p1 → p2 → p3 → ...
2. q1 → q2 → q3 → ...
3. p1 → q2, q1 → p2, q4 → r3, q5 → p4
4. p3 || q3, p3 || q4, q3 || r2, q4 || r1, p2 || r2
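The relations listed above can be checked mechanically. The following is a minimal sketch, assuming that the pairs in item 3 are message edges directed from the first event to the second (the arrow direction is an assumption, since the diagram itself is not available). The names build_successors, happened_before, and concurrent are illustrative.

```python
# Events of each process in local execution order (from the lists above).
processes = {
    "P": ["p1", "p2", "p3", "p4"],
    "Q": ["q1", "q2", "q3", "q4", "q5"],
    "R": ["r1", "r2", "r3"],
}
# Message edges as (send event, receive event); the direction is an assumption.
messages = [("p1", "q2"), ("q1", "p2"), ("q4", "r3"), ("q5", "p4")]


def build_successors(processes, messages):
    """Build the direct-successor graph from process order and message edges."""
    succ = {e: set() for events in processes.values() for e in events}
    for events in processes.values():
        for earlier, later in zip(events, events[1:]):
            succ[earlier].add(later)
    for send, recv in messages:
        succ[send].add(recv)
    return succ


def happened_before(succ, a, b):
    """Return True if b is reachable from a, i.e. a -> b."""
    stack, seen = [a], set()
    while stack:
        for nxt in succ[stack.pop()]:
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def concurrent(succ, a, b):
    """Two events are concurrent if neither happened before the other."""
    return not happened_before(succ, a, b) and not happened_before(succ, b, a)


succ = build_successors(processes, messages)
print(happened_before(succ, "p1", "q3"))  # p1 -> q2 -> q3
print(concurrent(succ, "p3", "q3"))
```

Under the assumed edge directions, all five pairs in item 4 come out as concurrent.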
Lamport's Logical Clock Synchronization
This algorithm gives a partial ordering of events, i.e., two events can
have the same logical time
It may cause ambiguity for some applications
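A minimal sketch of a Lamport clock may help show where the ambiguity comes from. The update rules (increment on each event, take max plus one on receive) follow Lamport's 1978 paper rather than anything spelled out on these slides. The class name LamportClock and the (time, pid) tie-break in stamp() are illustrative assumptions.

```python
class LamportClock:
    """Scalar logical clock for one process (illustrative sketch)."""

    def __init__(self, pid):
        self.pid = pid
        self.time = 0

    def local_event(self):
        # Every event advances the local counter.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned value travels with the message.
        return self.local_event()

    def receive(self, msg_time):
        # Jump past both the local clock and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time

    def stamp(self):
        # Assumed tie-break: order equal times by process id.
        return (self.time, self.pid)


p, q = LamportClock(1), LamportClock(2)
a = p.local_event()
stamp_a = p.stamp()
b = q.local_event()   # same logical time as a, although a and b are unrelated
stamp_b = q.stamp()
t = p.send()
c = q.receive(t)      # later than the send
```

Here a and b carry the same scalar time. Comparing the (time, pid) pairs instead gives every pair of events a definite order.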
Ordering Events Totally
[Figure: space-time diagram of processes P (events p1 to p4), Q (events q1 to q5), and R (events r1 to r3)]
Mechanism
for each timestamp, use a vector instead of a single value
for example: V(a) = (2, 3, 5)
notation: V1(a) = 2, V2(a) = 3, V3(a) = 5
Vi(a) corresponds to processor Pi
Properties
V(a) ≤ V(b) iff ∀i, Vi(a) ≤ Vi(b) -- e.g., (123) ≤ (133)
V(a) = V(b) iff ∀i, Vi(a) = Vi(b)
V(a) < V(b) iff V(a) ≤ V(b) and V(a) ≠ V(b)
-- this is the happened-before relationship, e.g., (123) < (133)
a || b iff ¬(V(a) < V(b)) and ¬(V(b) < V(a))
-- concurrent relationship, e.g., (123) and (321)
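The comparison rules above translate directly into code. This is a small sketch with vectors as tuples; the function names vc_leq, vc_less, and vc_concurrent are illustrative.

```python
def vc_leq(va, vb):
    """V(a) <= V(b): every component of a is <= the matching component of b."""
    return all(x <= y for x, y in zip(va, vb))


def vc_less(va, vb):
    """V(a) < V(b): a happened before b."""
    return vc_leq(va, vb) and tuple(va) != tuple(vb)


def vc_concurrent(va, vb):
    """a || b: neither timestamp is less than the other."""
    return not vc_less(va, vb) and not vc_less(vb, va)


print(vc_less((1, 2, 3), (1, 3, 3)))        # the slide's happened-before example
print(vc_concurrent((1, 2, 3), (3, 2, 1)))  # the slide's concurrent example
```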
Fidge's Algorithm
Initialization
the vector timestamp for each processor is initialized to (0,0,...,0)
Local event
when an event occurs on processor Pi, Vi(Pi) ← Vi(Pi) + 1
e.g., at processor 3, (1,2,1,3) → (1,2,2,3)
Message passing
when Pi sends a message to Pj, the message has timestamp V(Pi)
when Pj receives the message, it sets V(Pj) to max (V(Pi), V(Pj)), taken component by component
e.g., P2 receives a message with timestamp (3,2,4) and P2's timestamp is
(3,4,3), then P2 adjusts its timestamp to (3,4,4)
function max: Vi(Pi) = max (Vi(P1), Vi(P2), Vi(P3), ...)
Synchronization point
when a set of processes are involved in a synchronous event, all of the
processes set their local clocks to the component-wise maximum
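The rules above can be sketched as a small class. This follows the slide literally: a receive only takes the component-wise maximum, and a send attaches the current vector without incrementing it. The class name VectorClock and its 0-based index are assumptions for illustration.

```python
class VectorClock:
    """Vector timestamp of one processor (illustrative sketch)."""

    def __init__(self, index, n):
        self.index = index   # 0-based position of this processor in the vector
        self.v = [0] * n     # initialized to (0, 0, ..., 0)

    def local_event(self):
        # Only the processor's own component is incremented.
        self.v[self.index] += 1

    def send(self):
        # The message carries a copy of the sender's current vector.
        return tuple(self.v)

    def receive(self, msg_v):
        # Component-wise maximum of the local and the received vector.
        self.v = [max(a, b) for a, b in zip(self.v, msg_v)]


# Local event at processor 3 (index 2): (1,2,1,3) becomes (1,2,2,3).
p3 = VectorClock(index=2, n=4)
p3.v = [1, 2, 1, 3]
p3.local_event()

# P2 (index 1) holds (3,4,3) and receives a message stamped (3,2,4).
p2 = VectorClock(index=1, n=3)
p2.v = [3, 4, 3]
p2.receive((3, 2, 4))
print(p3.v, p2.v)
```

A synchronization point can be handled the same way by calling receive with each participant's vector.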
Example for Partially Ordered Vector Timestamp
Debugging
the order of the events during debugging should be the same as that during
execution
concurrency measure
partially ordered logical time can help construct a correct computation graph
Physical Clock Synchronization
in some systems (e.g., real time systems), the actual clock time is
important
need to synchronize clocks with real-world clocks
clocks never run at exactly the same rate, so they drift further and further apart
need to synchronize physical clocks with each other
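How often clocks must be resynchronized follows from simple arithmetic. The sketch below assumes a bound on the drift rate (seconds gained or lost per second), a quantity these slides do not define. The function name max_resync_interval and the numbers are illustrative.

```python
def max_resync_interval(max_skew_s, drift_rate):
    """Longest time between synchronizations that keeps two clocks
    within max_skew_s of each other.

    Two clocks may drift in opposite directions, so their difference
    can grow by up to 2 * drift_rate seconds per second.
    """
    return max_skew_s / (2 * drift_rate)


# Assumed values: 1 ms tolerated skew, drift rate of 1e-5.
interval = max_resync_interval(0.001, 1e-5)
print(interval)  # in seconds
```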
How is Accurate Time Determined?
In the 17th century, the solar second (1/86400 of a solar day) was used. A
solar day is the time interval between two consecutive events
where the sun reaches its highest point in the sky.
an average is taken over many solar days
the solar day is gradually becoming longer
Currently, 50 labs have cesium 133 clocks. Each lab periodically tells
the BIH how many times its clock has ticked. The BIH averages them and
produces TAI (International Atomic Time).
leap seconds are used to correct the discrepancy between TAI and solar
time
this standard time is UTC (Coordinated Universal Time)
UTC is the basis of all modern civil time-keeping
How Is UTC Provided to People?
Requirements:
[Figure: timing diagram of a time request; surviving labels: T0, M, TUTC, I, M, T1]
The Berkeley Algorithm
a distributed algorithm
every processor broadcasts its time at the beginning of a preset interval
each processor collects the times from the other processors and computes the average
the communication delay can also be factored in
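The averaging step can be sketched as follows. This is a minimal sketch, assuming the processor already holds the times broadcast by the others and, optionally, an estimate of each message's one-way delay. The function name averaged_adjustment and its parameters are illustrative.

```python
def averaged_adjustment(own_time, received_times, delays=None):
    """Return the amount to add to the local clock to reach the group average.

    received_times: times broadcast by the other processors.
    delays: optional estimated one-way delays, one per received time.
    """
    if delays is None:
        delays = [0.0] * len(received_times)
    # A reported time is already 'delay' old when it arrives, so shift it forward.
    corrected = [t + d for t, d in zip(received_times, delays)]
    all_times = [own_time] + corrected
    average = sum(all_times) / len(all_times)
    return average - own_time


# Times in minutes: this processor reads 180, the others report 205 and 170.
print(averaged_adjustment(180, [205, 170]))  # positive: move the clock forward
```

A negative result means the local clock is ahead of the average. A real implementation would typically slow the clock down gradually rather than set it backward.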
Causal Ordering of Messages
Goal: If send(m1) → send(m2), then every recipient of both m1 and m2 should
receive m1 before m2.
Many distributed applications require such a message ordering guarantee
1. P broadcasts a message m:
   CP(P) = CP(P) + 1; attach C(P) to m;
2. Q receives a message m from P with timestamp C(m):
   if a) CP(Q) = CP(m) - 1
         (all messages from P prior to m have been received)
      and
      b) Cx(Q) ≥ Cx(m), ∀x ∈ set of all processors - {P}
         (all messages from other processors that precede m have been received)
   then Q can accept m and set CP(Q) = CP(m)
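The receiving side of the two conditions above can be sketched as follows. Processors are numbered from 0, and messages that cannot yet be accepted are buffered and retried. The buffering loop, the class name CausalBroadcastReceiver, and its fields are illustrative assumptions.

```python
class CausalBroadcastReceiver:
    """Receiving side of causally ordered broadcast (illustrative sketch)."""

    def __init__(self, n):
        self.clock = [0] * n   # clock[x]: broadcasts from x accepted so far
        self.buffer = []       # (sender, timestamp) pairs that must wait
        self.delivered = []    # accepted messages, in delivery order

    def can_accept(self, sender, ts):
        # Condition a): m is the next broadcast expected from its sender.
        if self.clock[sender] != ts[sender] - 1:
            return False
        # Condition b): everything the sender had seen has arrived here too.
        return all(self.clock[x] >= ts[x]
                   for x in range(len(ts)) if x != sender)

    def receive(self, sender, ts):
        self.buffer.append((sender, tuple(ts)))
        progress = True
        while progress:  # retry buffered messages after every acceptance
            progress = False
            for item in list(self.buffer):
                s, t = item
                if self.can_accept(s, t):
                    self.buffer.remove(item)
                    self.clock[s] = t[s]
                    self.delivered.append(item)
                    progress = True


# R (one of 3 processors) gets Q's broadcast (1,1,0) before P's (1,0,0).
r = CausalBroadcastReceiver(3)
r.receive(1, (1, 1, 0))   # buffered: R has not seen P's first broadcast
r.receive(0, (1, 0, 0))   # accepted, which then unblocks Q's message
print(r.delivered)
```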
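Causally Ordered Broadcast Messages -- Example
[Figure: two space-time diagrams of broadcasts arriving at Q and R. Surviving labels: timestamps (100) and (110) with the note "cannot accept" in the first; timestamps (010) and (110) in the second]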
Discussion
The timestamp is essentially a count of messages
When the counts do not match, the message is not delivered
Suitable only for broadcast messages; cannot be used for point-to-point
message passing
Does not guarantee the same order of delivery to all recipients for
concurrent messages
Causally Ordered Point-to-Point Messages
[Figure: space-time diagram with messages m1 (100) and m2 (200); surviving labels: history entries Q(100), timestamps (110) and (220)]
Algorithm
P sends message m to Q
1. increase C(P)
2. V(m) = V(P); C(m) = C(P); send m;
3. insert {Q, C(P)} to V(P);
insert {X, C(Y)} to V(Z) means VX(Z) = max(VX(Z), C(Y))
Q receives message m from P
increase C(Q) for the message receiving event;
if VQ(m) ≤ C(Q) then
   deliver m;
   for all X, insert {X, VX(m)} to V(Q);
   C(Q) = max (C(Q), C(m));
   check for buffered messages that can now be delivered;
else buffer m;
Example
C(Q) = (231), VQ(m) = (213): buffer m
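The send and receive steps above can be sketched as follows. The sketch assumes the delivery test is a component-wise comparison of VQ(m) against C(Q) and treats a missing history entry as "no constraint". The class name Process, the dict-based message layout, and the retry loop over the buffer are illustrative.

```python
class Process:
    """One process in the point-to-point causal ordering scheme (sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n   # C(P): vector clock
        self.history = {}      # V(P): destination -> vector timestamp
        self.buffer = []
        self.delivered = []

    @staticmethod
    def _insert(history, dest, ts):
        # insert {X, C(Y)} to V(Z): component-wise max with the old entry
        old = history.get(dest, [0] * len(ts))
        history[dest] = [max(a, b) for a, b in zip(old, ts)]

    def send(self, dest, payload):
        self.clock[self.pid] += 1                        # step 1
        msg = {"payload": payload,                       # step 2
               "clock": list(self.clock),
               "history": {d: list(t) for d, t in self.history.items()}}
        self._insert(self.history, dest, self.clock)     # step 3
        return msg

    def _deliverable(self, msg):
        constraint = msg["history"].get(self.pid)        # VQ(m)
        return constraint is None or all(
            c <= mine for c, mine in zip(constraint, self.clock))

    def _deliver(self, msg):
        self.delivered.append(msg["payload"])
        for dest, ts in msg["history"].items():
            self._insert(self.history, dest, ts)
        self.clock = [max(a, b) for a, b in zip(self.clock, msg["clock"])]

    def receive(self, msg):
        self.clock[self.pid] += 1     # the receive event itself
        self.buffer.append(msg)
        progress = True
        while progress:               # retry buffered messages after each delivery
            progress = False
            for m in list(self.buffer):
                if self._deliverable(m):
                    self.buffer.remove(m)
                    self._deliver(m)
                    progress = True


P, Q, R = Process(0, 3), Process(1, 3), Process(2, 3)
m1 = P.send(1, "m1")   # P -> Q
m2 = P.send(2, "m2")   # P -> R, carries history entry Q(100)
R.receive(m2)
m3 = R.send(1, "m3")   # R -> Q, still carries Q(100)
Q.receive(m3)          # buffered: m1 is still outstanding
Q.receive(m1)          # delivers m1, then the buffered m3
print(Q.delivered, Q.clock)
```

In this run Q delivers m1 before m3 even though m3 arrived first.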
Example of Ordering of Point-to-Point Messages
[Figure: space-time diagram of processes P, Q, and R exchanging messages m1, m2, and m3. Surviving labels: P at (000), (100), (200) with history entries Q(100), R(200); Q at (010), (110), (222); R at (201), (202) with history entries Q(100), Q(202)]
If m3 arrives at Q before m1
Q has timestamp (010) when m3 arrives
the Q component of m3's history vector is Q(100)
this indicates that one outstanding message from P must be received before m3
Example of Ordering of Point-to-Point Messages
[Figure: space-time diagram for the second scenario. Surviving labels: timestamps (201), (222), (121), (222) and history entries Q(100)]
If m3 arrives at R before m2
R has timestamp (001) when m3 arrives
the Q component of m3's history vector is Q(000)
no problem, deliver m3 (m2 and m3 are concurrent)
Summary
References
L. Lamport, "Time, clocks, and the ordering of events in a distributed
system," Communications of the ACM, vol. 21, no. 7, pp. 558-564, July
1978.
C. Fidge, "Logical time in distributed computing systems," IEEE
Computer, vol. 24, pp. 28-33, Aug. 1991.
K. Birman, A. Schiper, and P. Stephenson, "Lightweight causal
and atomic group multicast," ACM Transactions on Computer
Systems, vol. 9, no. 3, Aug. 1991.
A. Schiper, J. Eggli, and A. Sandoz, "A new algorithm to
implement causal ordering," Lecture Notes in Computer Science, vol.
392, 1989.
A. D. Kshemkalyani and M. Singhal, Distributed Computing: Principles,
Algorithms, and Systems, Cambridge, Chapter 3.