(Of course, when we reorder events, we will have to make some adjustments to the intervening states.) We will show that the modified execution contains fewer than k sessions, which contradicts the correctness of A. The reordering must preserve certain dependencies in order to produce a valid execution of A. Formally, we shall prove the following claim.

Claim 3 Let i_0 = i_2 = i_4 = ... = i, and i_1 = i_3 = i_5 = ... = j, where dist(i, j) = diam(G). For all r = 1, ..., k - 1 there exists a "reordering" α'_r = β_r γ_r of α_r (i.e., the actions are reordered, with the initial and final states unchanged), such that the following properties hold.

1. The reordering preserves the following dependency partial order.
   (a) The order of all actions of the same node.
   (b) The order of each send_{i,j}(m) event and its corresponding receive_{i,j}(m) event.
2. β_r contains no event of i_{r-1}.
3. γ_r contains no event of i_r.

Let us first show how to complete the proof of Theorem 2 using Claim 3. Since the reordering preserves the dependencies, the reordered sequence α'_1 α'_2 ... α'_{k-1}, followed by the remainder of the original execution, is a fair execution of A. But we can show that this execution contains at most k - 1 sessions, as follows. No session can be entirely contained within β_1, since β_1 contains no event of i_0. Likewise, no session can be entirely contained within γ_{r-1} β_r, since this sequence contains no event of process i_{r-1}. This implies that each session must contain events on both sides of some β_r γ_r boundary. Since there are only k - 1 such boundaries, the theorem follows.

It remains to prove Claim 3. To do this, we produce the required reorderings. The construction is done independently for each r.
So fix some r, 1 ≤ r ≤ k - 1. We consider two cases.

Case 1: α_r contains no event of i_{r-1}. In this case we define β_r = α_r and take γ_r to be empty.

Case 2: α_r contains an event of i_{r-1}. Let π be the first event of i_{r-1} in α_r. Now we claim that there is no step of i_r in α_r that depends on π (according to the dependency relation defined in Claim 3). This is true since the time for a message to propagate from i_{r-1} to i_r in a slow execution is at least diam(G) · d, whereas we have constructed α_r so that the time between its first and last events is strictly less than diam(G) · d. Thus we can reorder α_r so that all the steps of i_r precede π: we keep the initial and final states the same, and mimic the local state changes as in α_r. This still yields allowable transitions of the algorithm, by a dependency theorem similar to the one we used for the synchronizer construction (but simpler). Let β_r be the part of the reordered sequence before π, and γ_r the part of the sequence starting with π. It is straightforward to verify that α'_r = β_r γ_r has the properties required to prove Claim 3.
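To make the dependency partial order of Claim 3 concrete, here is a small illustrative check (not part of the original notes): given a sequence of events, it verifies that a proposed reordering preserves both the per-node order and every send-before-receive pair. The event representation and field names are assumptions made for this sketch.

```python
# Illustrative check (not from the notes): does a reordering of an event
# sequence preserve the dependency partial order of Claim 3?  Events are
# modeled as tuples (node, kind, msg_id) with kind in {"send", "receive",
# "local"} and msg_id unique per message; this representation is assumed.
from collections import Counter

def preserves_dependencies(original, reordered):
    if Counter(original) != Counter(reordered):
        return False                                    # not a permutation

    # (a) the order of all actions of the same node is unchanged
    for n in {node for node, _, _ in original}:
        if [e for e in original if e[0] == n] != [e for e in reordered if e[0] == n]:
            return False

    # (b) each send(m) still precedes its corresponding receive(m)
    recv_pos = {msg: i for i, (_, kind, msg) in enumerate(reordered) if kind == "receive"}
    for i, (_, kind, msg) in enumerate(reordered):
        if kind == "send" and recv_pos.get(msg, len(reordered)) < i:
            return False
    return True


if __name__ == "__main__":
    # pi is the first event of i_{r-1} (here node "j"); the reordering moves
    # all of node "i"'s steps ahead of pi, as in Case 2 of the proof.
    alpha_r = [("j", "local", None), ("i", "send", "m"), ("j", "receive", "m")]
    reordered = [("i", "send", "m"), ("j", "local", None), ("j", "receive", "m")]
    print(preserves_dependencies(alpha_r, reordered))   # True
```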
Note that the reordered sequence is not a timed execution: if we were trying to preserve the times somehow during the reordering, we would be stretching and shrinking some time intervals, and we would have to be careful to observe the upper bound requirements. We could avoid these problems by making all the time intervals very small, but we don't need to worry about this at all, since all we need to get the contradiction is an untimed fair execution.

Theorem 2 almost looks like a contradiction to the synchronizer result: recall that the synchronizer gives a simulation with constant time overhead. This is not a contradiction, however, because the synchronizer simulation doesn't preserve the total external behavior: it only preserves the behavior at each node, while it may reorder the events at different nodes. We remark that for most purposes, we might not care about global reordering, but sometimes, when there is some "out-of-band" communication, the order of events at different nodes might be important.

6.852J/18.437J Distributed Algorithms
Lecturer: Nancy Lynch
November 24, 1992

Lecture 21

21.1 Time, Clocks, etc.

The idea that it is OK to reorder events at different nodes motivates Lamport's important notion of "logical time", defined in his famous paper "Time, Clocks, and the Ordering of Events in a Distributed System".

21.1.1 Logical Time

The model assumed is the one we have been using.
In this model, there is no built-in notion of real time (in particular, there are no clocks), but it is possible to impose a logical notion of time, namely Lamport's logical time. Specifically, every event of the system (send or receive of a message, external interface event, internal event of a node) gets assigned a logical time, an element of some fixed totally ordered set. Typically, this set is either the nonnegative integers or the nonnegative reals (perhaps with tie-breakers). These logical times don't have to have any particular relationship to any notion of real time. However, they do need to satisfy the following properties.

1. No two events get assigned the same logical time.
2. Events at each node obtain logical times that are strictly increasing, in the same order as the events occur.
3. For any message, its send event gets a strictly smaller logical time than its receive event.
4. For any event, there are only finitely many other events that get assigned logical times that are smaller.

The important result about such an assignment is as follows. Suppose that we are given an execution α of a network system of I/O automata and a logical time assignment ltime that satisfies the conditions above.
Then there is another execution α' of the same system, which looks the same to each node (i.e., α|i = α'|i for all i), and in which the times from the assignment ltime occur in increasing order in real time. In other words, we can have a reordered execution, such that the logical time order of the original execution is the same as the real time order in the reordered execution. So we can say that programming in terms of logical time looks to all the users as if they are programming in terms of real time.

We can view this as another way to reduce the uncertainty of an asynchronous network system. Roughly speaking, we can write asynchronous programs where we assume that each node has "access to" real time. (Real-time modeling doesn't fit the IOA model, which is why this is rough. It will be done more carefully later in the course when we do timing-based models.) In the basic model, the nodes don't have this access. Instead, we can have them implement logical time somehow, and let them use the logical time in place of the real time. The reordering result says that the resulting executions will look the same to all nodes (separately).
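As a small illustration of the reordering result (not from the notes), the reordered execution can be obtained simply by sorting events by their logical times: since each node's events carry increasing logical times (Property 2), every per-node subsequence, and hence every local view, is unchanged. The event representation below is an assumption made for the sketch.

```python
# Illustrative sketch (assumed representation): reorder an execution so that
# events appear in increasing logical-time order.  Property 2 guarantees each
# node's own subsequence of events is unchanged by this reordering.

def reorder_by_logical_time(events):
    """events: list of (ltime, node, action); ltime values are totally ordered."""
    return sorted(events, key=lambda e: e[0])


if __name__ == "__main__":
    execution = [
        ((2, "p1"), "p1", "send(m)"),
        ((1, "p2"), "p2", "internal"),
        ((3, "p2"), "p2", "receive(m)"),
    ]
    for ltime, node, action in reorder_by_logical_time(execution):
        print(ltime, node, action)
    # Each node's events occur in the same relative order as before,
    # but the whole sequence is now in logical-time order.
```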
One generalization is useful: suppose we want to augment the model to allow broadcast actions, which serve to send the same message to all the other nodes. We can implement this, of course, with a sequence of individual send actions. There is nothing too interesting here thus far. But the important thing is that we can allow a broadcast to have a single logical time, and require that all the associated receives have larger logical times. The reordering result still holds with this new action.

21.1.2 Algorithms

Lamport's implementation. Lamport presents a simple algorithm for producing such logical times.
It involves each node maintaining a local variable clock that it uses as a local clock. The local clock gets incremented at each event (input, output, or internal) that occurs at that node. The logical time of the event is defined to be the value of the variable clock immediately after the event, paired with the process index as a tie-breaker. This algorithm is easily seen to ensure Properties 1 and 2 above. In order to ensure Property 3, the following rule is observed. Whenever a node does a send (or broadcast) event, it first increments its clock variable to get the logical time for the send event, and it attaches that clock value to the message when it sends it out. When the receiver of the message performs a receive event, it makes sure that it increments its clock variable to be not only larger than its previous value, but also strictly larger than the clock value in the message. As before, it is this new clock value that gets assigned to the receive event. To ensure Property 4, it suffices to have each increase of a clock variable increase the value by some minimum amount (e.g., 1).
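The following is a minimal sketch of this rule (not Lynch's or Lamport's actual automaton code); the class and method names are invented for illustration. Each event returns the logical time assigned to it, namely the (clock, process index) pair.

```python
# Minimal sketch of Lamport's logical-clock rule.  Names and structure are
# illustrative assumptions, not the notes' formal I/O automaton code.

class LamportClock:
    def __init__(self, proc_id):
        self.proc_id = proc_id
        self.clock = 0

    def local_event(self):
        # Any input/output/internal event advances the clock by at least 1
        # (which is what gives Property 4).
        self.clock += 1
        return (self.clock, self.proc_id)       # logical time, with tie-breaker

    def send_event(self):
        # The clock value is attached to the outgoing message.
        ltime = self.local_event()
        return ltime, ltime[0]                  # (logical time, message timestamp)

    def receive_event(self, msg_timestamp):
        # Jump strictly above both the local clock and the sender's timestamp,
        # ensuring Property 3 (send gets a smaller logical time than receive).
        self.clock = max(self.clock, msg_timestamp) + 1
        return (self.clock, self.proc_id)


if __name__ == "__main__":
    p, q = LamportClock(1), LamportClock(2)
    send_time, stamp = p.send_event()
    recv_time = q.receive_event(stamp)
    print(send_time, recv_time)                 # (1, 1) then (2, 2)
```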
Welch's Implementation. In Welch's algorithm, as in Lamport's, the logical time at each node is maintained by a state variable named clock, which can take on nonnegative integer or real values. But this time, we assume that there is an input action tick(t) that is responsible for setting this variable: the node sets its clock variable to t when tick(t) is received. We can imagine this input as arriving from a separate "clock" IOA. (The node implements both this clock and the process IOA.)

Now each non-tick action gets its logical time equal to the value of clock when it is performed, with tie-breakers: first, by the process index, and second, by execution order at the same process. Note that the clock value does not change during the execution of this action. Some additional careful handling is needed to ensure Property 3. This time, we have an extra FIFO receive-buffer at each node, which holds messages whose clock values are at least the clock value at the local node. Assuming that the local clock value keeps increasing without bound, eventually it will be big enough so that it dominates the incoming message's clock value. The rule is to wait until the first message in the receive-buffer has an associated clock less than the local clock. Then this message can be processed as usual: we define the logical time of the receive event in terms of the local clock variable when this simulated receive occurs.

The correctness of this algorithm is guaranteed by the following facts. Property 1 is ensured by the tie-breakers; Property 2 relies on the monotonicity of the local clocks plus the tie-breakers; Property 3 is guaranteed by the buffer discipline; and Property 4 requires a special property of the clock automaton, namely that it must grow unboundedly. This could be achieved, e.g., by having a positive lower bound on the tick increment, and having the clock automaton execute fairly.

Note that this algorithm leads to bad performance when the local clocks get out of synch. For this to work in practice, we need some synchronization among the clocks, which is not really possible in a purely asynchronous system.
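To make the buffer discipline concrete, here is a rough sketch of a Welch-style node; all identifiers are invented, since the notes describe the rule only in prose. Ticks set the clock, incoming messages wait in a FIFO buffer until the local clock strictly exceeds their attached clock value, and each delivered event is timestamped with the current clock plus tie-breakers.

```python
# Rough sketch of Welch's logical-clock discipline.  All identifiers are
# illustrative assumptions; the notes give the rule only in prose.
from collections import deque

class WelchClock:
    def __init__(self, proc_id):
        self.proc_id = proc_id
        self.clock = 0
        self.buffer = deque()      # FIFO receive-buffer of (msg, attached clock)
        self.event_count = 0       # second tie-breaker: order at this process

    def tick(self, t):
        # Input action tick(t) from the separate clock automaton sets the clock.
        self.clock = t

    def ltime(self):
        # Logical time = clock value, with (process index, local order) tie-breakers.
        self.event_count += 1
        return (self.clock, self.proc_id, self.event_count)

    def send(self, msg):
        # The current clock value is attached to the outgoing message.
        return (msg, self.clock), self.ltime()

    def enqueue(self, msg, attached_clock):
        self.buffer.append((msg, attached_clock))

    def try_receive(self):
        # Deliver the first buffered message only once the local clock has
        # passed its attached clock value (this is what ensures Property 3).
        if self.buffer and self.buffer[0][1] < self.clock:
            msg, _ = self.buffer.popleft()
            return msg, self.ltime()
        return None


if __name__ == "__main__":
    p, q = WelchClock(1), WelchClock(2)
    p.tick(5)
    (msg, stamp), send_time = p.send("hello")
    q.enqueue(msg, stamp)
    q.tick(4)
    print(q.try_receive())     # None: local clock 4 has not passed the stamp 5
    q.tick(6)
    print(q.try_receive())     # ('hello', (6, 2, 1)): now 6 > 5, so deliver
```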