Then by dining lockout-freedom and Lemma 1, eventually the dining subroutine grants at i. Again by Lemma 1, eventually i returns the dining resource. But note that i only returns the dining resource if in the interim it has done a drinking grant.
The formal proof is by contradiction, as usual.

State:

drink-region: equals T if the most recent drinking action was try_i(B) (for some B), C if crit_i(B), etc. Initially R.

dine-region: equals T if the most recent dining action was try_i, C if crit_i, etc. Initially R.

need: equals B, where the most recent drinking action had parameter B. Initially empty.

bottles: set of tokens that have been received and not yet enqueued to be sent. Initially, the token for a resource shared between D_i and D_j is in the bottles variable of exactly one of D_i and D_j, with the choice being made arbitrarily.

deferred: set of tokens in bottles for which a request or demand has been received since the token was received. Initially empty.

current: Boolean indicating whether the current dining critical region is on behalf of the current drinking trying region. Initially false.

msgs[j] for all j ≠ i: FIFO queue of messages for D_j enqueued but not yet sent. Initially empty.

Actions for process i:

try_i(B) for all B ⊆ B_i
  Effect:
    drink-region ← T
    need ← B
    for all j ≠ i and b ∈ (need ∩ B_j) − bottles:
      enqueue request(b) in msgs[j]

send_{i,j}(m) for all j ≠ i, m ∈ {request(b), token(b), demand(b) : b ∈ B_i ∩ B_j}
  Precondition:
    m is at head of msgs[j]
  Effect:
    dequeue m from msgs[j]

try_i
  Precondition:
    dine-region = R
    drink-region = T
  Effect:
    dine-region ← T

Figure 22.3: Code for the modular drinking philosophers algorithm, part I.

Actions for process i (cont.):

receive_{j,i}(request(b)) for all j ≠ i, b ∈ B_i ∩ B_j
  Effect:
    if (b ∈ need) and (drink-region ∈ {T, C}) then
      deferred ← deferred ∪ {b}
    else
      enqueue token(b) in msgs[j]
      bottles ← bottles − {b}

crit_i
  Effect:
    dine-region ← C
    if drink-region = T then
      for all j ≠ i and b ∈ (need ∩ B_j) − bottles: enqueue demand(b) in msgs[j]
      current ← true

receive_{j,i}(demand(b)) for all j ≠ i, b ∈ B_i ∩ B_j
  Effect:
    if (b ∈ bottles) and ((b ∉ need) or (drink-region ≠ C)) then
      enqueue token(b) in msgs[j]
      bottles ← bottles − {b}
      deferred ← deferred − {b}

receive_{j,i}(token(b)) for all j ≠ i, b ∈ B_i ∩ B_j
  Effect:
    bottles ← bottles ∪ {b}

crit_i(B) for all B ⊆ B_i
  Precondition:
    drink-region = T
    B = need
    need ⊆ bottles
    if dine-region = C then current = true
  Effect:
    drink-region ← C
    current ← false

exit_i
  Precondition:
    dine-region = C
    if drink-region = T then current = false
  Effect:
    dine-region ← E

Figure 22.4: Code for the modular drinking philosophers algorithm, part II.

Actions for process i (cont.):

rem_i
  Effect:
    dine-region ← R

exit_i(B) for all B ⊆ B_i
  Effect:
    drink-region ← E
    for all j ≠ i and b ∈ deferred ∩ B_j:
      enqueue token(b) in msgs[j]
    bottles ← bottles − deferred
    deferred ← ∅

rem_i(B) for all B ⊆ B_i
  Precondition:
    drink-region = E
    B = need
  Effect:
    drink-region ← R

Figure 22.5: Code for the modular drinking philosophers algorithm, part III.

We remark that the complexity bounds depend closely on the costs of the underlying dining algorithm.
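As a concrete illustration of the bottle bookkeeping in Figures 22.3 and 22.4, here is a small Python sketch of one process's message handling. It is only a sketch under assumed names (the class DrinkingProcess, shared[j] for the set of bottles shared with neighbor j, and tuples in place of typed messages); the dining-subroutine interface and the crit/exit/rem actions are omitted.

  # Illustrative sketch (not the book's code) of the per-process bottle
  # bookkeeping in Figures 22.3-22.4.  shared[j] is the set of bottles that
  # this process shares with neighbor j; msgs[j] is the FIFO queue toward j.
  class DrinkingProcess:
      def __init__(self, shared, initial_bottles):
          self.shared = shared
          self.drink_region = "R"                 # R, T, C, or E
          self.need = set()                       # parameter of the current try(B)
          self.bottles = set(initial_bottles)     # tokens currently held
          self.deferred = set()                   # held tokens with a pending request/demand
          self.msgs = {j: [] for j in shared}

      def try_drink(self, B):                     # corresponds to try_i(B)
          self.drink_region = "T"
          self.need = set(B)
          for j, Bj in self.shared.items():       # request needed bottles we lack
              for b in (self.need & Bj) - self.bottles:
                  self.msgs[j].append(("request", b))

      def receive_request(self, j, b):
          if b in self.need and self.drink_region in ("T", "C"):
              self.deferred.add(b)                # keep the bottle, remember the request
          else:
              self.msgs[j].append(("token", b))
              self.bottles.discard(b)

      def receive_demand(self, j, b):
          # Yield the bottle unless it is in use in the drinking critical region.
          if b in self.bottles and (b not in self.need or self.drink_region != "C"):
              self.msgs[j].append(("token", b))
              self.bottles.discard(b)
              self.deferred.discard(b)

      def receive_token(self, j, b):
          self.bottles.add(b)

The remaining actions of the figures (crit_i(B), exit_i(B), rem_i(B), and the interactions with the dining subroutine) would be added in the same style.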
22.3 Stable Property Detection

In this section we consider a new problem. Suppose we have some system of I/O automata as usual, and suppose we want to superimpose on this system another algorithm that "monitors" the given algorithm somehow. For instance, the monitoring algorithm might detect when the given system has terminated execution, check for violation of some global invariant, detect some kind of deadlock in which processes are waiting for each other (in a cycle) to do something, or compute some global quantity (e.g., the total amount of money). We shall see some powerful techniques for solving such problems.

22.3.1 Termination for Diffusing Computations

First, we consider the termination problem alone. Dijkstra and Scholten gave an efficient algorithm for the special case where the computation originates at one node.
In this kind of computation, called a diffusing computation, all nodes are originally quiescent (defined here to mean that they are not enabled to perform any locally controlled actions). The algorithm starts with an input from the outside world at one of the nodes, and we assume that no further inputs occur. According to the IOA definitions, once a node receives such a message, it can change state in such a way that it is now enabled to perform some locally controlled actions, e.g., to send messages to other nodes. (The nodes can also produce various outputs to the outside world.) When other nodes receive these messages, they in turn might become enabled to perform some locally controlled actions, and so on.

The termination detection problem is formulated as follows.

If the entire system ever reaches a quiescent state (i.e., all the processes are in quiescent states and there are no messages in transit), then eventually the originator node outputs a special message done.

Of course, the original algorithm does not have such an output.
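For reference, the condition being detected is a predicate over an idealized global view that no single node can observe directly, which is precisely why a detection protocol is needed. The following Python sketch uses assumed names (global_quiescent, quiescent, channels) and is purely illustrative.

  # Hypothetical global view: quiescent[i] says whether node i has no enabled
  # locally controlled actions; channels[(i, j)] lists messages in transit.
  def global_quiescent(quiescent, channels):
      no_enabled_steps = all(quiescent.values())
      nothing_in_transit = all(len(msgs) == 0 for msgs in channels.values())
      return no_enabled_steps and nothing_in_transit

  # Example: both nodes are locally quiescent, but one message is still in
  # transit, so the system as a whole is not yet quiescent.
  print(global_quiescent({0: True, 1: True}, {(0, 1): ["m"], (1, 0): []}))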
The new system constructed has to be an augmentation of the given system, where each node automaton is obtained from the old automaton in some allowed way. Below we state informally what is allowed in solving this problem; a small illustrative sketch follows the list. The precise definition should be similar to the superposition definition described in Chandy and Misra's Unity work.

- We may add new state components, but the projection of the start states of the augmented machine must be exactly the start states of the original machine.
- The action set can be augmented with new input, output, and internal actions. The old actions must remain, with the same preconditions, but they may have new effects on the new components; they might also have additional information piggybacked on them, e.g., send(m) now does send(m, c). The old actions remain in the same classes.
- The new input actions affect the new components only.
- The new output and internal actions can be preconditioned on the entire state (old and new components), but they may only affect the new components. They get grouped into new fairness classes.
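As a rough, non-normative illustration of these rules, the Python sketch below wraps an assumed underlying node automaton (with methods enabled, apply, and is_quiescent, all invented names) and adds new monitoring components. Old actions keep their preconditions and their effects on the old state; the new internal action reads the whole state but writes only the new components.

  # Sketch of superposition: 'base' is an assumed underlying node automaton.
  # The wrapper adds new state components but never changes how the base
  # automaton itself behaves.
  class SuperposedNode:
      def __init__(self, base):
          self.base = base
          self.status = "idle"        # new component
          self.deficit = {}           # new component: unacked messages per neighbor

      def do_old_action(self, action):
          # Old precondition is unchanged; old effect on the old state is unchanged.
          if not self.base.enabled(action):
              return
          self.base.apply(action)
          # The old action may additionally update the *new* components, e.g.,
          # count an outgoing message so an ack can be awaited later.
          # (action.kind and action.dest are assumed attributes.)
          if action.kind == "send":
              self.deficit[action.dest] = self.deficit.get(action.dest, 0) + 1

      def new_internal_action(self):
          # New actions may read the whole state but only modify new components.
          if self.base.is_quiescent() and all(d == 0 for d in self.deficit.values()):
              self.status = "done"

The Dijkstra-Scholten construction below has exactly this shape, with a status, a parent pointer, and per-neighbor deficits as the new components.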
The idea in the Dijkstra-Scholten termination detection algorithm is to superimpose a kind of spanning-tree construction on the existing work of the underlying algorithm. Starting from the initial node, every message of the underlying algorithm gets a tree message piggybacked upon it. Just as in the asynchronous algorithm for constructing a spanning tree, each node records the first receipt of a tree message in its state, as its parent. Any subsequent receipts of tree messages are immediately acknowledged (they are not discarded as in the spanning tree protocol: an explicit acknowledgment is sent back). Only the first one is held, unacknowledged (for now). Also, if the initial node ever receives a tree message, it immediately acknowledges it. So as the messages circulate around the network, a spanning tree of the nodes involved in the protocol gets set up.

Now we are going to let this tree converge back upon itself (as in convergecast), in order to report termination back to the initial node.
Specifically, each node looks for a point at which both of the following hold at the same time: (a) its underlying algorithm is in a quiescent state, and (b) all its outgoing tree messages have been acknowledged. When it finds this, it "closes up shop": it sends an acknowledgment to its parent and forgets everything about this protocol. Note that this is different from what had to happen for ordinary broadcast-convergecast. There, we had to allow the nodes to remember that they had participated in the algorithm, marking themselves as "done", so that they did not participate another time. (If we hadn't, the nodes would just continue broadcasting to each other repeatedly.) This time, however, we do allow the nodes to participate any number of times; whether or not they do so is controlled by where the messages of the underlying algorithm get sent.

Example. Suppose the nodes are 0, 1, 2, 3, and consider the following scenario (see Figure 22.6 for an illustration).
(i) Node 0 wakes up, sends a message to 1, and sends a message to 2. (ii) Nodes 1 and 2 receive the messages and set their parent pointers to point to node 0. Then they both wake up and send messages to each other; but since each node already has a parent, each will just send an ack. Then both nodes send messages to 3. (iii) Suppose 1's message gets to node 3 first; then 1 is set to be the parent of 3, and therefore 3 acknowledges 2 immediately. The four nodes continue for a while sending messages to each other; they immediately acknowledge each message.

Now suppose the basic algorithm at 1 quiesces. Still, 1 does not close up, because it has an unacknowledged message to 3. (iv) Suppose 3 now quiesces; it is then ready to acknowledge to 1, and it forgets that it ever participated in the algorithm (in fact, it forgets everything). When 1 receives this ack, since 1 is also done, it sends an ack to 0 and forgets everything. (v) Now suppose 2 sends out messages to 1 and 3.
When they receive these messages, they wake up as at the beginning (and continue carrying out the basic algorithm) and reset their parent pointers, this time to 2.

This execution can continue in this fashion indefinitely. But if all the nodes ever quiesce, and there are no messages of the basic algorithm in transit, then the nodes will all succeed in converging back to the initial node, in this case 0. When 0 is quiescent and has acknowledgments for all its outgoing messages, it can announce termination.

Figure 22.6: Example of an execution of the termination detection algorithm. The arrows on the edges indicate messages in transit, and the arrows parallel to the edges indicate parent pointers.

Suppose the underlying algorithm is A_i. In Figures 22.7 and 22.8 we give code for the superimposed version p_i.
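As a companion to Figures 22.7 and 22.8, here is a hedged Python sketch of the same superimposed node. The names (DSNode, on_receive_basic, base_quiescent, and so on) are mine, not the book's; the underlying algorithm A_i is abstracted to a single quiescence flag, and message delivery and fairness are left to an external driver.

  # Sketch of the per-node Dijkstra-Scholten additions (cf. Figures 22.7-22.8).
  # Invented interface: acks are strings, outgoing queues are plain lists, and
  # base_quiescent stands in for "the state of A_i is quiescent".
  class DSNode:
      def __init__(self, neighbors):
          self.status = "idle"                     # idle, leader, or non-leader
          self.parent = None
          self.send = {j: [] for j in neighbors}   # acks waiting to be sent
          self.deficit = {j: 0 for j in neighbors} # unacknowledged tree messages
          self.base_quiescent = True               # stands in for A_i's state

      def on_external_input(self):                 # any input other than a receive
          self.status = "leader"
          self.base_quiescent = False

      def on_send_basic(self, j):                  # tree message piggybacked to j
          self.deficit[j] += 1

      def on_receive_basic(self, j):               # basic-algorithm message from j
          self.base_quiescent = False
          if self.status == "idle":
              self.status = "non-leader"
              self.parent = j
          else:
              self.send[j].append("ack")

      def on_receive_ack(self, j):
          self.deficit[j] -= 1

      def _clean(self):
          return (self.base_quiescent
                  and all(q == [] for q in self.send.values())
                  and all(d == 0 for d in self.deficit.values()))

      def close_up_shop(self):                     # enabled only for non-leaders
          if self.status == "non-leader" and self._clean():
              self.send[self.parent].append("ack")
              self.status, self.parent = "idle", None

      def done(self):                              # the leader announces termination
          if self.status == "leader" and self._clean():
              self.status = "idle"
              return True
          return False

A driver would interleave these handlers with the steps of the basic algorithm; only the done transition at the originator reports global termination.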
We have argued roughly above that if the basic algorithm terminates, this algorithm eventually announces it. We can also argue that this algorithm never announces termination unless the underlying system has actually terminated. We can prove this using the following invariant (cf. homework).

Invariant 9 All non-idle nodes form a tree rooted at the node with status "leader", with the tree edges given by the parent pointers.

Francez and Shavit have generalized this algorithm to the case where the computation can initiate at several locations (the idea there is to use several trees, and to wait for them all to terminate).
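To make Invariant 9 concrete, here is a small checker over a hypothetical instantaneous view of the new components (a dictionary mapping each node to its (status, parent) pair). It only illustrates what the invariant asserts; it is not part of the protocol, and the names are invented.

  # Check Invariant 9 on a global view: every non-idle node reaches the unique
  # leader by following parent pointers, with no cycles among non-idle nodes.
  def invariant9_holds(view):                      # view[i] = (status, parent)
      leaders = [i for i, (s, _) in view.items() if s == "leader"]
      non_idle = {i for i, (s, _) in view.items() if s != "idle"}
      if not non_idle:
          return True                              # nothing to check
      if len(leaders) != 1:
          return False
      root = leaders[0]
      for i in non_idle:
          seen = set()
          while i != root:                         # walk up the parent pointers
              if i in seen or i not in non_idle:
                  return False                     # cycle, or left the tree
              seen.add(i)
              i = view[i][1]
      return True

  # Example: 0 is the leader, 1 and 3 hang off the tree, 2 is idle.
  print(invariant9_holds({0: ("leader", None), 1: ("non-leader", 0),
                          2: ("idle", None), 3: ("non-leader", 1)}))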
Complexity. The number of messages is proportional to the number of messages of the underlying algorithm. A nice property of this algorithm is its "locality": if the diffusing computation only operates for a short time in a small area of the network, the termination protocol only incurs proportionately small costs. On the other hand, if the algorithm works for a very long time before terminating, then the termination detection algorithm requires considerable overhead.

New state components:

status ∈ {idle, leader, non-leader}, initially idle
parent, a node or nil, initially nil
for each neighbor j:
  send(j), a queue of messages, initially empty
  deficit(j), an integer, initially 0

Actions:

Any input action of node i other than a receive
  New Effect:
    status ← leader

send_{i,j}(m)
  New Effect:
    deficit(j) ← deficit(j) + 1

New action receive_{j,i}(ack)
  New Effect:
    deficit(j) ← deficit(j) − 1

receive_{j,i}(m), where m is a message of A_i
  New Effect:
    if status = idle then
      status ← non-leader
      parent ← j
    else add ack to send(j)

New action close-up-shop(i)
  Precondition:
    status = non-leader
    j = parent(i)
    send(k) = ∅ for all k
    deficit(k) = 0 for all k
    state of A_i is quiescent
  Effect:
    add ack to send(j)
    status ← idle
    parent ← nil

Figure 22.7: Code for the Dijkstra-Scholten termination detection algorithm, part I.

Actions (cont.):

New action send_{i,j}(ack)
  Precondition:
    ack is first on send(j)
  Effect:
    remove first message from send(j)

New action done
  Precondition:
    status = leader
    send(k) = ∅ for all k
    deficit(k) = 0 for all k
    state of A_i is quiescent
  Effect:
    status ← idle

Figure 22.8: Code for the Dijkstra-Scholten termination detection algorithm, part II.

22.3.2 Snapshots

Problem Statement

The snapshot problem is to obtain a consistent global state of a running distributed algorithm.