Distributed Algorithms. Nancy A. Lynch (1993) (811416), страница 69
Текст из файла (страница 69)
The heavy arrows representparent pointers, and the dashed arrows represent messages in transit. In (c), none of thenodes is in Ready mode. In (d), some of the nodes are in Ready mode, and some resetsignals are output.Abort, it broadcasts Abort messages to all its neighbors, sets the Ack Pend bits to truefor all incident links, and waits until all the neighboring nodes send back Ack. If a nodereceives Abort message while in Ready mode, it marks the link from which the messagearrived as its parent, broadcasts Abort, and waits for Acks to be received from all itsneighbors. If Abort message is received by a node not in a Ready mode, Ack is sent backimmediately. When a node receives Ack it sets the corresponding Ack Pend bit to false.The action of a node when it receives the last anticipated Ack depends on the value of itsparent. If parent 6= nil, its mode is changed to Converge, and Ack is sent on parent.If parent = nil, the mode is changed to Ready, the node broadcasts Ready messages,and outputs a reset signal.
A node that gets a Ready message on its parent link while inConverge mode, changes its mode to Ready, makes its parent nil, outputs a reset signal,332and broadcasts Ready.While the mode is not Ready, $-messages input by the user are discarded, and all $messages from a link e are queued on Buffere]. When Abort arrives on link e, Buffere]is ushed. While the mode is Ready, $-messages from the user to the network are put onthe output queue, and $-messages in Buffer are forwarded to the user. The size of Bufferdepends on the assumptions we make on the user.The actual code is given in Figures 24.3{24.6. A few remarks are in order.1.
In the self-stabilization model, we can deal with the assumption that the topology isn'txed in advance. This is done by assuming that links may be up or down, and thenodes are constantly informed of the status of their incident links. As usual, we requirecorrectness only after the system has stabilized. We model this by assuming that themaximal degree d is known (e.g., the number of ports), and that the state of the linksis maintained updated in the Edges array.2.
In the self-stabilization model, we tend to have as little variables as possible: less thingscan get corrupted. See, for example, the denition for mode (u) in Figure 24.3.The rst step in making the above protocol self-stabilizing is to make it locally checkable.A clear problem with the existing protocol is that it will deadlock if in the initial state someparent edges form a cycle. We mend this aw by maintaining distance variable at eachnode, such that a node's distance is one greater than that of its parent.
Specically,distance is initialized to 0 upon reset request, and its accumulated value is appended to theAbort messages. However, since all we care about is acyclicity, there is no need to updatedistance when a link goes down.Next, we list all the link predicates that are necessary to ensure correct operation ofthe Reset protocol. It turns out that all the required link predicates are independentlystable, and hence the stabilization time would be one time unit (each subsystem needs to becorrected at most once, independently)Our last step is to design a local correction action for links (the f of the local correctiontheorem).The main diculty about designing a correcting strategy is making it local, i.e., to ensurethat when we correct a link we do not violate predicates of incident link-subsystems.
For thecase of dynamic-topology network the solution is simple: emulate a link failure and recovery.Of course, care must be exerted when writing the code for these events (see code for localreset in Figure 24.3).In the code, the detection and correction mechanism is written explicitly (Figure 24.5).For every link, we have the ID of the node at the other endpoint the node with the higher333State of process u:array of d Boolean agsparent:pointer, ranging over 1::d] fnilgdistance: ranges over 0::N ], where N is a bound on the number of processesEdges:set of operational incident links (maintained by the links protocol)Queue:array of d send buersBuffer:array of d receive buersIDs:array of d node identiersTimers:array of d timersCheck:array of d Boolean agsSnap Mode: array of d Boolean agsAck Pend:Shorthand for the code of process u:8>><mode (u) = >>:Abortif 9e such that Ack Pende] = trueConverge if parent 6= nil and 8e (Ack Pende] = false)Readyif parent = nil and 8e (Ack Pende] = false)Propagate(e dist) if dist 6= 0 then parent e else parent nildistance distfor all edges e0 2 Edges doAck Pende0 ] trueenqueue Abort(distance + 1) on Queuee0 ]Local Reset(e) Buffere] Queuee] D1 : if 9e0 (e0 6= e and Ack Pende0 ] = true) or (e 6= parent and parent 6= nil) thenif parent = e then parent nilif Ack Pende] = true thenAck Pende] falseif mode (u) = Converge then enqueue Ack in Queueparent]D2 : else if mode (u) 6= ReadyAck Pende] falseparent niloutput reset signalfor all e0 2 Edges do enqueue Ready in Queuee0]Figure 24.3: Self-stabilizing334 reset protocol | part I.Code for process u:Whenever reset request and mode (u) = ReadyQ : Propagate(nil 0)Whenever Abort(dist) on link eA1 : if mode (u) = Ready thenPropagate(e dist)A2 : if (dist = 0) or (mode 6= Ready) enqueue Ack in Queuee]Whenever Ack on link e and Ack Pende] = trueAck Pende] falseK1 : if mode (u) = Converge thenenqueue Ack in Queueparent]K2 : else if mode (u) = Ready thenoutput reset signalfor all e0 2 Edges doenqueue Ready in Queuee0 ]Whenever Ready on link e and parent = eR : if mode (u) = Converge thenparent niloutput reset signalfor all edges e0 2 Edges doenqueue Ready in Queuee0 ]Whenever Send (m e)if mode (u) = Ready thenenqueue m in Queuee]Whenever Queuee] 6= send head (Queuee]) over edelete head (Queuee])Figure 24.4: Self-stabilizing reset protocol | part II.335Code for process u (cont.):Whenever m 2 received from link eif Checke] = true thenif Snap Modee] = snap thenrecord m as part of snapshot state on link eenqueue m in Buffere]else enqueue m in Buffere]Whenever Buffere] 6= and mode (u) = Readyoutput Receive (head (Buffere]) e)delete head (Buffere])Whenever Abort on e:Buffere] Whenever Down on e:execute Local Reset(e)Whenever Timerse] expires for some link e:if My ID < IDse] thenChecke] trueif Snap Modee] = snap thenrecord local stateelse execute Local Reset(e)send Start Snap(Snap Modee] My ID) on eWhenever Start Snap(b id) on link eIDse] idif id < My ID thenif b = reset thenexecute Local Reset(e)send Response Snap(local state)Whenever Response Snap(S ) on link eChecke] falseif Snap Modee] = snap and any invariant does not hold thenSnap Modee] resetelse execute Local Reset(e)336Figure 24.5: Self-stabilizing reset protocol | part III.A:e] = true i one of the following holds.Abort(distance + 1) 2 Queue e]mode (v ) = Abort and parent = eAck 2 Queue e]Ack Pendu uuvvB: At most one of the following hold.Abort(distance + 1) 2 Queue e]mode (v ) = Abort and parent = eAck 2 Queue e]uuvvvC:= e implies that mode (u) = Converge i one of the following holds.Ack 2 Queue e]Ack Pend e] = false and mode (v ) 6= ReadyReady 2 Queue e]parentuuvvD:= e implies that at most one of the following holds.Ack 2 Queue e]Ack Pend e] = false and mode (v ) 6= ReadyReady 2 Queue e]parentuuvvR: tail (Queue e]) 2 fReadyg impliesumode (u) = ReadyP:= e impliesone of the following holds.distance = distance + 1Ready 2 Queue e]parentuuvvE : e 2= Edges implies all the following.Ack Pend e] = falseparent =6 eQueue e] = Buffer e] = uuuuuQ:e] is a subsequence of the following.QueuehAbort AckihAck Ready AbortihAck Ready iFigure 24.6: Reset protocol part IV: link invariants for link e = (u v).
The link (v u) isdenoted e.337ID is assumed to be \responsible" for the correctness of the subsystem. This is done asfollows. Whenever the timer dedicated to that link expires, the node checks whether it is\in charge" of that link if so, it sends out a snapshot message that returns with the stateof the other node. Based on this state, its own state, and the messages received, it can ndout whether any of the invariants (Figure 24.6) is violated. If this is the case, local reset isexecuted in a coordinated way that resembles the way snapshots of the link are taken.The invariants of Figure 24.6 express simple properties that turn out to be sucient toensure correct operation of the protocol.
Invariants A and B capture the desired behavior afabort messages C and D deal with the converge messages analogously The ready messageis simpler (since we don't expect anything from a node in ready mode), and is described byinvariant R. Invariant P is crucial to ensure stabilization: it guarantees that no cycles of theparent pointers can survive checking and correction. Invariant E describes what is expectedfrom a non-existing edge.
(The goal of local reset action brings the subsystem to that state.)The last invariant, Q, is a technicality: we need to consider the messages in the send queuealso.24.3.5 AnalysisWe'll sketch the outlines of the proofs, just to show the ideas. Details are a bit messy.We start with the statement of stabilization.Theorem 1 Suppose that the links are bounded-delay links with delay time D, and supposethat invariants are veried by the manager at least every P time units. Then any executionof the Reset protocol, regardless of the initial state, in all states from time O(D + P ) onwards,all the invariants hold for all the links.This theorem is proven by showing that the invariants are stable.