Distributed Algorithms. Nancy A. Lynch (1993)
When the operation arrives at the simulated location, it gets performed (possibly being queued up before being performed), and a response is sent back. When the response is received by the sending process, it completes its step. If new inputs arrive from the outside world at a process during a "waiting period", the process only queues them. These pending inputs are processed only when the anticipated response returns.

(Footnote: Earlier in the course we had a restriction, for the atomic shared memory simulation of instantaneous shared memory, that no new inputs arrive at any node while the node process is waiting for an invocation to return. The reason for this was to avoid introducing any new interleavings. It seems, however, that we could have handled this by just forcing the process to queue up the inputs as above, and so remove the need for this restriction.)

We now claim that this simulation is correct: every external behavior of the simulating system is also an external behavior of the instantaneous shared memory system being simulated; moreover, every fair behavior (of the nodes and channels) is also a fair behavior of the instantaneous shared memory system, and some kind of wait-freedom carries over.
To prove this claim, we use the corresponding result for atomic shared memory simulating instantaneous shared memory, together with the observation that we are actually simulating atomic objects using a centralized object and the delays in the message system. We omit the details.

The problem of where to put the shared variables depends on the characteristics of the application at hand. E.g., for a system based on single-writer multi-reader shared variables, where writes are fairly frequent, we would expect to locate the variables at the writer's node.
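To make the node-side protocol concrete, here is a minimal executable sketch in Python. The class and method names, and the synchronous delivery loop, are my own illustrative choices, not from the notes; it shows the two essential points: an invocation is forwarded to the variable's home node, and inputs arriving during the waiting period are queued and served only after the response returns.

    from collections import deque

    class Network:
        """FIFO message queue standing in for the reliable channels."""
        def __init__(self):
            self.messages = deque()
        def send(self, dest, msg):
            self.messages.append((dest, msg))
        def deliver_all(self, nodes):
            while self.messages:
                dest, msg = self.messages.popleft()
                nodes[dest].on_message(msg)

    class HomeNode:
        """Owns the real copy of the variable; performs operations one at a time."""
        def __init__(self, name, net, initial=0):
            self.name, self.net, self.value = name, net, initial
        def on_message(self, msg):
            sender, op, arg = msg
            if op == "write":
                self.value, result = arg, "ack"
            else:  # "read"
                result = self.value
            self.net.send(sender, ("response", result))

    class NodeProcess:
        """Simulates a shared-memory step by messaging the home node; inputs
        that arrive during the waiting period are queued, not interleaved."""
        def __init__(self, name, net, home):
            self.name, self.net, self.home = name, net, home
            self.waiting = False
            self.pending = deque()
            self.log = []
        def on_input(self, op, arg=None):
            if self.waiting:
                self.pending.append((op, arg))  # queue new inputs
            else:
                self._invoke(op, arg)
        def _invoke(self, op, arg):
            self.waiting = True                 # waiting period begins
            self.net.send(self.home, (self.name, op, arg))
        def on_message(self, msg):
            _, result = msg                     # the anticipated response
            self.waiting = False
            self.log.append(result)
            if self.pending:                    # now serve a queued input
                self._invoke(*self.pending.popleft())

    net = Network()
    nodes = {"home": HomeNode("home", net), "p1": NodeProcess("p1", net, "home")}
    nodes["p1"].on_input("write", 42)  # first input: starts an invocation
    nodes["p1"].on_input("read")       # arrives mid-wait: gets queued
    net.deliver_all(nodes)             # run until quiescent
    print(nodes["p1"].log)             # ['ack', 42]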
Note that the objects have to be simulated even while the nodes don't have active requests.

There are other ways of simulating shared memory algorithms.

Caching. For single-writer multi-reader registers, note that someone who wants to repeatedly test the value of a register has, in the implementation described above, to send repeated messages even though the variable is not changing.
It is possible to develop an alternative strategy by which writers notify readers before they change the value of the variable, so that if the readers haven't heard about a modification, they can just use their previous value. This is the basic idea of caching.
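A minimal sketch of the caching idea, under simplifying assumptions (synchronous calls stand in for messages, there is a single writer, and all names are invented for illustration): the writer invalidates the readers' cached copies before changing the value, so a reader pays for a message only after an actual modification.

    class Writer:
        """Single writer; notifies readers before changing the value."""
        def __init__(self, readers):
            self.value = 0
            self.readers = readers
        def write(self, new_value):
            for r in self.readers:    # invalidation notices go out first
                r.invalidate()
            self.value = new_value
        def fetch(self):
            return self.value         # stands in for a request/response pair

    class Reader:
        """Keeps a cached copy; re-fetches only after hearing of a change."""
        def __init__(self):
            self.cache, self.valid, self.writer = None, False, None
        def invalidate(self):
            self.valid = False
        def read(self):
            if not self.valid:        # only now does a message get sent
                self.cache = self.writer.fetch()
                self.valid = True
            return self.cache         # repeated reads cost no messages

    r = Reader()
    w = Writer([r])
    r.writer = w
    w.write(7)
    print(r.read(), r.read())  # 7 7 -- the second read is purely local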
Replicated copies. Caching is one example of a general style of implementation based on data replication. Data can be replicated for many reasons, e.g., fault-tolerance (which we will discuss later). But often it is done only for availability. Suppose we have a multi-writer multi-reader shared register, in which writes are very infrequent. Then we can locate copies of the register at all the readers' nodes (compare with the caching example). A reader, to read the register, can always look at its local copy. A writer, to write the register, needs to write all the copies. Note that it needs to do this atomically: writing the copies one at a time may cause out-of-order reads. Therefore we need some extra protocol here to ensure that the copies are written as if atomically; this requires techniques from the area of database concurrency control.
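The read-local/write-all trade-off can be shown in a few lines (again a toy sketch with invented names; the write loop below deliberately ignores the atomicity problem just described):

    class ReplicatedRegister:
        """A copy of the register at every reader's node. Sketch only: the
        write loop is NOT atomic; a real implementation needs a concurrency-
        control protocol (e.g., locking or timestamps) so that concurrent
        reads cannot observe the copies out of order."""
        def __init__(self, n_nodes, initial=0):
            self.copies = [initial] * n_nodes
        def read(self, node_id):
            return self.copies[node_id]        # purely local: no messages
        def write(self, value):
            for i in range(len(self.copies)):  # one message per copy
                self.copies[i] = value         # danger: copies disagree mid-loop

    reg = ReplicatedRegister(n_nodes=3)
    reg.write(5)
    print([reg.read(i) for i in range(3)])     # [5, 5, 5]

Reads are free, and each (infrequent) write costs one message per copy; this is the sense in which replication buys availability.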
The processes, one for each i, communicate via messages over FIFO channels.

Consider the resource requirement formulation of the problem (in terms of a monotone Boolean formula involving the needed resources). Each node i has a static (fixed) resource requirement. Assume that any two nodes that have any common resource in their two resource requirements are neighbors in the network graph, so they can communicate to negotiate priority for the resources. We do not model the resources separately.
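As an illustration of the resource requirement formulation, here is one possible encoding of a monotone Boolean formula over resource names, as nested tuples; this particular representation is my own, not from the notes.

    # A monotone formula is nested AND/OR over resource names (no negation):
    # ("and", f1, f2, ...) / ("or", f1, f2, ...) / "resource_name"

    def satisfied(formula, held):
        """Does the set of currently held resources satisfy the requirement?"""
        if isinstance(formula, str):
            return formula in held
        op, *subs = formula
        combine = all if op == "and" else any
        return combine(satisfied(f, held) for f in subs)

    # Node i needs the printer and either of two disks:
    req_i = ("and", "printer", ("or", "diskA", "diskB"))
    print(satisfied(req_i, {"printer", "diskB"}))  # True
    print(satisfied(req_i, {"diskA", "diskB"}))    # False: no printer

Monotonicity means that granting additional resources can never falsify a satisfied requirement, which is what makes negotiating grants with neighbors meaningful.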
In this setting, we typically drop the restriction that nodes can perform locally-controlled steps only when they are between requests and responses. This is because the algorithms will typically need to respond to requests by their neighbors for the resources.

[Figure 21.1: Interface specification for resource allocation on a message-passing system. Each node i presents try_i, crit_i, exit_i, and rem_i actions at its interface.]

The correctness conditions are now as follows. As before, we require a solution to preserve well-formedness (cyclic behavior), and also mutual exclusion or a more general exclusion condition. We also require deadlock-freedom: if there is any active request and no one is in C, then some request eventually gets granted; and if there is any active exit region, then some exit eventually occurs.
This time the hypothesis is that all the node and channel automata exhibit fair behavior (in the IOA sense, i.e., all the nodes keep taking steps and all the channels continue to deliver all messages).

The lockout-freedom condition is defined as before, under the hypothesis that all the node and channel automata exhibit fair behavior.

For the concurrency property, suppose that a request is invoked at a node i, and all of i's neighbors are in R and remain in R.
Suppose that the execution proceeds so that i and all its neighbors continue to take steps fairly, and the connecting links continue to operate fairly. (Note that other processes can remain in C forever.) Then the concurrency condition requires that i eventually reach C.

We remark that this is analogous to the weak condition we had for shared memory systems, except that here we add fairness requirements on the neighbors and the links in between.

21.3.2 Mutual Exclusion Algorithms

Raynal's book is a good reference (also for the shared memory mutual exclusion algorithms). We have many possible approaches now.

Simulating Shared Memory

We have learned about several shared memory algorithms for mutual exclusion. We can simply simulate one of these in a distributed system.
E.g., use the Peterson multi-writer multi-reader shared register tournament algorithm (or the variant of it that we didn't have time to cover), or some version of the bakery algorithm (maybe with unbounded tickets).

LeLann

LeLann proposed the following simple solution. The processes are arranged in a logical ring p_1, p_2, ..., p_n, p_1. A token representing control of the resource is passed around the ring in order. When process p_i receives the token, it checks for an outstanding request for the resource from user i. If there is no such request, the token is passed to the next process in the ring. If there is an outstanding request, the resource is granted and the token is held until the resource is returned; the token is then passed to the next process. The code for process p_i is given in Figure 21.2.

Let us go over the properties of LeLann's algorithm.

Mutual exclusion is guaranteed in normal operation because there is only one token, and only its holder can have any resource.

Deadlock-freedom: in fair executions, when no one is in C, the process that holds the token is either in T, in which case it can go to C, or in E or R, in which case it has to pass the token to the next process. Similarly, deadlock-freedom for the exit region is satisfied.

Lockout-freedom is straightforward.

Local variables:
  token ∈ {none, available, in-use, used}, initially available at p_1 and none elsewhere
  region ∈ {R, T, C, E}, initially R

Actions:

try_i
  Effect: region := T

crit_i
  Precondition: region = T and token = available
  Effect: region := C; token := in-use

exit_i
  Effect: region := E

rem_i
  Precondition: region = E
  Effect: region := R; token := used

receive_{i-1, i}(token)
  Effect: token := available

send_{i, i+1}(token)
  Precondition: token = used, or (token = available and region ≠ T)
  Effect: token := none

Figure 21.2: LeLann mutual exclusion algorithm for message-passing systems
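For concreteness, here is an executable transcription of the Figure 21.2 automaton in Python, together with a tiny round-robin scheduler that stands in for a fair execution. The scheduler and all names are my own; treat this as a sketch, not as the notes' code.

    R, T, C, E = "R", "T", "C", "E"

    class LeLannNode:
        """One ring process, following Figure 21.2."""
        def __init__(self):
            self.region = R
            self.token = "none"   # none | available | in_use | used
        # input actions
        def try_(self):  self.region = T
        def exit_(self): self.region = E
        # locally controlled actions (fire only when enabled)
        def crit(self):
            if self.region == T and self.token == "available":
                self.region, self.token = C, "in_use"
                return True
            return False
        def rem(self):
            if self.region == E:
                self.region, self.token = R, "used"
                return True
            return False
        def can_send(self):
            return self.token == "used" or \
                   (self.token == "available" and self.region != T)
        def send(self):
            self.token = "none"        # token leaves this node...
        def receive(self):
            self.token = "available"   # ...and arrives at the next one

    def run(requests, n=4, rounds=20):
        """Round-robin scheduler: drives the ring, records who enters C."""
        nodes = [LeLannNode() for _ in range(n)]
        nodes[0].token = "available"   # token starts at p_1
        for i in requests:
            nodes[i].try_()
        granted = []
        for _ in range(rounds):
            for i, p in enumerate(nodes):
                if p.crit():           # enter C, then leave immediately
                    granted.append(i)
                    p.exit_()
                    p.rem()
                if p.can_send():
                    p.send()
                    nodes[(i + 1) % n].receive()
        return granted

    print(run(requests=[2, 0]))  # token grants p_1 first, then p_3: [0, 2]

Instrumenting send() with a counter is an easy way to observe the message costs discussed below: under heavy load the count per granted request is constant, while a lone request can cost n messages, and messages keep flowing even with no requests at all.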
Performance: First, let's consider the number of messages. It is not clear what to measure, since the messages aren't naturally apportioned to particular requests. E.g., we can measure the worst-case number of messages sent between a try_i and the corresponding crit_i, but it seems more reasonable to try to do some kind of amortized analysis. Note that each virtual link in the logical ring is really of unit length, since the graph is complete.

In the worst case, n messages are sent between try_i and crit_i. For the amortized cost, we consider the case of "heavy load", where there is always an active request at each node; in this case there are only a constant number of messages per request.
If a request comes in alone, however, we still get n messages in the worst case. Also, it is hard to make a good bound statement here, since messages are sent even in the absence of any requests at all.

For the time complexity, we assume worst-case bounds. Let c be the time spent in C, let d be the message delay for the first message in any channel, and let s be the process step time. The worst-case time is approximately (c + d + O(s)) · n: in the worst case the token must traverse the entire ring, and at each of the n nodes it may wait for one critical-section time c, one message delay d, and a constant number of local steps.