Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest (811417), страница 97
Текст из файла (страница 97)
Consider a fixedobject x. We know that each time x's representative pointer was updated, x must have startedin the smaller set. The first time x's representative pointer was updated, therefore, the resultingset must have had at least 2 members. Similarly, the next time x's representative pointer wasupdated, the resulting set must have had at least 4 members. Continuing on, we observe thatfor any k ≤ n, after x's representative pointer has been updated ⌈lg k⌉ times, the resulting setmust have at least k members.
Since the largest set has at most n members, each object'srepresentative pointer has been updated at most ⌈lg n⌉ times over all the UNION operations.We must also account for updating the head and tail pointers and the list lengths, which takeonly Θ(1) time per UNION operation. The total time used in updating the n objects is thusO(n lg n).The time for the entire sequence of m operations follows easily. Each MAKE-SET and FINDSET operation takes O(1) time, and there are O(m) of them. The total time for the entiresequence is thus O(m + n lg n).Exercises 21.2-1Write pseudocode for MAKE-SET, FIND-SET, and UNION using the linked-listrepresentation and the weighted-union heuristic. Assume that each object x has an attributerep[x] pointing to the representative of the set containing x and that each set S has attributeshead[S], tail[S], and size[S] (which equals the length of the list).Exercises 21.2-2Show the data structure that results and the answers returned by the FIND-SET operations inthe following program.
Use the linked-list representation with the weighted-union heuristic.1234567891011for i ← 1 to 16do MAKE-SET(xi)for i ← 1 to 15 by 2do UNION(xi, xi+1)for i ← 1 to 13 by 4do UNION(xi, xi+2)UNION(x1, x5)UNION(x11, x13)UNION(x1, x10)FIND-SET(x2)FIND-SET(x9)Assume that if the sets containing xi and xj have the same size, then the operation UNION(xi,xj) appends xj's list onto xi's list.Exercises 21.2-3Adapt the aggregate proof of Theorem 21.1 to obtain amortized time bounds of O(1) forMAKE-SET and FIND-SET and O(lg n) for UNION using the linked-list representation andthe weighted-union heuristic.Exercises 21.2-4Give a tight asymptotic bound on the running time of the sequence of operations in Figure21.3 assuming the linked-list representation and the weighted-union heuristic.Exercises 21.2-5Suggest a simple change to the UNION procedure for the linked-list representation thatremoves the need to keep the tail pointer to the last object in each list.
Whether or not theweighted-union heuristic is used, your change should not change the asymptotic running timeof the UNION procedure. (Hint: Rather than appending one list to another, splice themtogether.)21.3 Disjoint-set forestsIn a faster implementation of disjoint sets, we represent sets by rooted trees, with each nodecontaining one member and each tree representing one set. In a disjoint-set forest, illustratedin Figure 21.4(a), each member points only to its parent.
The root of each tree contains therepresentative and is its own parent. As we shall see, although the straightforward algorithmsthat use this representation are no faster than ones that use the linked-list representation, byintroducing two heuristics-"union by rank" and "path compression"-we can achieve theasymptotically fastest disjoint-set data structure known.Figure 21.4: A disjoint-set forest. (a) Two trees representing the two sets of Figure 21.2.
Thetree on the left represents the set {b, c, e, h}, with c as the representative, and the tree on theright represents the set {d, f, g}, with f as the representative. (b) The result of UNION(e, g).We perform the three disjoint-set operations as follows. A MAKE-SET operation simplycreates a tree with just one node. We perform a FIND-SET operation by following parentpointers until we find the root of the tree. The nodes visited on this path toward the rootconstitute the find path. A UNION operation, shown in Figure 21.4(b), causes the root of onetree to point to the root of the other.Heuristics to improve the running timeSo far, we have not improved on the linked-list implementation.
A sequence of n - 1 UNIONoperations may create a tree that is just a linear chain of n nodes. By using two heuristics,however, we can achieve a running time that is almost linear in the total number of operationsm.The first heuristic, union by rank, is similar to the weighted-union heuristic we used with thelinked-list representation. The idea is to make the root of the tree with fewer nodes point tothe root of the tree with more nodes. Rather than explicitly keeping track of the size of thesubtree rooted at each node, we shall use an approach that eases the analysis.
For each node,we maintain a rank that is an upper bound on the height of the node. In union by rank, theroot with smaller rank is made to point to the root with larger rank during a UNIONoperation.The second heuristic, path compression, is also quite simple and very effective. As shown inFigure 21.5, we use it during FIND-SET operations to make each node on the find path pointdirectly to the root. Path compression does not change any ranks.Figure 21.5: Path compression during the operation FIND-SET. Arrows and self-loops atroots are omitted.
(a) A tree representing a set prior to executing FIND-SET(a). Trianglesrepresent subtrees whose roots are the nodes shown. Each node has a pointer to its parent. (b)The same set after executing FIND-SET(a). Each node on the find path now points directly tothe root.Pseudocode for disjoint-set forestsTo implement a disjoint-set forest with the union-by-rank heuristic, we must keep track ofranks.
With each node x, we maintain the integer value rank[x], which is an upper bound onthe height of x (the number of edges in the longest path between x and a descendant leaf).When a singleton set is created by MAKE-SET, the initial rank of the single node in thecorresponding tree is 0. Each FIND-SET operation leaves all ranks unchanged. Whenapplying UNION to two trees, there are two cases, depending on whether the roots have equalrank. If the roots have unequal rank, we make the root of higher rank the parent of the root oflower rank, but the ranks themselves remain unchanged. If, instead, the roots have equalranks, we arbitrarily choose one of the roots as the parent and increment its rank.Let us put this method into pseudocode.
We designate the parent of node x by p[x]. The LINKprocedure, a subroutine called by UNION, takes pointers to two roots as inputs.MAKE-SET(x)1 p[x] ← x2 rank[x] ← 0UNION(x, y)1LINK(FIND-SET(x), FIND-SET(y))LINK(x, y)1 if rank[x] > rank[y]2then p[y] ← x3else p[x] ← y4if rank[x] = rank[y]5then rank[y] ← rank[y] + 1The FIND-SET procedure with path compression is quite simple.FIND-SET(x)1 if x ≠ p[x]2then p[x] ← FIND-SET(p[x])3 return p[x]The FIND-SET procedure is a two-pass method: it makes one pass up the find path to find theroot, and it makes a second pass back down the find path to update each node so that it pointsdirectly to the root.
Each call of FIND-SET(x) returns p[x] in line 3. If x is the root, then line 2is not executed and p[x] = x is returned. This is the case in which the recursion bottoms out.Otherwise, line 2 is executed, and the recursive call with parameter p[x] returns a pointer tothe root. Line 2 updates node x to point directly to the root, and this pointer is returned in line3.Effect of the heuristics on the running timeSeparately, either union by rank or path compression improves the running time of theoperations on disjoint-set forests, and the improvement is even greater when the twoheuristics are used together. Alone, union by rank yields a running time of O(m lg n) (seeExercise 21.4-4), and this bound is tight (see Exercise 21.3-3).
Although we shall not prove ithere, if there are n MAKE-SET operations (and hence at most n - 1 UNION operations) and fFIND-SET operations, the path-compression heuristic alone gives a worst-case running timeof Θ (n + f · (1 + log2+ f / n n)).When we use both union by rank and path compression, the worst-case running time is O(m α(n)), where α(n) is a very slowly growing function, which we define in Section 21.4.
In anyconceivable application of a disjoint-set data structure, α(n) ≤ 4; thus, we can view therunning time as linear in m in all practical situations. In Section 21.4, we prove this upperbound.Exercises 21.3-1Do Exercise 21.2-2 using a disjoint-set forest with union by rank and path compression.Exercises 21.3-2Write a nonrecursive version of FIND-SET with path compression.Exercises 21.3-3Give a sequence of m MAKE-SET, UNION, and FIND-SET operations, n of which areMAKE-SET operations, that takes Ω(m lg n) time when we use union by rank only.Exercises 21.3-4: ⋆Show that any sequence of m MAKE-SET, FIND-SET, and LINK operations, where all theLINK operations appear before any of the FIND-SET operations, takes only O(m) time if bothpath compression and union by rank are used. What happens in the same situation if only thepath-compression heuristic is used?21.4Analysis of union by rank with path compressionAs noted in Section 21.3, the running time of the combined union-by-rank and pathcompression heuristic is O(m α (n)) for m disjoint-set operations on n elements.