Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest (811417), страница 49
Текст из файла (страница 49)
The procedure considers the three cases shown in Figure 12.4. If z has nochildren, we modify its parent p[z] to replace z with NIL as its child. If the node has only asingle child, we "splice out" z by making a new link between its child and its parent. Finally,if the node has two children, we splice out z's successor y, which has no left child (seeExercise 12.2-5) and replace z's key and satellite data with y's key and satellite data.Figure 12.4: Deleting a node z from a binary search tree. Which node is actually removeddepends on how many children z has; this node is shown lightly shaded. (a) If z has nochildren, we just remove it. (b) If z has only one child, we splice out z.
(c) If z has twochildren, we splice out its successor y, which has at most one child, and then replace z's keyand satellite data with y's key and satellite data.The code for TREE-DELETE organizes these three cases a little differently.TREE-DELETE(T, z)1 if left[z] = NIL or right[z] = NIL2then y ← z3else y ← TREE-SUCCESSOR(z)4 if left[y] ≠ NIL5then x ← left[y]6else x ← right[y]7 if x ≠ NIL8then p[x] ← p[y]9 if p[y] = NIL10then root[T] ← x11else if y = left[p[y]]12then left[p[y]] ← x13else right[p[y]] ← x14 if y ≠ z15then key[z] ← key[y]16copy y's satellite data into z17 return yIn lines 1–3, the algorithm determines a node y to splice out.
The node y is either the inputnode z (if z has at most 1 child) or the successor of z (if z has two children). Then, in lines 4–6, x is set to the non-NIL child of y, or to NIL if y has no children. The node y is spliced out inlines 7–13 by modifying pointers in p[y] and x. Splicing out y is somewhat complicated by theneed for proper handling of the boundary conditions, which occur when x = NIL or when y isthe root. Finally, in lines 14–16, if the successor of z was the node spliced out, y's key andsatellite data are moved to z, overwriting the previous key and satellite data.
The node y isreturned in line 17 so that the calling procedure can recycle it via the free list. The procedureruns in O(h) time on a tree of height h.In summary, we have proved the following theorem.Theorem 12.3The dynamic-set operations INSERT and DELETE can be made to run in O(h) time on abinary search tree of height h.Exercises 12.3-1Give a recursive version of the TREE-INSERT procedure.Exercises 12.3-2Suppose that a binary search tree is constructed by repeatedly inserting distinct values into thetree.
Argue that the number of nodes examined in searching for a value in the tree is one plusthe number of nodes examined when the value was first inserted into the tree.Exercises 12.3-3We can sort a given set of n numbers by first building a binary search tree containing thesenumbers (using TREE-INSERT repeatedly to insert the numbers one by one) and thenprinting the numbers by an inorder tree walk.
What are the worst-case and best-case runningtimes for this sorting algorithm?Exercises 12.3-4Suppose that another data structure contains a pointer to a node y in a binary search tree, andsuppose that y's predecessor z is deleted from the tree by the procedure TREE-DELETE.What problem can arise? How can TREE-DELETE be rewritten to solve this problem?Exercises 12.3-5Is the operation of deletion "commutative" in the sense that deleting x and then y from abinary search tree leaves the same tree as deleting y and then x? Argue why it is or give acounterexample.Exercises 12.3-6When node z in TREE-DELETE has two children, we could splice out its predecessor ratherthan its successor.
Some have argued that a fair strategy, giving equal priority to predecessorand successor, yields better empirical performance. How might TREE-DELETE be changedto implement such a fair strategy?12.4Randomly built binary search treesWe have shown that all the basic operations on a binary search tree run in O(h) time, where his the height of the tree. The height of a binary search tree varies, however, as items areinserted and deleted. If, for example, the items are inserted in strictly increasing order, the treewill be a chain with height n - 1.
On the other hand, Exercise B.5-4 shows that h≥ ⌊lg n⌋. Aswith quicksort, we can show that the behavior of the average case is much closer to the bestcase than the worst case.Unfortunately, little is known about the average height of a binary search tree when bothinsertion and deletion are used to create it. When the tree is created by insertion alone, theanalysis becomes more tractable. Let us therefore define a randomly built binary search treeon n keys as one that arises from inserting the keys in random order into an initially emptytree, where each of the n! permutations of the input keys is equally likely. (Exercise 12.4-3asks you to show that this notion is different from assuming that every binary search tree on nkeys is equally likely.) In this section, we shall show that the expected height of a randomlybuilt binary search tree on n keys is O(lg n).
We assume that all keys are distinct.We start by defining three random variables that help measure the height of a randomly builtbinary search tree. We denote the height of a randomly built binary search on n keys by Xn,. When we build a binary search tree on n keys,and we define the exponential heightwe choose one key as that of the root, and we let Rn denote the random variable that holds thiskey's rank within the set of n keys. The value of Rn is equally likely to be any element of theset {1, 2, ..., n}. If Rn = i, then the left subtree of the root is a randomly built binary search treeon i - 1 keys, and the right subtree is a randomly built binary search tree on n - i keys.
Becausethe height of a binary tree is one more than the larger of the heights of the two subtrees of theroot, the exponential height of a binary tree is twice the larger of the exponential heights ofthe two subtrees of the root. If we know that Rn = i, we therefore have thatYn = 2 · max(Yi-1, Yn-i).As base cases, we have Y1 = 1, because the exponential height of a tree with 1 node is 20 = 1and, for convenience, we define Y0 = 0.Next we define indicator random variables Zn,1, Zn,2, ..., Zn,n, whereZn,i = I{Rn = i}.Because Rn is equally likely to be any element of {1, 2, ..., n}, we have that Pr{Rn = i} = 1/nfor i = 1,2, ..., n, and hence, by Lemma 5.1,(12.1)for i = 1, 2, ..., n.
Because exactly one value of Zn,i is 1 and all others are 0, we also haveWe will show that E[Yn] is polynomial in n, which will ultimately imply that E[Xn] = O(lg n).The indicator random variable Zn,i = I{Rn = i} is independent of the values of Yi-1 and Yn-i.Having chosen Rn = i, the left subtree, whose exponential height is Yi-1, is randomly built onthe i - 1 keys whose ranks are less than i. This subtree is just like any other randomly builtbinary search tree on i - 1 keys. Other than the number of keys it contains, this subtree'sstructure is not affected at all by the choice of Rn = i; hence the random variables Yi-1 and Zn,iare independent.
Likewise, the right subtree, whose exponential height is Yn-i, is randomlybuilt on the n - i keys whose ranks are greater than i. Its structure is independent of the valueof Rn, and so the random variables Yn-i and Zn,i are independent. Hence,Each term E[Y0], E[Y1], ..., E[Yn-1] appears twice in the last summation, once as E[Yi-1] andonce as E[Yn-i], and so we have the recurrence(12.2)Using the substitution method, we will show that for all positive integers n, the recurrence(12.2) has the solutionIn doing so, we will use the identity(12.3)(Exercise 12.4-1 asks you to prove this identity.)For the base case, we verify that the boundholds.
For the substitution, we have thatWe have bounded E[Yn], but our ultimate goal is to bound E[Xn]. As Exercise 12.4-4 asks youto show, the function f(x) = 2x is convex (see page 1109). Therefore, we can apply Jensen'sinequality (C.25), which says thatto derive thatTaking logarithms of both sides gives E[Xn] = O(lg n). Thus, we have proven the following:Theorem 12.4The expected height of a randomly built binary search tree on n keys is O(lg n).Exercises 12.4-1Prove equation (12.3).Exercises 12.4-2Describe a binary search tree on n nodes such that the average depth of a node in the tree isΘ(lg n) but the height of the tree is w(lg n). Give an asymptotic upper bound on the height ofan n-node binary search tree in which the average depth of a node is Θ(lg n).Exercises 12.4-3Show that the notion of a randomly chosen binary search tree on n keys, where each binarysearch tree of n keys is equally likely to be chosen, is different from the notion of a randomlybuilt binary search tree given in this section.
(Hint: List the possibilities when n = 3.)Exercises 12.4-4Show that the function f(x) = 2x is convex.Exercises 12.4-5:Consider RANDOMIZED-QUICKSORT operating on a sequence of n input numbers. Provethat for any constant k > 0, all but O(1/nk) of the n! input permutations yield an O(n lg n)running time.Problems 12-1: Binary search trees with equal keysEqual keys pose a problem for the implementation of binary search trees.a. What is the asymptotic performance of TREE-INSERT when used to insert n itemswith identical keys into an initially empty binary search tree?We propose to improve TREE-INSERT by testing before line 5 whether or not key[z] = key[x]and by testing before line 11 whether or not key[z] = key[y].If equality holds, we implementone of the following strategies.
For each strategy, find the asymptotic performance ofinserting n items with identical keys into an initially empty binary search tree. (The strategiesare described for line 5, in which we compare the keys of z and x. Substitute y for x to arriveat the strategies for line 11.)b. Keep a boolean flag b[x] at node x, and set x to either left[x] or right[x] based on thevalue of b[x], which alternates between FALSE and TRUE each time x is visitedduring insertion of a node with the same key as x.c. Keep a list of nodes with equal keys at x, and insert z into the list.d. Randomly set x to either left[x] or right[x]. (Give the worst-case performanceand informally derive the average-case performance.)Problems 12-2: Radix treesGiven two strings a = a0a1...ap and b = b0b1...bq, where each ai and each bj is in some orderedset of characters, we say that string a is lexicographically less than string b if either1.