Figure 18.8: Deleting keys from a B-tree. The minimum degree for this B-tree is t = 3, so a node (other than the root) cannot have fewer than 2 keys. Nodes that are modified are lightly shaded. (a) The B-tree of Figure 18.7(e). (b) Deletion of F. This is case 1: simple deletion from a leaf. (c) Deletion of M. This is case 2a: the predecessor L of M is moved up to take M's position. (d) Deletion of G. This is case 2c: G is pushed down to make node DEGJK, and then G is deleted from this leaf (case 1). (e) Deletion of D. This is case 3b: the recursion can't descend to node CL because it has only 2 keys, so P is pushed down and merged with CL and TX to form CLPTX; then, D is deleted from a leaf (case 1). (e′) After (e), the root is deleted and the tree shrinks in height by one. (f) Deletion of B.
This is case 3a: C is moved to fill B's position and E is moved to fill C's position.

1. If the key k is in node x and x is a leaf, delete the key k from x.

2. If the key k is in node x and x is an internal node, do the following.

   a. If the child y that precedes k in node x has at least t keys, then find the predecessor k′ of k in the subtree rooted at y. Recursively delete k′, and replace k by k′ in x. (Finding k′ and deleting it can be performed in a single downward pass.)

   b. Symmetrically, if the child z that follows k in node x has at least t keys, then find the successor k′ of k in the subtree rooted at z. Recursively delete k′, and replace k by k′ in x. (Finding k′ and deleting it can be performed in a single downward pass.)

   c. Otherwise, if both y and z have only t - 1 keys, merge k and all of z into y, so that x loses both k and the pointer to z, and y now contains 2t - 1 keys. Then, free z and recursively delete k from y.

3. If the key k is not present in internal node x, determine the root c_i[x] of the appropriate subtree that must contain k, if k is in the tree at all. If c_i[x] has only t - 1 keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least t keys. Then, finish by recursing on the appropriate child of x.

   a. If c_i[x] has only t - 1 keys but has an immediate sibling with at least t keys, give c_i[x] an extra key by moving a key from x down into c_i[x], moving a key from c_i[x]'s immediate left or right sibling up into x, and moving the appropriate child pointer from the sibling into c_i[x].

   b. If c_i[x] and both of c_i[x]'s immediate siblings have t - 1 keys, merge c_i[x] with one sibling, which involves moving a key from x down into the new merged node to become the median key for that node.

Since most of the keys in a B-tree are in the leaves, we may expect that in practice, deletion operations are most often used to delete keys from leaves.
The B-TREE-DELETE procedure then acts in one downward pass through the tree, without having to back up. When deleting a key in an internal node, however, the procedure makes a downward pass through the tree but may have to return to the node from which the key was deleted to replace the key with its predecessor or successor (cases 2a and 2b).

Although this procedure seems complicated, it involves only O(h) disk operations for a B-tree of height h, since only O(1) calls to DISK-READ and DISK-WRITE are made between recursive invocations of the procedure.
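As a concrete illustration of the single downward pass in case 2a, here is a minimal Python sketch; the BTreeNode class, with its keys, c, and leaf fields, is a hypothetical in-memory stand-in for the book's node representation, not code from the text. The predecessor k′ of a key is simply the largest key in the subtree rooted at the child that precedes it, reached by repeatedly following the last child pointer.

    class BTreeNode:
        """Hypothetical in-memory node: keys are kept sorted, c holds the children."""
        def __init__(self, keys, children=None):
            self.keys = keys
            self.c = children or []
            self.leaf = not self.c

    def find_predecessor(x, i):
        """Return the predecessor k' of x.keys[i]: the maximum key in the subtree
        rooted at the child that precedes keys[i], found in a single downward pass."""
        node = x.c[i]                 # the child y that precedes keys[i]
        while not node.leaf:
            node = node.c[-1]         # keep descending to the rightmost child
        return node.keys[-1]          # the largest key in that rightmost leaf

    # Example with t = 3: an internal node with keys G and M and three leaf children.
    root = BTreeNode(["G", "M"],
                     [BTreeNode(["A", "C"]), BTreeNode(["J", "K"]), BTreeNode(["N", "P"])])
    assert find_predecessor(root, 1) == "K"   # the predecessor of M

The same idea, mirrored to the leftmost leaf of the following child, yields the successor used in case 2b.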
The CPU time required is O(th) = O(t log_t n).

Exercises 18.3-1
Show the results of deleting C, P, and V, in order, from the tree of Figure 18.8(f).

Exercises 18.3-2
Write pseudocode for B-TREE-DELETE.

Problems 18-1: Stacks on secondary storage
Consider implementing a stack in a computer that has a relatively small amount of fast primary memory and a relatively large amount of slower disk storage.
The operations PUSH and POP are supported on single-word values. The stack we wish to support can grow to be much larger than can fit in memory, and thus most of it must be stored on disk.

A simple, but inefficient, stack implementation keeps the entire stack on disk. We maintain in memory a stack pointer, which is the disk address of the top element on the stack. If the pointer has value p, the top element is the (p mod m)th word on page ⌊p/m⌋ of the disk, where m is the number of words per page.

To implement the PUSH operation, we increment the stack pointer, read the appropriate page into memory from disk, copy the element to be pushed to the appropriate word on the page, and write the page back to disk. A POP operation is similar.
We decrement the stack pointer, read in the appropriate page from disk, and return the top of the stack. We need not write back the page, since it was not modified.

Because disk operations are relatively expensive, we count two costs for any implementation: the total number of disk accesses and the total CPU time. Any disk access to a page of m words incurs charges of one disk access and Θ(m) CPU time.

a. Asymptotically, what is the worst-case number of disk accesses for n stack operations using this simple implementation? What is the CPU time for n stack operations? (Express your answer in terms of m and n for this and subsequent parts.)

Now consider a stack implementation in which we keep one page of the stack in memory. (We also maintain a small amount of memory to keep track of which page is currently in memory.) We can perform a stack operation only if the relevant disk page resides in memory. If necessary, the page currently in memory can be written to the disk and the new page read in from the disk to memory.
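As a rough illustration of this one-page scheme, here is a minimal Python sketch; the class and its names are placeholders rather than anything from the text, and the disk is simulated by a dictionary of pages standing in for DISK-READ and DISK-WRITE.

    class OnePageStack:
        """One page of the stack is kept in memory; the rest lives on a simulated disk."""

        def __init__(self, m):
            self.m = m                     # words per page
            self.disk = {}                 # simulated disk: page number -> list of m words
            self.page = [None] * m         # the page currently held in memory
            self.page_no = 0               # which disk page that is
            self.p = -1                    # stack pointer: index of the top word (-1 = empty)
            self.disk_accesses = 0         # bookkeeping for the cost analysis

        def _ensure_page(self, page_no):
            """Bring the page containing the stack pointer into memory, writing the
            current page back and reading the new one only when they differ."""
            if page_no != self.page_no:
                self.disk[self.page_no] = self.page                   # one disk write
                self.page = self.disk.get(page_no, [None] * self.m)   # one disk read
                self.page_no = page_no
                self.disk_accesses += 2

        def push(self, x):
            self.p += 1
            self._ensure_page(self.p // self.m)   # page floor(p/m)
            self.page[self.p % self.m] = x        # word p mod m on that page

        def pop(self):
            assert self.p >= 0, "pop from an empty stack"
            self._ensure_page(self.p // self.m)
            x = self.page[self.p % self.m]
            self.p -= 1
            return x

    # Example: with m = 4, the fifth push crosses a page boundary and touches the disk.
    s = OnePageStack(4)
    for v in range(5):
        s.push(v)
    assert s.pop() == 4 and s.disk_accesses == 2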
If the relevant disk page is already in memory, then no disk accesses are required.

b. What is the worst-case number of disk accesses required for n PUSH operations? What is the CPU time?

c. What is the worst-case number of disk accesses required for n stack operations? What is the CPU time?

Suppose that we now implement the stack by keeping two pages in memory (in addition to a small number of words for bookkeeping).

d. Describe how to manage the stack pages so that the amortized number of disk accesses for any stack operation is O(1/m) and the amortized CPU time for any stack operation is O(1).

Problems 18-2: Joining and splitting 2-3-4 trees
The join operation takes two dynamic sets S′ and S″ and an element x such that for any x′ ∈ S′ and x″ ∈ S″, we have key[x′] < key[x] < key[x″].
It returns a set S = S′ ∪ {x} ∪ S″. The split operation is like an "inverse" join: given a dynamic set S and an element x ∈ S, it creates a set S′ consisting of all elements in S - {x} whose keys are less than key[x] and a set S″ consisting of all elements in S - {x} whose keys are greater than key[x]. In this problem, we investigate how to implement these operations on 2-3-4 trees. We assume for convenience that elements consist only of keys and that all key values are distinct.

a. Show how to maintain, for every node x of a 2-3-4 tree, the height of the subtree rooted at x as a field height[x].
Make sure that your implementation does not affect the asymptotic running times of searching, insertion, and deletion.

b. Show how to implement the join operation. Given two 2-3-4 trees T′ and T″ and a key k, the join should run in O(1 + |h′ - h″|) time, where h′ and h″ are the heights of T′ and T″, respectively.

c. Consider the path p from the root of a 2-3-4 tree T to a given key k, the set S′ of keys in T that are less than k, and the set S″ of keys in T that are greater than k.
Show that p breaks S′ into a set of trees T′_0, T′_1, ..., T′_m and a set of keys k′_1, k′_2, ..., k′_m, where, for i = 1, 2, ..., m, we have y < k′_i < z for any keys y ∈ T′_{i-1} and z ∈ T′_i. What is the relationship between the heights of T′_{i-1} and T′_i? Describe how p breaks S″ into sets of trees and keys.

d. Show how to implement the split operation on T. Use the join operation to assemble the keys in S′ into a single 2-3-4 tree T′ and the keys in S″ into a single 2-3-4 tree T″. The running time of the split operation should be O(lg n), where n is the number of keys in T.
(Hint: The costs for joining should telescope.)

Chapter notes

Knuth [185], Aho, Hopcroft, and Ullman [5], and Sedgewick [269] give further discussions of balanced-tree schemes and B-trees. Comer [66] provides a comprehensive survey of B-trees. Guibas and Sedgewick [135] discuss the relationships among various kinds of balanced-tree schemes, including red-black trees and 2-3-4 trees.

In 1970, J. E. Hopcroft invented 2-3 trees, a precursor to B-trees and 2-3-4 trees, in which every internal node has either two or three children. B-trees were introduced by Bayer and McCreight in 1972 [32]; they did not explain their choice of name.

Bender, Demaine, and Farach-Colton [37] studied how to make B-trees perform well in the presence of memory-hierarchy effects.
Their cache-oblivious algorithms work efficiently without explicitly knowing the data transfer sizes within the memory hierarchy.

Chapter 19: Binomial Heaps

Overview

This chapter and Chapter 20 present data structures known as mergeable heaps, which support the following five operations.

• MAKE-HEAP() creates and returns a new heap containing no elements.
• INSERT(H, x) inserts node x, whose key field has already been filled in, into heap H.
• MINIMUM(H) returns a pointer to the node in heap H whose key is minimum.
• EXTRACT-MIN(H) deletes the node from heap H whose key is minimum, returning a pointer to the node.
• UNION(H1, H2) creates and returns a new heap that contains all the nodes of heaps H1 and H2. Heaps H1 and H2 are "destroyed" by this operation.

In addition, the data structures in these chapters also support the following two operations.

• DECREASE-KEY(H, x, k) assigns to node x within heap H the new key value k, which is assumed to be no greater than its current key value.[1]
• DELETE(H, x) deletes node x from heap H.

As the table in Figure 19.1 shows, if we don't need the UNION operation, ordinary binary heaps, as used in heapsort (Chapter 6), work well. Operations other than UNION run in worst-case time O(lg n) (or better) on a binary heap. If the UNION operation must be supported, however, binary heaps perform poorly.
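To make the operation set concrete, here is a deliberately naive Python sketch of a mergeable heap built on the standard heapq module; the class and method names are placeholders, not code from the book, and UNION here takes linear time, whereas the binomial heaps developed in this chapter achieve O(lg n).

    import heapq

    class MergeableHeap:
        """A naive mergeable heap used only to illustrate the interface above."""

        def __init__(self):
            self._heap = []                      # MAKE-HEAP(): a heap with no elements

        def insert(self, key):
            heapq.heappush(self._heap, key)      # INSERT(H, x)

        def minimum(self):
            return self._heap[0]                 # MINIMUM(H)

        def extract_min(self):
            return heapq.heappop(self._heap)     # EXTRACT-MIN(H)

        def union(self, other):
            """UNION(H1, H2): build a new heap from both inputs, which are emptied
            ("destroyed"), mirroring the description above. Rebuilding costs Theta(n)
            here; a binomial heap performs the union in O(lg n)."""
            merged = MergeableHeap()
            merged._heap = self._heap + other._heap
            heapq.heapify(merged._heap)
            self._heap, other._heap = [], []
            return merged

    # Example usage.
    h1, h2 = MergeableHeap(), MergeableHeap()
    for k in (7, 3, 9):
        h1.insert(k)
    for k in (5, 1):
        h2.insert(k)
    h = h1.union(h2)
    assert h.extract_min() == 1 and h.minimum() == 3

DECREASE-KEY and DELETE are omitted from this sketch because they require a handle to a particular node, which a flat array does not provide; the binomial heaps of this chapter support both directly.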