Introduction to Algorithms (2nd edition), Thomas Cormen, Charles Leiserson, Ronald Rivest
Under these assumptions, you will show that the following randomized algorithm can be used to search the list in O(√n) expected time.

COMPACT-LIST-SEARCH(L, n, k)
 1  i ← head[L]
 2  while i ≠ NIL and key[i] < k
 3      do j ← RANDOM(1, n)
 4         if key[i] < key[j] and key[j] ≤ k
 5            then i ← j
 6                 if key[i] = k
 7                    then return i
 8         i ← next[i]
 9  if i = NIL or key[i] > k
10      then return NIL
11      else return i

If we ignore lines 3–7 of the procedure, we have an ordinary algorithm for searching a sorted linked list, in which index i points to each position of the list in turn. The search terminates once the index i "falls off" the end of the list or once key[i] ≥ k. In the latter case, if key[i] = k, clearly we have found a key with the value k. If, however, key[i] > k, then we will never find a key with the value k, and so terminating the search was the right thing to do.
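For concreteness, here is a minimal Python sketch of the procedure. It assumes the compact list is held in parallel arrays key and next_ (with None standing in for NIL) together with a head slot index, and it uses the 0-based random.randrange(n) in place of RANDOM(1, n); these representation choices are illustrative, not part of the text.

import random

def compact_list_search(key, next_, head, n, k):
    # Search a sorted, compact linked list for key k.  Slots 0..n-1 of
    # the parallel arrays key and next_ all hold list objects, so any
    # random slot index is a valid position to try skipping to.
    i = head
    while i is not None and key[i] < k:
        j = random.randrange(n)            # line 3: pick a random slot
        if key[i] < key[j] <= k:           # lines 4-5: skip ahead if it helps
            i = j
            if key[i] == k:                # lines 6-7: exact hit
                return i
        i = next_[i]                       # line 8: ordinary advance
    if i is None or key[i] > k:
        return None                        # k is not in the list
    return i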
Lines 3–7 attempt to skip ahead to a randomly chosen position j. Such a skip is beneficial if key[j] is larger than key[i] and no larger than k; in such a case, j marks a position in the list that i would have to reach during an ordinary list search. Because the list is compact, we know that any choice of j between 1 and n indexes some object in the list rather than a slot on the free list.

Instead of analyzing the performance of COMPACT-LIST-SEARCH directly, we shall analyze a related algorithm, COMPACT-LIST-SEARCH′, which executes two separate loops. This algorithm takes an additional parameter t, which determines an upper bound on the number of iterations of the first loop.

COMPACT-LIST-SEARCH′(L, n, k, t)
 1  i ← head[L]
 2  for q ← 1 to t
 3      do j ← RANDOM(1, n)
 4         if key[i] < key[j] and key[j] ≤ k
 5            then i ← j
 6                 if key[i] = k
 7                    then return i
 8  while i ≠ NIL and key[i] < k
 9      do i ← next[i]
10  if i = NIL or key[i] > k
11      then return NIL
12      else return i

To compare the execution of the algorithms COMPACT-LIST-SEARCH(L, n, k) and COMPACT-LIST-SEARCH′(L, n, k, t), assume that the sequence of integers returned by the calls of RANDOM(1, n) is the same for both algorithms.
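The two-loop variant translates the same way under the same assumed array representation. Choosing t close to √n balances the cost of the two loops, which is how part (g) below yields the O(√n) bound.

import random

def compact_list_search_prime(key, next_, head, n, k, t):
    # Two-loop variant: at most t random skips (never following next_),
    # then an ordinary scan from wherever the skips left off.
    # Like the pseudocode, this assumes the list is non-empty.
    i = head
    for _ in range(t):                     # lines 2-7: skip phase
        j = random.randrange(n)
        if key[i] < key[j] <= k:
            i = j
            if key[i] == k:
                return i
    while i is not None and key[i] < k:    # lines 8-9: scan phase
        i = next_[i]
    if i is None or key[i] > k:
        return None
    return i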
a. Suppose that COMPACT-LIST-SEARCH(L, n, k) takes t iterations of the while loop of lines 2–8. Argue that COMPACT-LIST-SEARCH′(L, n, k, t) returns the same answer and that the total number of iterations of both the for and while loops within COMPACT-LIST-SEARCH′ is at least t.

In the call COMPACT-LIST-SEARCH′(L, n, k, t), let Xt be the random variable that describes the distance in the linked list (that is, through the chain of next pointers) from position i to the desired key k after t iterations of the for loop of lines 2–7 have occurred.

b. Argue that the expected running time of COMPACT-LIST-SEARCH′(L, n, k, t) is O(t + E[Xt]).
c. Show that E[Xt] ≤ Σ_{r=1}^{n} (1 − r/n)^t. (Hint: Use equation (C.24).)

d. Show that Σ_{r=0}^{n−1} r^t ≤ n^{t+1}/(t + 1).

e. Prove that E[Xt] ≤ n/(t + 1).

f. Show that COMPACT-LIST-SEARCH′(L, n, k, t) runs in O(t + n/t) expected time.

g. Conclude that COMPACT-LIST-SEARCH runs in O(√n) expected time.

h. Why do we assume that all keys are distinct in COMPACT-LIST-SEARCH? Argue that random skips do not necessarily help asymptotically when the list contains repeated key values.

[1] Because we have defined a mergeable heap to support MINIMUM and EXTRACT-MIN, we can also refer to it as a mergeable min-heap. Alternatively, if it supported MAXIMUM and EXTRACT-MAX, it would be a mergeable max-heap.

Chapter notes

Aho, Hopcroft, and Ullman [6] and Knuth [182] are excellent references for elementary data structures.
Many other texts cover both basic data structures and their implementation in a particular programming language. Examples of these types of textbooks include Goodrich and Tamassia [128], Main [209], Shaffer [273], and Weiss [310, 312, 313]. Gonnet [126] provides experimental data on the performance of many data structure operations.

The origin of stacks and queues as data structures in computer science is unclear, since corresponding notions already existed in mathematics and paper-based business practices before the introduction of digital computers.
Knuth [182] cites A. M. Turing for the development of stacks for subroutine linkage in 1947.

Pointer-based data structures also seem to be a folk invention. According to Knuth, pointers were apparently used in early computers with drum memories. The A-1 language developed by G. M. Hopper in 1951 represented algebraic formulas as binary trees.
Knuth credits the IPL-II language, developed in 1956 by A. Newell, J. C. Shaw, and H. A. Simon, for recognizing the importance and promoting the use of pointers. Their IPL-III language, developed in 1957, included explicit stack operations.

Chapter 11: Hash Tables

Overview

Many applications require a dynamic set that supports only the dictionary operations INSERT, SEARCH, and DELETE. For example, a compiler for a computer language maintains a symbol table, in which the keys of elements are arbitrary character strings that correspond to identifiers in the language.
A hash table is an effective data structure for implementing dictionaries. Although searching for an element in a hash table can take as long as searching for an element in a linked list (Θ(n) time in the worst case), in practice hashing performs extremely well. Under reasonable assumptions, the expected time to search for an element in a hash table is O(1).

A hash table is a generalization of the simpler notion of an ordinary array.
Directly addressing into an ordinary array makes effective use of our ability to examine an arbitrary position in an array in O(1) time. Section 11.1 discusses direct addressing in more detail. Direct addressing is applicable when we can afford to allocate an array that has one position for every possible key.

When the number of keys actually stored is small relative to the total number of possible keys, hash tables become an effective alternative to directly addressing an array, since a hash table typically uses an array of size proportional to the number of keys actually stored. Instead of using the key as an array index directly, the array index is computed from the key. Section 11.2 presents the main ideas, focusing on "chaining" as a way to handle "collisions" in which more than one key maps to the same array index.
Section 11.3 describes how array indices can be computed from keys using hash functions. We present and analyze several variations on the basic theme. Section 11.4 looks at "open addressing," which is another way to deal with collisions. The bottom line is that hashing is an extremely effective and practical technique: the basic dictionary operations require only O(1) time on the average. Section 11.5 explains how "perfect hashing" can support searches in O(1) worst-case time, when the set of keys being stored is static (that is, when the set of keys never changes once stored).

11.1 Direct-address tables

Direct addressing is a simple technique that works well when the universe U of keys is reasonably small.
Suppose that an application needs a dynamic set in which each element has a key drawn from the universe U = {0, 1, ..., m - 1}, where m is not too large. We shall assume that no two elements have the same key.

To represent the dynamic set, we use an array, or direct-address table, denoted by T[0 ‥ m - 1], in which each position, or slot, corresponds to a key in the universe U. Figure 11.1 illustrates the approach; slot k points to an element in the set with key k. If the set contains no element with key k, then T[k] = NIL.

Figure 11.1: Implementing a dynamic set by a direct-address table T.
Each key in the universe U = {0, 1, ..., 9} corresponds to an index in the table. The set K = {2, 3, 5, 8} of actual keys determines the slots in the table that contain pointers to elements. The other slots, heavily shaded, contain NIL.

The dictionary operations are trivial to implement.

DIRECT-ADDRESS-SEARCH(T, k)
    return T[k]

DIRECT-ADDRESS-INSERT(T, x)
    T[key[x]] ← x

DIRECT-ADDRESS-DELETE(T, x)
    T[key[x]] ← NIL

Each of these operations is fast: only O(1) time is required.
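In Python, the three operations might be transcribed as follows; the class name and the assumption that each stored element exposes its key as x.key are illustrative choices rather than anything fixed by the text. As in the pseudocode, delete takes the object itself, not its key.

class DirectAddressTable:
    # Direct-address table T[0 .. m-1] over the key universe
    # U = {0, 1, ..., m-1}; empty slots hold None (NIL).
    def __init__(self, m):
        self.T = [None] * m

    def search(self, k):
        return self.T[k]          # O(1)

    def insert(self, x):
        self.T[x.key] = x         # O(1); assumes 0 <= x.key < m

    def delete(self, x):
        self.T[x.key] = None      # O(1); takes the object, not its key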
For some applications, the elements in the dynamic set can be stored in the direct-address table itself. That is, rather than storing an element's key and satellite data in an object external to the direct-address table, with a pointer from a slot in the table to the object, we can store the object in the slot itself, thus saving space. Moreover, it is often unnecessary to store the key field of the object, since if we have the index of an object in the table, we have its key. If keys are not stored, however, we must have some way to tell if the slot is empty.

Exercises 11.1-1

Suppose that a dynamic set S is represented by a direct-address table T of length m. Describe a procedure that finds the maximum element of S. What is the worst-case performance of your procedure?

Exercises 11.1-2

A bit vector is simply an array of bits (0s and 1s). A bit vector of length m takes much less space than an array of m pointers. Describe how to use a bit vector to represent a dynamic set of distinct elements with no satellite data.
Dictionary operations should run in O(1) time.

Exercises 11.1-3

Suggest how to implement a direct-address table in which the keys of stored elements do not need to be distinct and the elements can have satellite data. All three dictionary operations (INSERT, DELETE, and SEARCH) should run in O(1) time. (Don't forget that DELETE takes as an argument a pointer to an object to be deleted, not a key.)

Exercises 11.1-4: ⋆

We wish to implement a dictionary by using direct addressing on a huge array. At the start, the array entries may contain garbage, and initializing the entire array is impractical because of its size.