Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest (811417), страница 32
Текст из файла (страница 32)
A lower bound on the heights of all decisiontrees in which each permutation appears as a reachable leaf is therefore a lower bound on therunning time of any comparison sort algorithm. The following theorem establishes such alower bound.Theorem 8.1Any comparison sort algorithm requires Ω(n lg n) comparisons in the worst case.Proof From the preceding discussion, it suffices to determine the height of a decision tree inwhich each permutation appears as a reachable leaf. Consider a decision tree of height h with lreachable leaves corresponding to a comparison sort on n elements. Because each of the n!permutations of the input appears as some leaf, we have n! ≤ l.
Since a binary tree of height hhas no more than 2h leaves, we haven! ≤ l 2h,which, by taking logarithms, impliesh ≤ lg(n!)(since the lg function is monotonicallyincreasing)= Ω(n lg n) (by equation (3.18)).Corollary 8.2Heapsort and merge sort are asymptotically optimal comparison sorts.Proof The O(n lg n) upper bounds on the running times for heapsort and merge sort match theΩ(n lg n) worst-case lower bound from Theorem 8.1.Exercises 8.1-1What is the smallest possible depth of a leaf in a decision tree for a comparison sort?Exercises 8.1-2Obtain asymptotically tight bounds on lg(n!) without using Stirling's approximation.
Instead,evaluate the summationusing techniques from Section A.2.Exercises 8.1-3Show that there is no comparison sort whose running time is linear for at least half of the n!inputs of length n. What about a fraction of 1/n of the inputs of length n? What about afraction 1/2n?Exercises 8.1-4You are given a sequence of n elements to sort.
The input sequence consists of n/ksubsequences, each containing k elements. The elements in a given subsequence are allsmaller than the elements in the succeeding subsequence and larger than the elements in thepreceding subsequence. Thus, all that is needed to sort the whole sequence of length n is tosort the k elements in each of the n/k subsequences. Show an Ω(n lg k) lower bound on thenumber of comparisons needed to solve this variant of the sorting problem. (Hint: It is notrigorous to simply combine the lower bounds for the individual subsequences.)8.2 Counting sortCounting sort assumes that each of the n input elements is an integer in the range 0 to k, forsome integer k.
When k = O(n), the sort runs in Θ(n) time.The basic idea of counting sort is to determine, for each input element x, the number ofelements less than x. This information can be used to place element x directly into its positionin the output array. For example, if there are 17 elements less than x, then x belongs in outputposition 18. This scheme must be modified slightly to handle the situation in which severalelements have the same value, since we don't want to put them all in the same position.In the code for counting sort, we assume that the input is an array A[1 n], and thuslength[A] = n.
We require two other arrays: the array B[1 n] holds the sorted output, and thearray C[0 k] provides temporary working storage.COUNTING-SORT(A, B, k)1 for i ← 0 to k2do C[i] ← 03 for j ← 1 to length[A]4do C[A[j]] ← C[A[j]] + 1567891011▹ C[i] now contains the number of elements equal to i.for i ← 1 to kdo C[i] ← C[i] + C[i - 1]▹ C[i] now contains the number of elements less than or equal to i.for j ← length[A] downto 1do B[C[A[j]]] ← A[j]C[A[j]] ← C[A[j]] - 1Figure 8.2 illustrates counting sort. After the initialization in the for loop of lines 1-2, weinspect each input element in the for loop of lines 3-4. If the value of an input element is i, weincrement C[i].
Thus, after line 4, C[i] holds the number of input elements equal to i for eachinteger i = 0, 1, . . .,k. In lines 6-7, we determine for each i = 0, 1, . . .,k, how many inputelements are less than or equal to i by keeping a running sum of the array C.Figure 8.2: The operation of COUNTING-SORT on an input array A[1 8], where eachelement of A is a nonnegative integer no larger than k = 5. (a) The array A and the auxiliaryarray C after line 4. (b) The array C after line 7. (c)-(e) The output array B and the auxiliaryarray C after one, two, and three iterations of the loop in lines 9-11, respectively. Only thelightly shaded elements of array B have been filled in.
(f) The final sorted output array B.Finally, in the for loop of lines 9-11, we place each element A[j] in its correct sorted positionin the output array B. If all n elements are distinct, then when we first enter line 9, for eachA[j], the value C[A[j]] is the correct final position of A[j] in the output array, since there areC[A[j]] elements less than or equal to A[j]. Because the elements might not be distinct, wedecrement C[A[j]] each time we place a value A[j] into the B array.
Decrementing C[A[j]]causes the next input element with a value equal to A[j], if one exists, to go to the positionimmediately before A[j] in the output array.How much time does counting sort require? The for loop of lines 1-2 takes time Θ(k), the forloop of lines 3-4 takes time Θ(n), the for loop of lines 6-7 takes time Θ(k), and the for loop oflines 9-11 takes time Θ(n). Thus, the overall time is Θ(k+n). In practice, we usually usecounting sort when we have k = O(n), in which case the running time is Θ(n).Counting sort beats the lower bound of Ω(n lg n) proved in Section 8.1 because it is not acomparison sort. In fact, no comparisons between input elements occur anywhere in the code.Instead, counting sort uses the actual values of the elements to index into an array.
The Θ(n lgn) lower bound for sorting does not apply when we depart from the comparison-sort model.An important property of counting sort is that it is stable: numbers with the same value appearin the output array in the same order as they do in the input array. That is, ties between twonumbers are broken by the rule that whichever number appears first in the input array appearsfirst in the output array.
Normally, the property of stability is important only when satellitedata are carried around with the element being sorted. Counting sort's stability is important foranother reason: counting sort is often used as a subroutine in radix sort. As we shall see in thenext section, counting sort's stability is crucial to radix sort's correctness.Exercises 8.2-1Using Figure 8.2 as a model, illustrate the operation of COUNTING-SORT on the array A =6, 0, 2, 0, 1, 3, 4, 6, 1, 3, 2 .Exercises 8.2-2Prove that COUNTING-SORT is stable.Exercises 8.2-3Suppose that the for loop header in line 9 of the COUNTING-SORT procedure is rewritten as9 for j ← 1 to length[A]Show that the algorithm still works properly.
Is the modified algorithm stable?Exercises 8.2-4Describe an algorithm that, given n integers in the range 0 to k, preprocesses its input and thenanswers any query about how many of the n integers fall into a range [a b] in O(1) time.Your algorithm should use Θ(n + k) preprocessing time.8.3 Radix sortRadix sort is the algorithm used by the card-sorting machines you now find only in computermuseums. The cards are organized into 80 columns, and in each column a hole can bepunched in one of 12 places. The sorter can be mechanically "programmed" to examine agiven column of each card in a deck and distribute the card into one of 12 bins depending onwhich place has been punched.
An operator can then gather the cards bin by bin, so that cardswith the first place punched are on top of cards with the second place punched, and so on.For decimal digits, only 10 places are used in each column. (The other two places are used forencoding nonnumeric characters.) A d-digit number would then occupy a field of d columns.Since the card sorter can look at only one column at a time, the problem of sorting n cards ona d-digit number requires a sorting algorithm.Intuitively, one might want to sort numbers on their most significant digit, sort each of theresulting bins recursively, and then combine the decks in order. Unfortunately, since the cardsin 9 of the 10 bins must be put aside to sort each of the bins, this procedure generates manyintermediate piles of cards that must be kept track of.
(See Exercise 8.3-5.)Radix sort solves the problem of card sorting counterintuitively by sorting on the leastsignificant digit first. The cards are then combined into a single deck, with the cards in the 0bin preceding the cards in the 1 bin preceding the cards in the 2 bin, and so on. Then the entiredeck is sorted again on the second-least significant digit and recombined in a like manner.The process continues until the cards have been sorted on all d digits. Remarkably, at thatpoint the cards are fully sorted on the d-digit number. Thus, only d passes through the deckare required to sort. Figure 8.3 shows how radix sort operates on a "deck" of seven 3-digitnumbers.Figure 8.3: The operation of radix sort on a list of seven 3-digit numbers.
The leftmostcolumn is the input. The remaining columns show the list after successive sorts onincreasingly significant digit positions. Shading indicates the digit position sorted on toproduce each list from the previous one.It is essential that the digit sorts in this algorithm be stable. The sort performed by a cardsorter is stable, but the operator has to be wary about not changing the order of the cards asthey come out of a bin, even though all the cards in a bin have the same digit in the chosencolumn.In a typical computer, which is a sequential random-access machine, radix sort is sometimesused to sort records of information that are keyed by multiple fields.
For example, we mightwish to sort dates by three keys: year, month, and day. We could run a sorting algorithm witha comparison function that, given two dates, compares years, and if there is a tie, comparesmonths, and if another tie occurs, compares days. Alternatively, we could sort the informationthree times with a stable sort: first on day, next on month, and finally on year.The code for radix sort is straightforward. The following procedure assumes that each elementin the n-element array A has d digits, where digit 1 is the lowest-order digit and digit d is thehighest-order digit.RADIX-SORT(A, d)1 for i ← 1 to d2do use a stable sort to sort array A on digit iLemma 8.3Given n d-digit numbers in which each digit can take on up to k possible values, RADIXSORT correctly sorts these numbers in Θ(d(n + k)) time.Proof The correctness of radix sort follows by induction on the column being sorted (seeExercise 8.3-3).