Introduction to Algorithms (2nd edition), Thomas Cormen, Charles Leiserson, Ronald Rivest
Chapter notes

Knuth [182] traces the origin of the O-notation to a number-theory text by P. Bachmann in 1892. The o-notation was invented by E. Landau in 1909 for his discussion of the distribution of prime numbers. The Ω and Θ notations were advocated by Knuth [186] to correct the popular, but technically sloppy, practice in the literature of using O-notation for both upper and lower bounds. Many people continue to use the O-notation where the Θ-notation is more technically precise. Further discussion of the history and development of asymptotic notations can be found in Knuth [182, 186] and Brassard and Bratley [46].

Not all authors define the asymptotic notations in the same way, although the various definitions agree in most common situations.
Some of the alternative definitions encompass functions that are not asymptotically nonnegative, as long as their absolute values are appropriately bounded.

Equation (3.19) is due to Robbins [260]. Other properties of elementary mathematical functions can be found in any good mathematical reference, such as Abramowitz and Stegun [1] or Zwillinger [320], or in a calculus book, such as Apostol [18] or Thomas and Finney [296]. Knuth [182] and Graham, Knuth, and Patashnik [132] contain a wealth of material on discrete mathematics as used in computer science.

Technicalities

In practice, we neglect certain technical details when we state and solve recurrences.
A good example of a detail that is often glossed over is the assumption of integer arguments to functions. Normally, the running time T(n) of an algorithm is only defined when n is an integer, since for most algorithms, the size of the input is always an integer. For example, the recurrence describing the worst-case running time of MERGE-SORT is really

(4.2)
    T(n) = Θ(1)                           if n = 1,
    T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + Θ(n)     if n > 1.

Boundary conditions represent another class of details that we typically ignore.
Since the running time of an algorithm on a constant-sized input is a constant, the recurrences that arise from the running times of algorithms generally have T(n) = Θ(1) for sufficiently small n. Consequently, for convenience, we shall generally omit statements of the boundary conditions of recurrences and assume that T(n) is constant for small n. For example, we normally state recurrence (4.1) as

(4.3)
    T(n) = 2T(n/2) + Θ(n)

without explicitly giving values for small n. The reason is that although changing the value of T(1) changes the solution to the recurrence, the solution typically doesn't change by more than a constant factor, so the order of growth is unchanged.

When we state and solve recurrences, we often omit floors, ceilings, and boundary conditions. We forge ahead without these details and later determine whether or not they matter. They usually don't, but it is important to know when they do. Experience helps, and so do some theorems stating that these details don't affect the asymptotic bounds of many recurrences encountered in the analysis of algorithms (see Theorem 4.1).
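As a quick numerical illustration of this point (a sketch, not from the text), we can evaluate the merge-sort-style recurrence with its floors and ceilings and compare it against the idealized closed form n lg n + n that solves the version without them. The boundary condition T(1) = 1 is an assumption made for the experiment:

```python
import math

def t_exact(n):
    # Recurrence with floor and ceiling, in the style of (4.2),
    # using n in place of Theta(n) and assuming T(1) = 1:
    #   T(n) = T(ceil(n/2)) + T(floor(n/2)) + n
    if n == 1:
        return 1
    return t_exact(math.ceil(n / 2)) + t_exact(n // 2) + n

# Compare against n*lg(n) + n, which solves T(n) = 2T(n/2) + n exactly
# when n is a power of 2. The ratio stays close to 1, suggesting the
# floors and ceilings do not affect the order of growth.
for n in [10, 100, 1000, 10000]:
    print(n, t_exact(n), round(t_exact(n) / (n * math.log2(n) + n), 3))
```

The ratios hover near 1 for all n, not just powers of 2, which is what Theorem 4.1 formalizes.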
In this chapter, however, we shall address some of these details to show the fine points of recurrence solution methods.

4.1 The substitution method

The substitution method for solving recurrences entails two steps:

1. Guess the form of the solution.
2. Use mathematical induction to find the constants and show that the solution works.

The name comes from the substitution of the guessed answer for the function when the inductive hypothesis is applied to smaller values. This method is powerful, but it obviously can be applied only in cases when it is easy to guess the form of the answer.

The substitution method can be used to establish either upper or lower bounds on a recurrence.
As an example, let us determine an upper bound on the recurrence

(4.4)
    T(n) = 2T(⌊n/2⌋) + n,

which is similar to recurrences (4.2) and (4.3). We guess that the solution is T(n) = O(n lg n). Our method is to prove that T(n) ≤ cn lg n for an appropriate choice of the constant c > 0. We start by assuming that this bound holds for ⌊n/2⌋, that is, that T(⌊n/2⌋) ≤ c⌊n/2⌋ lg(⌊n/2⌋). Substituting into the recurrence yields

    T(n) ≤ 2(c⌊n/2⌋ lg(⌊n/2⌋)) + n
         ≤ cn lg(n/2) + n
         = cn lg n - cn lg 2 + n
         = cn lg n - cn + n
         ≤ cn lg n,

where the last step holds as long as c ≥ 1.

Mathematical induction now requires us to show that our solution holds for the boundary conditions. Typically, we do so by showing that the boundary conditions are suitable as base cases for the inductive proof.
For the recurrence (4.4), we must show that we can choose the constant c large enough so that the bound T(n) ≤ cn lg n works for the boundary conditions as well. This requirement can sometimes lead to problems. Let us assume, for the sake of argument, that T(1) = 1 is the sole boundary condition of the recurrence. Then for n = 1, the bound T(n) ≤ cn lg n yields T(1) ≤ c · 1 · lg 1 = 0, which is at odds with T(1) = 1.
Consequently, the base case of our inductive proof fails to hold.

This difficulty in proving an inductive hypothesis for a specific boundary condition can be easily overcome. For example, in the recurrence (4.4), we take advantage of asymptotic notation only requiring us to prove T(n) ≤ cn lg n for n ≥ n0, where n0 is a constant of our choosing. The idea is to remove the difficult boundary condition T(1) = 1 from consideration in the inductive proof. Observe that for n > 3, the recurrence does not depend directly on T(1). Thus, we can replace T(1) by T(2) and T(3) as the base cases in the inductive proof, letting n0 = 2.
Note that we make a distinction between the base case of the recurrence (n = 1) and the base cases of the inductive proof (n = 2 and n = 3). We derive from the recurrence that T(2) = 4 and T(3) = 5. The inductive proof that T(n) ≤ cn lg n for some constant c ≥ 1 can now be completed by choosing c large enough so that T(2) ≤ c · 2 lg 2 and T(3) ≤ c · 3 lg 3.
As it turns out, any choice of c ≥ 2 suffices for the base cases of n = 2 and n = 3 to hold. For most of the recurrences we shall examine, it is straightforward to extend boundary conditions to make the inductive assumption work for small n.

Making a good guess

Unfortunately, there is no general way to guess the correct solutions to recurrences. Guessing a solution takes experience and, occasionally, creativity. Fortunately, though, there are some heuristics that can help you become a good guesser.
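One simple heuristic, sketched here as a supplement to the text, is to tabulate the recurrence numerically and see what function it tracks. For recurrence (4.4) with the assumed boundary condition T(1) = 1, a roughly constant ratio T(n)/(n lg n) supports the guess T(n) = Θ(n lg n):

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def t_44(n):
    # Recurrence (4.4): T(n) = 2T(floor(n/2)) + n,
    # with boundary condition T(1) = 1 (assumed, as in the text's example).
    if n == 1:
        return 1
    return 2 * t_44(n // 2) + n

# Tabulate T(n) / (n lg n); a bounded, slowly settling ratio
# is evidence for the guess T(n) = Theta(n lg n).
for n in [2, 8, 64, 512, 4096]:
    print(n, t_44(n), round(t_44(n) / (n * math.log2(n)), 3))

# The constant c = 2 from the discussion above really does work for n >= 2:
assert all(t_44(n) <= 2 * n * math.log2(n) for n in range(2, 5000))
```

Such a tabulation only suggests a guess; the substitution proof above is still what establishes the bound.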
You can also use recursion trees, which we shall see in Section 4.2, to generate good guesses.

If a recurrence is similar to one you have seen before, then guessing a similar solution is reasonable. As an example, consider the recurrence

    T(n) = 2T(⌊n/2⌋ + 17) + n,

which looks difficult because of the added "17" in the argument to T on the right-hand side. Intuitively, however, this additional term cannot substantially affect the solution to the recurrence. When n is large, the difference between T(⌊n/2⌋) and T(⌊n/2⌋ + 17) is not that large: both cut n nearly evenly in half.
Consequently, we make the guess that T(n) = O(n lg n), which you can verify as correct by using the substitution method (see Exercise 4.1-5).

Another way to make a good guess is to prove loose upper and lower bounds on the recurrence and then reduce the range of uncertainty. For example, we might start with a lower bound of T(n) = Ω(n) for the recurrence (4.4), since we have the term n in the recurrence, and we can prove an initial upper bound of T(n) = O(n²).
Then, we can gradually lower the upper bound and raise the lower bound until we converge on the correct, asymptotically tight solution of T(n) = Θ(n lg n).

Subtleties

There are times when you can correctly guess at an asymptotic bound on the solution of a recurrence, but somehow the math doesn't seem to work out in the induction. Usually, the problem is that the inductive assumption isn't strong enough to prove the detailed bound. When you hit such a snag, revising the guess by subtracting a lower-order term often permits the math to go through.

Consider the recurrence

    T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1.

We guess that the solution is O(n), and we try to show that T(n) ≤ cn for an appropriate choice of the constant c.
Substituting our guess in the recurrence, we obtain

    T(n) ≤ c⌊n/2⌋ + c⌈n/2⌉ + 1
         = cn + 1,

which does not imply T(n) ≤ cn for any choice of c. It's tempting to try a larger guess, say T(n) = O(n²), which can be made to work, but in fact, our guess that the solution is T(n) = O(n) is correct.
In order to show this, however, we must make a stronger inductive hypothesis.

Intuitively, our guess is nearly right: we're only off by the constant 1, a lower-order term. Nevertheless, mathematical induction doesn't work unless we prove the exact form of the inductive hypothesis. We overcome our difficulty by subtracting a lower-order term from our previous guess. Our new guess is T(n) ≤ cn - b, where b ≥ 0 is constant. We now have

    T(n) ≤ (c⌊n/2⌋ - b) + (c⌈n/2⌉ - b) + 1
         = cn - 2b + 1
         ≤ cn - b,

as long as b ≥ 1. As before, the constant c must be chosen large enough to handle the boundary conditions.

Most people find the idea of subtracting a lower-order term counterintuitive. After all, if the math doesn't work out, shouldn't we be increasing our guess? The key to understanding this step is to remember that we are using mathematical induction: we can prove something stronger for a given value by assuming something stronger for smaller values.

Avoiding pitfalls

It is easy to err in the use of asymptotic notation.
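Returning to the recurrence from the Subtleties discussion, the subtracted-term bound can be checked numerically. With the assumed boundary condition T(1) = 1 (the text does not fix one for this recurrence), the bound cn - b holds with c = 2 and b = 1, and in fact holds with equality:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def t_ones(n):
    # The Subtleties recurrence: T(n) = T(floor(n/2)) + T(ceil(n/2)) + 1,
    # with boundary condition T(1) = 1 (an assumption for this sketch).
    if n == 1:
        return 1
    return t_ones(n // 2) + t_ones((n + 1) // 2) + 1

# The strengthened hypothesis T(n) <= c*n - b succeeds with c = 2, b = 1.
# For this boundary condition the recurrence solves exactly to 2n - 1,
# so the guess T(n) = O(n) was indeed correct, as claimed.
for n in range(1, 2000):
    assert t_ones(n) <= 2 * n - 1   # the bound c*n - b with c = 2, b = 1
    assert t_ones(n) == 2 * n - 1   # in fact, equality
```

Note that the naive guess T(n) ≤ cn also holds numerically; the induction failed not because the bound is false, but because the hypothesis T(k) ≤ ck is too weak to push through the +1 term.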