Although optimization techniques incorporating elements of dynamic programming were known earlier, Bellman provided the area with a solid mathematical basis [34]. Hu and Shing [159, 160] give an O(n lg n)-time algorithm for the matrix-chain multiplication problem.

The O(mn)-time algorithm for the longest-common-subsequence problem appears to be a folk algorithm. Knuth [63] posed the question of whether subquadratic algorithms for the LCS problem exist. Masek and Paterson [212] answered this question in the affirmative by giving an algorithm that runs in O(mn/lg n) time, where n ≤ m and the sequences are drawn from a set of bounded size. For the special case in which no element appears more than once in an input sequence, Szymanski [288] shows that the problem can be solved in O((n + m) lg(n + m)) time.
Many of these results extend to the problem of computing string edit distances (Problem 15-3).

An early paper on variable-length binary encodings by Gilbert and Moore [114] had applications to constructing optimal binary search trees for the case in which all probabilities pi are 0; this paper contains an O(n^3)-time algorithm. Aho, Hopcroft, and Ullman [5] present the algorithm from Section 15.5.
Exercise 15.5-4 is due to Knuth [184]. Hu and Tucker [161] devised an algorithm for the case in which all probabilities pi are 0 that uses O(n^2) time and O(n) space; subsequently, Knuth [185] reduced the time to O(n lg n).

Chapter 16: Greedy Algorithms

Overview

Algorithms for optimization problems typically go through a sequence of steps, with a set of choices at each step.
For many optimization problems, using dynamic programming to determine the best choices is overkill; simpler, more efficient algorithms will do. A greedy algorithm always makes the choice that looks best at the moment. That is, it makes a locally optimal choice in the hope that this choice will lead to a globally optimal solution. This chapter explores optimization problems that are solvable by greedy algorithms.
Before reading this chapter, you should read about dynamic programming in Chapter 15, particularly Section 15.3.

Greedy algorithms do not always yield optimal solutions, but for many problems they do. We shall first examine in Section 16.1 a simple but nontrivial problem, the activity-selection problem, for which a greedy algorithm efficiently computes a solution. We shall arrive at the greedy algorithm by first considering a dynamic-programming solution and then showing that we can always make greedy choices to arrive at an optimal solution. Section 16.2 reviews the basic elements of the greedy approach, giving a more direct approach to proving greedy algorithms correct than the dynamic-programming-based process of Section 16.1. Section 16.3 presents an important application of greedy techniques: the design of data-compression (Huffman) codes. In Section 16.4, we investigate some of the theory underlying combinatorial structures called "matroids" for which a greedy algorithm always produces an optimal solution.
Finally, Section 16.5 illustrates the application of matroids using a problem of scheduling unit-time tasks with deadlines and penalties.

The greedy method is quite powerful and works well for a wide range of problems. Later chapters will present many algorithms that can be viewed as applications of the greedy method, including minimum-spanning-tree algorithms (Chapter 23), Dijkstra's algorithm for shortest paths from a single source (Chapter 24), and Chvátal's greedy set-covering heuristic (Chapter 35).
Minimum-spanning-tree algorithms are a classic example of the greedy method. Although this chapter and Chapter 23 can be read independently of each other, you may find it useful to read them together.

16.1 An activity-selection problem

Our first example is the problem of scheduling several competing activities that require exclusive use of a common resource, with a goal of selecting a maximum-size set of mutually compatible activities. Suppose we have a set S = {a1, a2, ..., an} of n proposed activities that wish to use a resource, such as a lecture hall, which can be used by only one activity at a time. Each activity ai has a start time si and a finish time fi, where 0 ≤ si < fi < ∞. If selected, activity ai takes place during the half-open time interval [si, fi).
Activities ai and aj are compatible if the intervals [si, fi) and [sj, fj) do not overlap (i.e., ai and aj are compatible if si ≥ fj or sj ≥ fi). The activity-selection problem is to select a maximum-size subset of mutually compatible activities. For example, consider the following set S of activities, which we have sorted in monotonically increasing order of finish time:

i    1  2  3  4  5  6   7   8   9  10  11
si   1  3  0  5  3  5   6   8   8   2  12
fi   4  5  6  7  8  9  10  11  12  13  14

(We shall see shortly why it is advantageous to consider activities in sorted order.) For this example, the subset {a3, a9, a11} consists of mutually compatible activities. It is not a maximum subset, however, since the subset {a1, a4, a8, a11} is larger. In fact, {a1, a4, a8, a11} is a largest subset of mutually compatible activities; another largest subset is {a2, a4, a9, a11}.
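The compatibility test translates directly into code. The following Python sketch (the names s, f, compatible, and mutually_compatible are ours, for illustration only) encodes the table above and confirms the subsets just discussed:

    from itertools import combinations

    # Start and finish times of the 11 example activities; index 0 is
    # unused padding so that indices match the table (activities 1..11).
    s = [None, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
    f = [None, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

    def compatible(i, j):
        """Activities ai and aj are compatible iff si >= fj or sj >= fi."""
        return s[i] >= f[j] or s[j] >= f[i]

    def mutually_compatible(subset):
        """True iff every pair of activities in the subset is compatible."""
        return all(compatible(i, j) for i, j in combinations(subset, 2))

    print(mutually_compatible([3, 9, 11]))     # True (3 activities)
    print(mutually_compatible([1, 4, 8, 11]))  # True (4 activities, a largest subset)
    print(mutually_compatible([2, 4, 9, 11]))  # True (another largest subset)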
We shall solve this problem in several steps. We start by formulating a dynamic-programming solution to this problem in which we combine optimal solutions to two subproblems to form an optimal solution to the original problem. We consider several choices when determining which subproblems to use in an optimal solution. We shall then observe that we need only consider one choice—the greedy choice—and that when we make the greedy choice, one of the subproblems is guaranteed to be empty, so that only one nonempty subproblem remains. Based on these observations, we shall develop a recursive greedy algorithm to solve the activity-scheduling problem.
We shall complete the process of developing a greedy solution by converting the recursive algorithm to an iterative one. Although the steps we shall go through in this section are more involved than is typical for the development of a greedy algorithm, they illustrate the relationship of greedy algorithms and dynamic programming.

The optimal substructure of the activity-selection problem

As mentioned above, we start by developing a dynamic-programming solution to the activity-selection problem. As in Chapter 15, our first step is to find the optimal substructure and then use it to construct an optimal solution to the problem from optimal solutions to subproblems. We saw in Chapter 15 that we need to define an appropriate space of subproblems.
Let us start by defining sets

Sij = {ak ∈ S : fi ≤ sk < fk ≤ sj},

so that Sij is the subset of activities in S that can start after activity ai finishes and finish before activity aj starts. In fact, Sij consists of all activities that are compatible with ai and aj and are also compatible with all activities that finish no later than ai finishes and all activities that start no earlier than aj starts. In order to represent the entire problem, we add fictitious activities a0 and an+1 and adopt the conventions that f0 = 0 and sn+1 = ∞. Then S = S0,n+1, and the ranges for i and j are given by 0 ≤ i, j ≤ n + 1.

We can further restrict the ranges of i and j as follows.
Let us assume that the activities are sorted in monotonically increasing order of finish time:

f0 ≤ f1 ≤ f2 ≤ ··· ≤ fn < fn+1.    (16.1)

We claim that Sij = Ø whenever i ≥ j. Why? Suppose that there exists an activity ak ∈ Sij for some i ≥ j, so that ai follows aj in the sorted order. Then we would have fi ≤ sk < fk ≤ sj < fj. Thus, fi < fj, which contradicts our assumption that ai follows aj in the sorted order.
We can conclude that, assuming that we have sorted the activities in monotonically increasing order of finish time, our space of subproblems is to select a maximum-size subset of mutually compatible activities from Sij, for 0 ≤ i < j ≤ n + 1, knowing that all other Sij are empty.

To see the substructure of the activity-selection problem, consider some nonempty subproblem Sij,[1] and suppose that a solution to Sij includes some activity ak, so that fi ≤ sk < fk ≤ sj. Using activity ak generates two subproblems, Sik (activities that start after ai finishes and finish before ak starts) and Skj (activities that start after ak finishes and finish before aj starts), each of which consists of a subset of the activities in Sij.
Our solution to Sij is the union of the solutions to Sik and Skj, along with the activity ak. Thus, the number of activities in our solution to Sij is the size of our solution to Sik, plus the size of our solution to Skj, plus one (for ak).

The optimal substructure of this problem is as follows. Suppose now that an optimal solution Aij to Sij includes activity ak.
Then the solutions Aik to Sik and Akj to Skj used within this optimal solution to Sij must be optimal as well. The usual cut-and-paste argument applies. If we had a solution A′ik to Sik that included more activities than Aik, we could cut out Aik from Aij and paste in A′ik, thus producing another solution to Sij with more activities than Aij. Because we assumed that Aij is an optimal solution, we have derived a contradiction. Similarly, if we had a solution A′kj to Skj with more activities than Akj, we could replace Akj by A′kj to produce a solution to Sij with more activities than Aij.

Now we use our optimal substructure to show that we can construct an optimal solution to the problem from optimal solutions to subproblems.
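To make the substructure concrete, here is a minimal bottom-up Python sketch of the dynamic program it implies, under the following assumption drawn from the argument above: c[i][j], the size of a maximum-size subset of mutually compatible activities in Sij, is the maximum over all ak in Sij of c[i][k] + c[k][j] + 1, and is 0 when Sij is empty. The array name c and the loop structure are ours, not the book's:

    import math

    # The example activities, extended with the fictitious a0 and a12
    # (= a_{n+1}), so that f[0] = 0 and s[12] = infinity.
    s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12, math.inf]
    f = [0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, math.inf]
    n = 11

    # c[i][j] = size of a maximum-size subset of mutually compatible
    # activities in Sij; subproblems are solved in order of increasing j - i.
    c = [[0] * (n + 2) for _ in range(n + 2)]
    for length in range(2, n + 2):
        for i in range(0, n + 2 - length):
            j = i + length
            for k in range(i + 1, j):
                # ak belongs to Sij iff fi <= sk and fk <= sj
                if f[i] <= s[k] and f[k] <= s[j]:
                    c[i][j] = max(c[i][j], c[i][k] + c[k][j] + 1)

    print(c[0][n + 1])  # prints 4, the size of the largest subsets found above

Note that this sketch tries every possible activity ak for each subproblem; it is exactly this exhaustive choice that the greedy insight promised earlier in the section will eliminate.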