Introduction to Algorithms. (2nd edition) Thomas Cormen_ Charles Leiserson_ Ronad Rivest (811417), страница 67
Текст из файла (страница 67)
Intuitively, b[i, j] points to the table entry corresponding to the optimal subproblemsolution chosen when computing c[i, j]. The procedure returns the b and c tables; c[m, n]contains the length of an LCS of X and Y.LCS-LENGTH(X, Y)1 m ← length[X]2 n ← length[Y]3 for i ← 1 to m4do c[i, 0] ←5 for j ← 0 to n6do c[0, j] ←7 for i ← 1 to m8do for j ← 19do if1011121314151617 return c and b00to nxi = yjthen c[i, j] ← c[i - 1, j - 1] + 1b[i, j] ← " "else if c[i - 1, j] ≥ c[i, j - 1]then c[i, j] ← c[i - 1, j]b[i, j] ← "↑"else c[i, j] ← c[i, j - 1]b[i, j] ← ←Figure 15.6 shows the tables produced by LCS-LENGTH on the sequences X = A, B, C, B,D, A, B and Y = B, D, C, A, B, A .
The running time of the procedure is O(mn), sinceeach table entry takes O(1) time to compute.Figure 15.6: The c and b tables computed by LCS-LENGTH on the sequences X = A, B, C,B, D, A, B and Y = B, D, C, A, B, A . The square in row i and column j contains the valueof c[i, j] and the appropriate arrow for the value of b[i, j]. The entry 4 in c[7, 6]—the lowerright-hand corner of the table—is the length of an LCS B, C, B, A of X and Y . For i, j > 0,entry c[i, j] depends only on whether xi = yj and the values in entries c[i - 1, j], c[i, j - 1], andc[i - 1, j - 1], which are computed before c[i, j]. To reconstruct the elements of an LCS, followthe b[i, j] arrows from the lower right-hand corner; the path is shaded. Each " " on the pathcorresponds to an entry (highlighted) for which xi = yj is a member of an LCS.Step 4: Constructing an LCSThe b table returned by LCS-LENGTH can be used to quickly construct an LCS of X = x1,x2, ..., xm and Y = y1, y2, ..., yn .
We simply begin at b[m, n] and trace through the tablefollowing the arrows. Whenever we encounter a " " in entry b[i, j], it implies that xi = yj is anelement of the LCS. The elements of the LCS are encountered in reverse order by thismethod. The following recursive procedure prints out an LCS of X and Y in the proper,forward order. The initial invocation is PRINT-LCS(b, X, length[X], length[Y]).PRINT-LCS(b, X, i, j)1 if i = 0 or j = 02then return3 if b[i, j] = " "4then PRINT-LCS(b, X, i - 1, j - 1)5print xi6 elseif b[i, j] = "↑"7then PRINT-LCS(b, X, i - 1, j)8 else PRINT-LCS(b, X, i, j - 1)For the b table in Figure 15.6, this procedure prints "BCBA." The procedure takes time O(m +n), since at least one of i and j is decremented in each stage of the recursion.Improving the codeOnce you have developed an algorithm, you will often find that you can improve on the timeor space it uses. This is especially true of straightforward dynamic-programming algorithms.Some changes can simplify the code and improve constant factors but otherwise yield noasymptotic improvement in performance.
Others can yield substantial asymptotic savings intime and space.For example, we can eliminate the b table altogether. Each c[i, j] entry depends on only threeother c table entries: c[i - 1, j - 1], c[i - 1, j], and c[i, j - 1]. Given the value of c[i, j], we candetermine in O(1) time which of these three values was used to compute c[i, j], withoutinspecting table b. Thus, we can reconstruct an LCS in O(m + n) time using a proceduresimilar to PRINT-LCS. (Exercise 15.4-2 asks you to give the pseudocode.) Although we saveΘ(mn) space by this method, the auxiliary space requirement for computing an LCS does notasymptotically decrease, since we need Θ(mn) space for the c table anyway.We can, however, reduce the asymptotic space requirements for LCS-LENGTH, since itneeds only two rows of table c at a time: the row being computed and the previous row.
(Infact, we can use only slightly more than the space for one row of c to compute the length of anLCS. See Exercise 15.4-4.) This improvement works if we need only the length of an LCS; ifwe need to reconstruct the elements of an LCS, the smaller table does not keep enoughinformation to retrace our steps in O(m + n) time.Exercises 15.4-1Determine an LCS of1, 0, 0, 1, 0, 1, 0, 1and0, 1, 0, 1, 1, 0, 1, 1, 0 .Exercises 15.4-2Show how to reconstruct an LCS from the completed c table and the original sequences X =x1, x2, ..., xm and Y = y1, y2, ..., yn in O(m +n) time, without using the b table.Exercises 15.4-3Give a memoized version of LCS-LENGTH that runs in O(mn) time.Exercises 15.4-4Show how to compute the length of an LCS using only 2 · min(m, n) entries in the c table plusO(1) additional space.
Then show how to do this using min(m, n) entries plus O(1) additionalspace.Exercises 15.4-5Give an O(n2)-time algorithm to find the longest monotonically increasing subsequence of asequence of n numbers.Exercises 15.4-6: ⋆Give an O(n lg n)-time algorithm to find the longest monotonically increasing sub-sequenceof a sequence of n numbers.
(Hint: Observe that the last element of a candidate subsequenceof length i is at least as large as the last element of a candidate subsequence of length i - 1.Maintain candidate subsequences by linking them through the input sequence.)15.5 Optimal binary search treesSuppose that we are designing a program to translate text from English to French. For eachoccurrence of each English word in the text, we need to look up its French equivalent.
Oneway to perform these lookup operations is to build a binary search tree with n English wordsas keys and French equivalents as satellite data. Because we will search the tree for eachindividual word in the text, we want the total time spent searching to be as low as possible.We could ensure an O(lg n) search time per occurrence by using a red-black tree or any otherbalanced binary search tree.
Words appear with different frequencies, however, and it may bethe case that a frequently used word such as "the" appears far from the root while a rarelyused word such as "mycophagist" appears near the root. Such an organization would slowdown the translation, since the number of nodes visited when searching for a key in a binarysearch tree is one plus the depth of the node containing the key. We want words that occurfrequently in the text to be placed nearer the root.[5] Moreover, there may be words in the textfor which there is no French translation, and such words might not appear in the binary searchtree at all. How do we organize a binary search tree so as to minimize the number of nodesvisited in all searches, given that we know how often each word occurs?What we need is known as an optimal binary search tree.
Formally, we are given a sequenceK = k1, k2, ..., kn of n distinct keys in sorted order (so that k1 < k2 < ··· < kn), and we wish tobuild a binary search tree from these keys. For each key ki, we have a probability pi that asearch will be for ki. Some searches may be for values not in K, and so we also have n + 1"dummy keys" d0, d1, d2, ..., dn representing values not in K. In particular, d0 represents allvalues less than k1, dn represents all values greater than kn, and for i = 1, 2, ..., n -1, thedummy key di represents all values between ki and ki+1. For each dummy key di, we have aprobability qi that a search will correspond to di.
Figure 15.7 shows two binary search treesfor a set of n = 5 keys. Each key ki is an internal node, and each dummy key di is a leaf. Everysearch is either successful (finding some key ki) or unsuccessful (finding some dummy keydi), and so we haveFigure 15.7: Two Binary Search Trees for a Set of n = 5 Keys with the FollowingProbabilities:i 0123450.15 0.10 0.05 0.10 0.20piqi 0.05 0.10 0.05 0.05 0.05 0.10(a) A binary search tree with expected search cost 2.80. (b) A binary search tree withexpected search cost 2.75.
This tree is optimal.(15.15)Because we have probabilities of searches for each key and each dummy key, we candetermine the expected cost of a search in a given binary search tree T. Let us assume that theactual cost of a search is the number of nodes examined, i.e., the depth of the node found bythe search in T, plus 1. Then the expected cost of a search in T is(15.16)where depthT denotes a node's depth in the tree T. The last equality follows from equation(15.15). In Figure 15.7(a), we can calculate the expected search cost node by node:node depth probability contributionk1k2k3k4k5d0d110212220.150.100.050.100.200.050.100.300.100.150.200.600.150.30node depth probability contributiond2d3d4d5Total33330.050.050.050.100.200.200.200.402.80For a given set of probabilities, our goal is to construct a binary search tree whose expectedsearch cost is smallest.
We call such a tree an optimal binary search tree. Figure 15.7(b)shows an optimal binary search tree for the probabilities given in the figure caption; itsexpected cost is 2.75. This example shows that an optimal binary search tree is not necessarilya tree whose overall height is smallest. Nor can we necessarily construct an optimal binarysearch tree by always putting the key with the greatest probability at the root.
Here, key k5 hasthe greatest search probability of any key, yet the root of the optimal binary search tree shownis k2. (The lowest expected cost of any binary search tree with k5 at the root is 2.85.)As with matrix-chain multiplication, exhaustive checking of all possibilities fails to yield anefficient algorithm. We can label the nodes of any n-node binary tree with the keys k1, k2, ...,kn to construct a binary search tree, and then add in the dummy keys as leaves.
In Problem 124, we saw that the number of binary trees with n nodes is Ω(4n/n3/2), and so there are anexponential number of binary search trees that we would have to examine in an exhaustivesearch. Not surprisingly, we will solve this problem with dynamic programming.Step 1: The structure of an optimal binary search treeTo characterize the optimal substructure of optimal binary search trees, we start with anobservation about subtrees. Consider any subtree of a binary search tree. It must contain keysin a contiguous range ki, ..., kj, for some 1 ≤ i ≤ j ≤ n. In addition, a subtree that contains keyski, ..., kj must also have as its leaves the dummy keys di-1, ..., dj.Now we can state the optimal substructure: if an optimal binary search tree T has a subtree T′containing keys ki, ..., kj, then this subtree T′ must be optimal as well for the subproblem withkeys ki, ..., kj and dummy keys di-1, ..., dj.