1610906232-7ba7dbaea13262b50a6029d682cb7f1b (824370), страница 70
Текст из файла (страница 70)
,u,,} and a family F = {51, ... ,5m } of subsets of U. We shall construct an instance T(U, F) of KN APSACK, that is, nonnegative integers a1, ... , ak,and another K such that there is a subset P ~ {I, ... , k} with L:iE P ai = K ifand only if there is a set of sets C ~ F that are disjoint and collectively coverall of U.This construction is particularly simple, because it relies on an unexpectedrelationship between set union and integer addition. Subsets of a set of n elements, such as those in F, can be represented as strings over {O, l}n (see Figure7-8). In turn such strings can be interpreted as integers between zero and 211 -1,written in binary. Now taking the union of such sets, provided that they are disjoint, is the same as adding the corresponding integers. Since in EXACT COVERwe are asking whether the disjoint union of the teams makes up the whole U,this seems to be the same as asking whether there are integers among the givenones that add up to K = 1 + 2 + 4 + ...
+ 2 n - 1 -the binary number with nones. And this is very close to an instance of KNAPSACK.Proof: That KNAPSACK is ina1, ... ,an,51 = {u3, u 4 }51 = 001 152 = {u 2, u3, u 4 }52 = 01 1 100110111+ 110053 = {u j ,u2}53 = 1 1001111VV(c)(b)(a)Figure 7-8: From sets (a) to bit vectors (b) to integer addition in base m (c)There is one problem with this simple reduction: The close correspondencebetween set union and integer addition breaks down because in integer additionwe may have carry. Consider, for example, the sum 11 + 13 + 15 + 24 = 63; inbinary OOlOll + OOllOI + 001111 + OllOOO = llllll.
If we translate back tosubsets of {Uj, ... ,U6}' the sets {U3,U5,U6}, {U3,U4,U6}, {U3,U4,U5,U6}, and{U2,U3} are neither disjoint, nor do they cover all of U. In other words, carrymakes the translation between union and addition faulty.This problem can be very easily resolved as follows: Instead of consideringthe strings in {O, l}n as integers in binary, consider them as integers in m-ary,326Chapter 7: NP-COMPLETENESSwhere m is the number of sets in:1". That is, we have m integers aI, ...
,am,where ai = L:UjESi m j - 1 . We ask whether there is a subset that adds up toK = L:7=1 m j - 1. This way carry is not a problem, because the addition offewer than m digits in m-ary, with each of the digits either 0 or 1, can neverresult in carry. It is now clear that the resulting instance of KNAPSACK has asolution if and only if the original instance of EXACT COVER has a solution .•Corollary: PARTITION and TWO-MACHINE SCHEDULING are NP-complete.Proof: There are polynomial reductions from KNAPSACK to both of these problems; recall Example 7.1.2 .•We next turn to three graph-theoretic problems introduced in Section 6.2:INDEPENDENT SET, CLIQUE, and NODE COVER.Theorem 7.3.6: INDEPENDENT SET is NP-complete.Proof: It is clearly in NP; and we shall reduce 3-SATISFIABILITY to it.Suppose that we are given a Boolean formula F with clauses C1 , ...
, C m ,each with at most three literals. In fact, we shall assume that all clauses of Fhave exactly three literals; if a clause has only one or two literals, then we allowa literal to be repeated to bring the total number to three. We shall constructan undirected graph G and an integer K such that there is a set of K nodes inG with no edges between them if and only if F is satisfiable.The reduction is illustrated in Figure 7-9. For each one of the clausesC 1 , ...
,Cm of F, we have three nodes in G, connected by edges so that they forma triangle -call the nodes of the triangle corresponding to clause Ci Cil, Ci2, Ci3.These are all the nodes of G -a total of 3m nodes. The goal is K = m, equalto the number of clauses. For defining the remaining edges of G, node Cij isidentified with the jth literal of clause C i . Finally, two nodes are joined by anedge if and only if their literals are the negation of one another. This completesthe description of the reduction; see Figure 7-9 for an example.Suppose that there is an independent set I in G with K = m nodes. Sinceany two nodes from the same triangle are connected by an edge, evidently there isexactly one node in I from each triangle.
Recall that nodes correspond to literals.Consider now the fact that a node is in I to mean that the corresponding literalis T. Since there are no edges between nodes in I, it follows that no two suchliterals are the negation of one another, and therefore they can be the basis ofa truth assignment T. Notice that T may not be fully defined on all variables,because the set of nodes in I may fail to involve all variables; for example, inFigure 7-9 the independent set indicated by the full circles does not determinethe value of variable X3.
T may take any truth value on such "missing" variables;the resulting truth assignment T satisfies all clauses, because each clause has at7.3: More NP-complete Problems327Figure 7-9least one literal satisfied by T. And conversely, given any truth assignmentsatisfying F, we can obtain an independent set of size m by picking for eachclause a node corresponding to a satisfied literal .•The NP-completeness of two other graph-theoretic problems is now immediate:Theorem 7.3.7: CLIQUE and NODE COVER are NP-complete.Proof: They are both clearly in NP.Figure 7-10CLIQUE, requiring that all edges between any two nodes in the set be present,is in some sense the exact opposite of INDEPENDENT SET. The reduction makesthis sense precise. Given an instance (G, K) of INDEPENDENT SET, where G ~V x V is an undirected graph and K 2': 2 is the goal, we create an equivalentinstance (QI, K') of CLIQUE by just taking G' = V x V - {(i, i) : i E V} - G,and keeping the same goal, K' = K.
This works because, as it is fairly easy tocheck, the maximum independent set of G is precisely the maximum clique in328Chapter 7: NP-COMPLETENESSthe complement of G, the graph that contains all non-loop edges that are not inG (see Figure 7-10).Finally, NODE COVER is the exact opposite of INDEPENDENT SET in a different senSe: since the nodes in a node cover N <; V "hit" between them alledges, the set V - N must have no edges between its elements, and is thus anindependent set (see Figure 7-11). Hence, N <; V is a node cover of G if andonly if V - N is an independent set of G. Thus, the maximum independent setof G has size K or more if and only if the minimum node cover of G has sizeIVI - K or less.
The reduction from INDEPENDENT SET to NODE COVER leavesthe graph the same, and simply replaces K by IVI - K .•Figure 7-11Finite AutomataOur last NP-completeness result concerns some of the first, and apparentlysimplest, mathematical objects studied in this book: nondeterministic finiteautomata and regular expressions.Among all problems we introduced in the last chapter, there are only twowhose membership in NP is not obvious: EQUIVALENCE OF REGULAR EXPRESSIONS, and the closely related problem EQUIVALENCE OF NONDETERMINISTICFINITE AUTOMATA.
Given two regular expressions, what would be a convincingcertificate of their equivalence? Nothing succinct comes to mind.If, however, we defined the complement problemINEQUIVALENCE OF REGULAR EXPRESSIONS: Given two regular expressionsRI and R 2 , is L(Rd =I- L(R2)?then certificates seem to become possible: A certificate of the inequivalence oftwo regular expressions is a string belonging to the language generated by onebut not the other, that is, any element of (L(Rd - L(R2)) U (L(R2) - L(Rd).Indeed, this set is nonempty if and only if the expressions are inequivalent.But now the real difficulty reveals itself: Such certificates are legitimate inall respects except for the crucial polynomial succinctness property.
Given tworegular expressions RI and R 2 , there is no obvious polynomial upper bound onthe length of the shortest string that belongs in (L(Rd - L(R2)) U (L(R2) L(Rd). For such a bound to qualify, it should be polynomial in the length3297.3: More NP-complete Problemsof the two expressions, IRII + IR21 -that is, the number of symbols, such asa, b, U, *, and parentheses, needed to represent them. In fact, there are familiesof pairs of inequivalent regular expressions which differ only in strings that areexponentially long in the size of the expressions!To obtain a problem in Np we must look at a restricted special case: *-freeregular expressions, that is, regular expressions over union and concatenation,not containing any occurrences of Kleene star. Consider a *-free regular expression, such asR = (0 U 1)00(0 U 1) U 010(0 U 1)0.It is now easy to see that, if x is a string in the language generated by it (say,x = 1001 for R above), then Ixl :::; IRI.
As a result of this observation, thefollowing problem is in NP:INEQUIVALENCE OF *-FREE REGULAR EXPRESSIONS: Given two *-free regular expressions RI and R2, is L(Rd =I- L(R2)?A valid certificate is any string in (L(Rd - L(R2» U (L(R2) - L(Rd), and allsuch strings are succinct, shorter than IRII + IR 2 1. For an analogous problem inthe domain of nondeterministic finite automata, see Problem 7.3.7.In fact, we can prove this result:Theorem 7.3.8: INEQUIVALENCE OF *-FREE REGULAR EXPRESSIONS is NPcomplete.Proof: We have already argued that the problem is in NP.
We shall show thatSATISFIABILITY reduces to INEQUIVALENCE OF *-FREE REGULAR EXPRESSIONS.Given any Boolean formula with Boolean variables Xl, ... ,X n and clausesC I , ... ,em, we shall produce two regular expressions RI and R2 over the alphabet ~ = {O, I}, neither of which contains an application ofthe Kleene star, suchthat L(Rd =I- L(R 2) if and only if the given Boolean formula is satisfiable.The second regular expression, R 2 , is very simple:(0 U 1)(0 U 1)··· (OU 1),with the expression (0 U 1) repeated n times.
The language generated by R2 isobviously the set of all binary strings of length n, that is to say, L(R2) = {O, 1 }n.Now for the construction of R I . In contrast to R 2 , RI depends heavily on thegiven Boolean formula. In particular, RI is the union of m regular expressionswhere regular expressionof n regular expressions:D:idepends on clause C i . EachD:iis the concatenationChapter 7: NP-COMPLETENESS330where0,D:ij={~6UI),if Xj is a literal of Gi ;if Xj is a literal of Gi ;otherwise.If we disregard for a moment the distinction between°-1 and T - 1., thenstrings in {O, I}n can be thought of as truth assignments to the Boolean variables {Xl,' ..